Feature: dry run support for deploy command #106
- Add comprehensive catalog import/export guide with step-by-step instructions
- Add quick reference guide for common catalog operations
- Refine design.md to clarify manifest schema with separate subsections for assets, glossaries, data products, and metadata forms
- Remove schedule asset special handling from design (deferred to future implementation)
- Update architecture diagrams to reflect new manifest configuration structure
- Clarify CatalogExporter API routing for data products and metadata forms
- Improve docstring documentation for export_catalog function with parameter descriptions

- Simplify manifest schema to use single `enabled` boolean instead of granular resource type filters
- Add `publish` flag to automatically publish assets and data products during deployment
- Update CatalogExporter to export ALL project-owned resources with optional `--updated-after` filtering
- Enhance IdentifierMapper to use externalIdentifier with normalization as primary lookup, falling back to name-based matching
- Add support for asset and data product publishing in CatalogImporter
- Clarify dependency ordering and include delete operations in import summary
- Update architecture documentation to reflect simplified configuration approach
- Streamline design diagrams to show complete resource flow and identifier mapping strategy

… support
- Expand Multi-Environment section to explain independent project/domain targeting per stage
- Update architecture diagram to show optional separate domains for dev/test/prod stages
- Add new "Multi-Domain and Multi-Project Architecture" section with use cases
- Include configuration example showing domain_id per stage
- Document use cases: organizational boundaries, compliance, multi-tenant, cross-account
- Update DataZone Helper documentation to reflect multi-domain support
- Enhance sequence and flow diagrams to show domain resolution per stage
- Clarify that each deployment stage can target independent projects in independent domains
- Add multi-domain configuration section with YAML examples
- Improve validation phase documentation to include multi-domain verification

… filtering
- Add .config.kiro file to establish spec metadata and workflow type
- Clarify that manifest contains NO filter options, only enabled, publish, and assets.access
- Document that --updated-after is a CLI-only flag on bundle command, not a manifest field
- Update design diagrams to show CLI flag as separate input to CatalogExporter
- Emphasize uniform filtering across ALL resource types via CLI timestamp
- Refine CatalogExporter docstring to clarify filter source and scope
- Update internal helper documentation to note filters come from CLI only
- Improve quick reference and guide documentation for clarity on filtering behavior

… graph and simplify tasks
- Update dependency graph to include Data Products as final resource type
- Revise creation order to place Data Products after Assets
- Revise deletion order to place Data Products first (reverse dependency)
- Add clarification that Data Products reference Assets
- Consolidate and simplify task descriptions for catalog export/import implementation
- Add new examples directory with README for catalog import/export workflows
- Update quick reference guide with streamlined information
- Reflect simplified manifest schema with only enabled, publish, and assets.access fields

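The creation and deletion ordering described in this commit can be pictured as a topological sort. The sketch below is illustrative only: the one dependency stated in the commit is that Data Products reference Assets. Every other edge in `DEPENDS_ON`, and both function names, are assumptions and may not match the PR's code.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each key lists the resource types it depends on.
# Only "DataProducts depends on Assets" comes from the commit message;
# the remaining edges are assumptions made for illustration.
DEPENDS_ON = {
    "Glossaries": set(),
    "GlossaryTerms": {"Glossaries"},
    "FormTypes": set(),
    "AssetTypes": {"FormTypes"},
    "Assets": {"AssetTypes", "GlossaryTerms"},
    "DataProducts": {"Assets"},
}


def creation_order(depends_on):
    """Return resource types so that every dependency precedes its dependents."""
    return list(TopologicalSorter(depends_on).static_order())


def deletion_order(depends_on):
    """Delete in reverse creation order so that dependents are removed first."""
    return list(reversed(creation_order(depends_on)))


if __name__ == "__main__":
    print("create:", creation_order(DEPENDS_ON))
    print("delete:", deletion_order(DEPENDS_ON))
```

With this map, Data Products always come last on creation and first on deletion, which is the property the commit describes. The relative order of unrelated types (for example Glossaries and FormTypes) is not fixed by the sort.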
…CI/CD integration
- Add CatalogExporter helper to query and serialize DataZone resources (Glossaries, GlossaryTerms, FormTypes, AssetTypes, Assets, Data Products)
- Add CatalogImporter helper to import and optionally publish exported catalog resources with identifier mapping
- Extend application manifest schema with catalog configuration (enabled, skipPublish, assets.access)
- Add --updated-after CLI flag to filter exported resources by modification timestamp
- Integrate catalog export into bundle command and import into deploy command
- Add catalog-import-export GitHub Actions workflow for automated deployment
- Add comprehensive integration tests for export, import, and round-trip scenarios
- Add unit tests for catalog helpers and manifest configuration
- Add example manifest and seed data script for catalog import/export demonstration
- Update documentation with catalog import/export guide and quick reference
- Preserve source publish state (listingStatus) during export and conditionally republish on import

…ith API filtering and edge case handling
- Update design documentation to clarify Search API ownership filtering and SearchTypes API client-side filtering requirements
- Document get_asset API enrichment for full asset details including formsOutput
- Correct listingStatus value from "LISTED" to "ACTIVE" for published state detection
- Add comprehensive testing guide covering export, import, and round-trip scenarios
- Expand integration tests with edge case coverage including disabled catalog and skip-publish manifests
- Add sample test fixtures (connections, workflows, code) for integration test scenarios
- Enhance unit tests for catalog export/import properties and DataZone property handling
- Update CLI, bundle, and deploy commands to support refined catalog operations
- Improve catalog export and import helper implementations with better error handling and filtering logic
- Update example documentation and seed data scripts with latest catalog patterns

…orm normalization specs
- Add `_resolve_target_data_source()` helper to match data sources by type and database name with fallback priority
- Add `_normalize_forms_input_for_api()` helper to remap form identifiers and data source references for target domain
- Add Requirement 5.15 for DataSourceReferenceForm remapping during import with database name extraction from GlueTableForm
- Add Property 18 validation for data source remapping with matching priority and fallback behavior
- Add edge case handling for missing data sources and JSON parse failures in error scenarios table
- Update task requirements list to include Requirement 5.15
- Update multilingual README translations (fr, he, it, ja, pt, zh) to reflect new functionality
- Update catalog import/export guides with data source remapping documentation
- Implement form normalization in `catalog_import.py` and deploy command integration

- Add "Back to Main README" navigation link to French README
- Add "Back to Main README" navigation link to Hebrew README
- Add "Back to Main README" navigation link to Italian README
- Add "Back to Main README" navigation link to Japanese README
- Add "Back to Main README" navigation link to Chinese README
- Improves navigation between main and translated documentation pages

…utility
- Rename _check_import_permissions to _ensure_import_permissions to reflect new behavior
- Add _POLICY_DETAIL_KEY mapping for policy type to detail key conversion
- Implement automatic policy grant creation via add_policy_grant when grants are missing
- Update permission check logic to attempt adding missing grants before failing
- Change return value from missing grants list to failed grants list
- Add comprehensive logging for grant checking and addition attempts
- Create cleanup_catalog_resources.py integration test utility to remove project-owned resources
- Update error messaging to clarify that grants are added automatically when possible
- Improve docstrings to document the new auto-grant behavior

…e checkers
- Add DryRunEngine orchestrating phase-specific validation checkers
- Implement 11 specialized checkers (bootstrap, bundle, catalog, connectivity, dependency, git, manifest, permission, project, quicksight, storage, workflow)
- Add PermissionChecker using iam:SimulatePrincipalPolicy for IAM validation
- Add DependencyChecker validating pre-existing AWS resources and DataZone types
- Add DryRunReport model collecting findings classified as OK/WARNING/ERROR
- Integrate dry-run as pre-deployment validation step in deploy command
- Add --dry-run flag for standalone validation mode
- Add --skip-validation flag to bypass pre-deployment checks
- Add --output option for JSON report generation
- Include comprehensive unit and integration tests for all checkers
- Add design, requirements, and testing documentation
- Update CLI and project dependencies for dry-run support

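To make the report model concrete, here is a minimal sketch of how findings classified as OK/WARNING/ERROR might be collected and serialized to JSON. Only the `DryRunReport` name and the three severities come from the commit message. `Severity`, `Finding`, `run_checkers`, and all field names are hypothetical and may not match the PR's implementation.

```python
import json
from dataclasses import dataclass, field
from enum import Enum


class Severity(str, Enum):
    OK = "OK"
    WARNING = "WARNING"
    ERROR = "ERROR"


@dataclass
class Finding:
    phase: str          # name of the checker that produced the finding
    severity: Severity
    message: str


@dataclass
class DryRunReport:
    findings: list = field(default_factory=list)

    def add(self, phase, severity, message):
        self.findings.append(Finding(phase, severity, message))

    @property
    def has_errors(self):
        return any(f.severity is Severity.ERROR for f in self.findings)

    def to_json(self):
        rows = [
            {"phase": f.phase, "severity": f.severity.value, "message": f.message}
            for f in self.findings
        ]
        return json.dumps(rows, indent=2)


def run_checkers(checkers, context):
    """Run each (name, callable) pair; a crashing checker becomes an ERROR finding."""
    report = DryRunReport()
    for name, check in checkers:
        try:
            for severity, message in check(context):
                report.add(name, severity, message)
        except Exception as exc:  # keep going so the report stays complete
            report.add(name, Severity.ERROR, f"checker failed: {exc}")
    return report
```

Each checker in this sketch is a plain callable that returns `(severity, message)` pairs, so one failing phase does not stop the remaining phases from being reported.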
### Requirement 1: Dry Run CLI Option

**User Story:** As a DevOps engineer, I want to pass a `--dry-run` flag to the deploy command, so that I can preview the deployment without making changes.
It would be nice to make sure the `--dry-run` command works with ReadOnly permissions. Some big customers with strong permission management might ask for that.
But I guess it is fine, since the tool is also validating IAM permissions for deployment
## Introduction

The Deploy Dry Run feature adds a `--dry-run` option to the existing `smus-cicd deploy` command. When enabled, the CLI walks through every phase of the deployment pipeline (manifest loading, bundle exploration, project initialization, storage deployment, git deployment, catalog import, QuickSight dashboard deployment, workflow creation, and bootstrap actions) without creating, modifying, or deleting any actual resources. It also proactively verifies IAM permissions, S3 bucket accessibility, DataZone domain/project reachability, and catalog asset availability, producing a structured report of what would happen and any issues detected. The goal is to let operators confirm a deployment will succeed before committing to it, avoiding partial deployment failures.
Hmmm, are we saying "If the `--dry-run` deploy does not fail, the actual deploy will not fail"?
We might want to stay away from such claims because a lot of things (like notebooks for example) are in customer control and we cannot predict whether these will execute okay.
#### Acceptance Criteria

1. WHEN dry-run mode is active, THE Dry_Run_Engine SHALL load and parse the Manifest file and report any YAML syntax or schema validation errors.
2. WHEN dry-run mode is active, THE Dry_Run_Engine SHALL resolve the Target_Stage and verify that the specified domain, project, and Deployment_Configuration sections are present and well-formed.
Hmmm, I am wondering if the `describe` command is still useful at this point haha. It might be doing a subset of the things that `--dry-run` is going to do. It might be worth exploring deleting that command in the future.
1. WHEN a bundle archive path is provided, THE Dry_Run_Engine SHALL open the bundle archive and enumerate all files contained within it.
2. WHEN a bundle archive path is not provided, THE Dry_Run_Engine SHALL attempt to locate the bundle in the `./artifacts` directory using the same resolution logic as the deploy command.
3. THE Dry_Run_Engine SHALL verify that each storage item referenced in the Deployment_Configuration has corresponding files in the bundle or on the local filesystem.
4. THE Dry_Run_Engine SHALL verify that each git item referenced in the Deployment_Configuration has corresponding content in the bundle or is accessible via the configured repository URL.
I think we still do not have a GitHub workflow for git stuff. We should add one to verify this.
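The bundle criteria in the hunk above (locate the archive, enumerate its files, check that referenced items have content) could be sketched roughly as follows. This assumes the bundle is a zip archive and that "newest `*.zip` in `./artifacts`" stands in for the deploy command's resolution logic, which is not shown here. The function names and the prefix-matching rule are hypothetical.

```python
import zipfile
from pathlib import Path


def locate_bundle(bundle_path=None, artifacts_dir="./artifacts"):
    """Return the explicit path, else the newest *.zip in artifacts_dir.

    The "newest zip" rule is an assumption, not the deploy command's actual logic.
    """
    if bundle_path:
        return Path(bundle_path)
    candidates = sorted(
        Path(artifacts_dir).glob("*.zip"), key=lambda p: p.stat().st_mtime
    )
    if not candidates:
        raise FileNotFoundError(f"no bundle found in {artifacts_dir}")
    return candidates[-1]


def list_bundle_files(bundle_path):
    """Enumerate file entries in the archive without extracting anything."""
    with zipfile.ZipFile(bundle_path) as archive:
        return [info.filename for info in archive.infolist() if not info.is_dir()]


def missing_items(bundle_files, required_prefixes):
    """Return the prefixes that have no matching file in the bundle."""
    return [
        prefix
        for prefix in required_prefixes
        if not any(name.startswith(prefix) for name in bundle_files)
    ]
```

Because the archive is only read, a check like this stays within the dry-run constraint of not creating or modifying anything.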
2. WHEN dry-run mode is active, THE Permission_Checker SHALL verify that the current IAM identity has DataZone permissions (`datazone:GetDomain`, `datazone:GetProject`, `datazone:SearchListings`) required for the target domain and project.
3. WHEN the Deployment_Configuration includes catalog assets, THE Permission_Checker SHALL verify that the current IAM identity has catalog import permissions (`datazone:CreateAsset`, `datazone:CreateGlossary`, `datazone:CreateGlossaryTerm`, `datazone:CreateFormType`).
4. WHEN the manifest configures IAM role creation or update, THE Permission_Checker SHALL verify that the current IAM identity has `iam:CreateRole`, `iam:AttachRolePolicy`, and `iam:PutRolePolicy` permissions.
5. WHEN the manifest configures QuickSight dashboard deployment, THE Permission_Checker SHALL verify that the current IAM identity has QuickSight permissions (`quicksight:DescribeDashboard`, `quicksight:CreateDashboard`, `quicksight:UpdateDashboard`).
There is actually another QuickSightServiceRole, used by default to perform dashboard refresh, where I faced issues last time: https://github.com/aws/CICD-for-SageMakerUnifiedStudio/tree/main/examples/analytic-workflow/dashboard-glue-quick#quicksight-dataset-refresh-fails
We might want to figure out whether we can set a different role for it in the examples.
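For the Permission_Checker criteria in the hunk above, a check built on `iam:SimulatePrincipalPolicy` might look roughly like the sketch below. `find_denied_actions` and `FakeIam` are hypothetical names. The client is passed in so the example runs without AWS credentials, and it is assumed to behave like `boto3.client("iam")`. Pagination of the results is ignored for brevity.

```python
def find_denied_actions(iam_client, principal_arn, actions):
    """Return the actions the principal is not allowed to perform.

    iam_client is expected to behave like boto3.client("iam").
    """
    response = iam_client.simulate_principal_policy(
        PolicySourceArn=principal_arn,
        ActionNames=list(actions),
    )
    return [
        result["EvalActionName"]
        for result in response["EvaluationResults"]
        if result["EvalDecision"] != "allowed"
    ]


class FakeIam:
    """Stand-in client used only to make this sketch runnable offline."""

    ALLOWED = {"datazone:GetDomain", "datazone:GetProject"}

    def simulate_principal_policy(self, PolicySourceArn, ActionNames):
        return {
            "EvaluationResults": [
                {
                    "EvalActionName": action,
                    "EvalDecision": "allowed"
                    if action in self.ALLOWED
                    else "implicitDeny",
                }
                for action in ActionNames
            ]
        }


if __name__ == "__main__":
    denied = find_denied_actions(
        FakeIam(),
        "arn:aws:iam::123456789012:role/deployer",  # placeholder ARN
        ["datazone:GetDomain", "datazone:GetProject", "datazone:SearchListings"],
    )
    print("denied:", denied)
```

Note that the caller itself needs permission to call `iam:SimulatePrincipalPolicy`, which is worth keeping in mind for the ReadOnly scenario raised earlier in the review.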
2. WHEN dry-run mode is active, THE Dry_Run_Engine SHALL simulate storage deployment and report the target S3 bucket, prefix, and file count for each storage item.
3. WHEN dry-run mode is active, THE Dry_Run_Engine SHALL simulate git deployment and report the target connection, repository, and file count for each git item.
4. WHEN dry-run mode is active AND the bundle contains catalog export data, THE Dry_Run_Engine SHALL simulate catalog import and report the count and types of catalog resources that would be created, updated, or deleted.
5. WHEN dry-run mode is active AND the manifest configures QuickSight dashboards, THE Dry_Run_Engine SHALL simulate QuickSight deployment and report which dashboards would be exported and imported.
Wondering: do we have any more "special" resources like QuickSight? Does it make sense to create a unique logic for handling it?
**User Story:** As a DevOps engineer, I want the dry run to verify that target AWS resources are reachable, so that I can detect network or configuration issues before deployment.

#### Acceptance Criteria
Should we add necessary connection checks to this as well?
- Add form type status validation in target domain to detect DISABLED form types
- Implement _check_disabled_form_types method to query DataZone API for form type status
- Resolve target_domain_id and target_region in dry-run engine after manifest validation
- Add target_domain_id and target_region fields to DryRunContext for downstream checkers
- Update catalog_import to re-enable DISABLED form types during import via create_form_type upsert
- Add add_policy_grants integration test utility for catalog import workflows
- Enhance cleanup_catalog_resources with improved resource deletion handling
- Emit WARNING findings when form types exist but are DISABLED in target environment

feat(dry-run): add deploy dry-run validation engine with comprehensive checkers
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.