refactor(test): restructure integration tests with tiered CI, parallel jobs, and Finch cross-platform smoke tests #8666
… add test tiering, scope Docker cleanup
- Remove container_runtime matrix (docker/finch/no-container) in favor of explicit test_suite entries
- Split build tests into build-x86, build-arm64, build-x86-container, build-arm64-container
- Merge local-start1/local-start2 into local-start with -n 2 parallelism
- Add cloud-based-tests job using @pytest.mark.requires_credential marker
- Skip credential-requiring tests (SAR, layers, STS) in local-only jobs via SAM_CLI_NO_CREDENTIALS
- Scope Docker container/image cleanup to per-test-class snapshots to prevent cross-worker interference
- Extract credential setup/reset into setup_testing_resources.py and reset_testing_resources.py
- Update credential test runtimes: dotnet10, java25, python3.12, ruby3.4, nodejs22.x, provided.al2023 for Go
- Remove Free up disk space step
- Add uv availability check in Makefile init

…dates, schema registry lookup
- Add requires_credential marker to terraform S3 backend and layer tests
- Exclude requires_credential tests from terraform job, run in cloud-based-tests
- Remove AWS credentials from terraform job (no longer needed)
- Add Terraform install to cloud-based-tests job
- Fix java25 dir rename and pom.xml compiler version
- Add go.sum for Go STS credential test
- Remove global Docker image cleanup from WarmContainersRemoteLayers tests
- Add xdist_group markers for durable tests (port 9014) and rapid image tests
- Use --dist loadgroup for local-invoke and local-start jobs
- Fix EventBridge schema registry tests with dynamic position lookup
- Update CONTRIBUTING.md with integration test guidelines

… jobs
- All jobs now get credentials unconditionally (remove conditional OIDC/ECR/reset)
- Use -m "not requires_credential" in build/local jobs instead of SAM_CLI_NO_CREDENTIALS env var
- Remove SKIP_CREDENTIAL_TESTS and skipIf patterns from test files
- Remove requires_credential from terraform tests (run in terraform jobs directly)
- Split terraform into terraform-build (-n 4) and terraform-local (sequential)
- Mark TestBuildCommand_LayerBuilds as requires_credential
- Add xdist groups: durable (callback, execution), remote_layers, docker_watcher
- Add TestSamPython36HelloWorldIntegrationImages to docker_images group
- Update CONTRIBUTING.md with simplified marker-only approach

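The marker-only approach from this commit can be sketched roughly as below. The class name and test body are placeholders, not code from the PR; requires_credential is the custom marker named in the commit, and xdist_group is the pytest-xdist marker that --dist loadgroup uses to keep a group on one worker. The marker_names helper is purely illustrative.

```python
import pytest


# Hypothetical test class; only the marker names mirror the PR.
@pytest.mark.requires_credential          # picked up by: pytest -m requires_credential
@pytest.mark.xdist_group(name="durable")  # same group runs on one worker under --dist loadgroup
class TestDurableExample:
    def test_placeholder(self):
        assert True


def marker_names(test_class):
    """Return the names of the markers applied directly to a test class."""
    return {mark.name for mark in getattr(test_class, "pytestmark", [])}
```

Under these assumptions, a local-only job would run something like `pytest -m "not requires_credential" -n 2 --dist loadgroup`, while the cloud job would select the complement with `pytest -m requires_credential`.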
…st groups
- Fix verify_pulled_image/verify_docker_container_cleanedup to use python3.11 matching the actual template runtimes (was incorrectly hardcoded to python3.12)
- Make test_delete_no_prompts_with_s3_prefix_present_zip robust against deploy failures (CloudFormation EarlyValidation hooks)
- Add xdist groups for TestSamPython36HelloWorldIntegrationImages (docker_images), TestLocalCallback/TestLocalExecution (durable), WarmContainersRemoteLayers* (remote_layers), Watching*Image*/DockerFileLocation* (docker_watcher)

- Split local-start into local-start-api and local-start-lambda
- Split sync into sync-code (-n 2) and sync-watch (sequential)
- Split package-delete-deploy into package (+ delete) and deploy
- Remove Docker image cleanup from layer tests (images persist, reused)
- Add lambda_layers xdist group to all layer test classes
- Add flaky(reruns=3) at class level for TestLayerVersion
- Update Node.js STS SDK to ^3.700.0 (fix @smithy/protocol-http missing)

- build-x86 -> build-x86-1 (general) + build-x86-2 (language-specific)
- build-x86-container -> build-x86-container-1 + build-x86-container-2
- build-arm64-container -> build-arm64-container-1 (non-java) + build-arm64-container-2 (java)
- terraform-local -> terraform-start-api + terraform-invoke-start-lambda
- Update all setup step conditions for new job names
- Remove samcli image cleanup from layer tests, add lambda_layers xdist group

… test files are auto-included
- Move test_build_cmd_python.py and test_build_cmd_java.py to build-x86-1/container-1
- Move test_sync_build_in_source.py from sync-code to sync-watch

…ts as requires_credential
- Add early reset step after ECR login for build/local jobs to free test account
- Skip final reset for jobs that already released early
- Mark TestSamPython36HelloWorldIntegrationImages and TestDeleteOldRapidImages as requires_credential

…iner assertion
- Merge S3 report upload into reset_testing_resources.py (always uploads, conditionally resets)
- Use SKIP_ACCOUNT_RESET env var for local-only jobs instead of workflow condition
- Delete standalone upload_test_report.py
- Fix TestWarmContainersHandlesSigTermInterrupt to use assertGreaterEqual for container count

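The "always uploads, conditionally resets" behaviour can be pictured with a small sketch. This is not the contents of reset_testing_resources.py: the function name finish_job and its callable parameters are hypothetical, and treating any non-empty SKIP_ACCOUNT_RESET value as "skip" is an assumption.

```python
import os


def finish_job(upload_report, reset_account, env=None):
    """Always upload the test report; reset the account unless told to skip.

    upload_report and reset_account are hypothetical callables standing in
    for the real upload and reset logic.
    """
    env = os.environ if env is None else env
    steps = []

    upload_report()
    steps.append("upload")

    # Assumption: any non-empty SKIP_ACCOUNT_RESET value means "skip the reset".
    if not env.get("SKIP_ACCOUNT_RESET"):
        reset_account()
        steps.append("reset")

    return steps
```

Keeping the decision inside the script, rather than in a workflow `if:` condition, means every job can call the same step and only the environment differs.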
- Add @pytest.mark.tier1 to 23 test classes across all feature areas
- Fix S3 report upload: use configure-aws-credentials role-chaining for OIDC->RoleA->RoleB
- Simplify upload_test_reports to use default credentials (set by workflow)
- Add tier1 markers for: durable, layers, sync, deploy, terraform, regression, callback, execution
- Update TIER1_TESTS.md with complete coverage table

…ust import
- Add tier1-finch to matrix with conditional Finch setup, ECR login, and runtime
- Move setup_finch.sh from scripts/ to tests/
- Add tier1-finch to all toolchain setup conditions
- Remove separate tier1-finch job (now part of matrix)
- Fix missing pytest import in test_build_cmd_rust.py

… cases, add missing commands
- Move tier1 from class-level to method-level with dedicated test_tier1_* methods
- Add container + non-container tier1 for each runtime (Python, Java, Node, Dotnet, Rust, Provided)
- Remove duplicated parameterized cases covered by tier1 methods
- Add tier1 for: sam delete, sam package, symlink builds, layer builds, sam sync
- Fix missing pytest imports in delete and package tests
- Revert cargo-lambda to pip install
- Update CONTRIBUTING.md and TIER1_TESTS.md

- Fix tier1 methods that called parameterized methods (inline logic instead)
- Fix dotnet validate_build_command params (mode=None, use_container=True)
- Fix Python tier1 skip condition to check template and codeuri
- Add ARM64 tier1 build tests: Python, Java, Node, Ruby, Provided, Rust
- Update TIER1_TESTS.md with ARM64 section

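The method-level tier1 pattern described in these commits might look roughly like the sketch below. The class, the _build helper, and the runtime values are hypothetical stand-ins; only the tier1 marker and the test_tier1_* naming come from the PR.

```python
import unittest

import pytest


class TestBuildExample(unittest.TestCase):
    """Hypothetical class; _build stands in for the real build-and-verify logic."""

    def _build(self, runtime, use_container):
        # Placeholder for running the build and validating its output.
        return {"runtime": runtime, "use_container": use_container}

    def test_remaining_matrix(self):
        # Broader matrix, run only in the regular suite. The combination
        # covered by the tier1 method below is deliberately left out.
        for runtime in ("python3.10", "python3.11"):
            for use_container in (False, True):
                with self.subTest(runtime=runtime, use_container=use_container):
                    result = self._build(runtime, use_container)
                    self.assertEqual(result["runtime"], runtime)

    @pytest.mark.tier1
    def test_tier1_python_build(self):
        # One fixed parameter set with the logic inlined, rather than
        # calling into a parameterized test method.
        result = self._build("python3.12", use_container=False)
        self.assertFalse(result["use_container"])
```

With this shape, `pytest -m tier1` would collect only the dedicated method, and a `-k` filter on the method name can separate container from non-container variants.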
…ams, skip ARM64 without Docker
- Restore test_tier1_rust_build_in_container (was accidentally deleted)
- Fix dotnet container tier1 to use dotnet8 (dotnet10 container image not available)
- Fix layer tier1 to use correct overrides (LayerContentUri, python3.11)
- Add skipIf(SKIP_DOCKER_TESTS) to ARM64 tier1 tests (need Docker for invoke)
- All container tier1 methods have _in_container suffix for -k filter compatibility

- Dotnet container: use mount_mode=MountMode.WRITE (not use_container=True)
- Layer build: use python3.12 (matching original test_build_single_layer)
- Rust: remove container tier1 (cargo-lambda doesn't support container builds)
- These match the exact parameter combinations that pass in the original tests

…emove temp docs
- Revert ARM64 tier1 names (no _in_container since use_container=False)
- Enable QEMU for build-arm64 job (needed for local invoke)
- Fix mypy import error in setup_testing_resources.py
- Remove Rust container tier1 (cargo-lambda doesn't support container builds)
- Fix dotnet container tier1 to use MountMode.WRITE
- Fix layer tier1 to use python3.12
- Remove CLOUD_VS_LOCAL_REPORT.md and TIER1_TESTS.md

…ogin, add skipIf to dotnet tier1
@@ -854,6 +834,9 @@ def tearDownClass(cls):
],
)
@skipIf(SKIP_LAYERS_TESTS, "Skip layers tests in Appveyor only")
For all the skipIfs, let's review them once we have actually moved out.
…al jobs
- Add tests/free_disk_space.sh: cleanup if <25GB free, reduce swap to 1GB, nohup rm
- Mark git function tests and WarmContainersRemoteLayers as requires_credential
- Skip Get testing resources for non-credential jobs, clear AWS creds after ECR login
- Add skipIf(SKIP_DOCKER_TESTS) to dotnet tier1 non-container test

if container.id not in cls._pre_existing_container_ids:
    try:
I guess we're now deleting all the containers that didn't exist when this class started. But in theory if there are things running in parallel, we could still be deleting a container created by a different class, as long as it was created after this class started, right?
So basically the change is that we're just "not deleting the containers that existed when this class started". I guess that makes a difference?
Yeah, that's correct. To make it more bulletproof, we would probably need to know exactly which containers this class created, but that does not seem easy.
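The snapshot approach discussed in this thread, and its limitation, can be shown with a minimal sketch. SnapshotCleanupMixin and the Fake* classes are hypothetical; the fakes only mimic the containers.list(all=True) and remove(force=True) calls of the Docker SDK for Python so the example runs without Docker.

```python
class SnapshotCleanupMixin:
    """Remove only containers that appeared after setUpClass took its snapshot."""

    docker_client = None  # expected to offer containers.list(all=True)

    @classmethod
    def setUpClass(cls):
        cls._pre_existing_container_ids = {c.id for c in cls.docker_client.containers.list(all=True)}

    @classmethod
    def tearDownClass(cls):
        for container in cls.docker_client.containers.list(all=True):
            if container.id not in cls._pre_existing_container_ids:
                try:
                    container.remove(force=True)
                except Exception:
                    # Another worker may already have removed it.
                    pass


class FakeContainer:
    def __init__(self, cid, registry):
        self.id = cid
        self._registry = registry

    def remove(self, force=False):
        self._registry.pop(self.id, None)


class FakeContainers:
    def __init__(self):
        self.registry = {}

    def run(self, cid):
        self.registry[cid] = FakeContainer(cid, self.registry)

    def list(self, all=False):
        return list(self.registry.values())


class FakeClient:
    def __init__(self):
        self.containers = FakeContainers()
```

A container that another worker starts after the snapshot is indistinguishable from one this class created, so it is removed as well. That is the gap raised above, and it is why classes sharing containers are additionally serialized with xdist_group markers.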
- name: Initialize project
  run: |
    export CONTAINER_RUNTIME=${{ matrix.container_runtime }}
    if [[ "${{ matrix.test_suite }}" == "build-x86-1" || "${{ matrix.test_suite }}" == "build-x86-2" || "${{ matrix.test_suite }}" == "build-arm64" ]]; then
Not too important, but we probably want to be consistent if we're doing contains(fromJSON( in all the other ifs here.
* Restore layer test tearDown image cleanup removed in #8666

  Re-add the samcli/lambda-* Docker image cleanup in tearDown for TestLayerVersionBase and TestLayerVersionThatDoNotCreateCache. This was removed in PR #8666 to avoid cross-test interference in parallel runs, but these test classes are already serialized via xdist_group markers. Without the cleanup, stale cached images cause layer version tests to use outdated layers.
* limit chardet < 6 in cargo lambda
* dep
* reuse cleanup_samcli_images
* nit
Which issue(s) does this change fix?
Addresses integration test CI performance and reliability issues. No specific issue number.
Why is this change necessary?
The integration test workflow was slow (~2+ hours), flaky due to cross-worker Docker interference, and lacked cross-platform validation. The container_runtime matrix (docker/finch/no-container) created too many combinations, and tests that required AWS credentials were mixed with local-only tests.

How does it address the issue?
1. Workflow restructuring (21 parallel matrix jobs)
- Removed the container_runtime matrix dimension (docker/finch/no-container)

2. Credential-based test separation
- @pytest.mark.requires_credential marker for tests needing AWS credentials
- Build/local jobs use -m "not requires_credential" to exclude cloud tests
- cloud-based-tests job collects them via -m requires_credential

3. Docker container cleanup isolation
- xdist_group markers: durable, docker_images, remote_layers, docker_watcher, lambda_layers

4. Tier 1 cross-platform smoke tests (~60 curated tests)
A curated subset marked with @pytest.mark.tier1 runs on every OS/container-runtime combination via the tier1-finch job. Each test is a dedicated test_tier1_* method calling existing logic with one specific parameter set.

- test_tier1_python_build
- test_tier1_python_build_in_container
- test_tier1_java_build
- test_tier1_java_build_in_container
- test_tier1_node_build
- test_tier1_node_build_in_container
- test_tier1_dotnet_build
- test_tier1_dotnet_build_in_container
- test_building_ruby_3_2 (parameterized)
- test_tier1_rust_build
- test_tier1_provided_build
- test_tier1_provided_build_in_container
- test_nested_build_invoke_in_container
- TestBuildWithNestedStacks3LevelWithSymlink
- test_samconfig_parameters_are_overridden
- test_build_and_invoke_lambda_functions
- test_tier1_layer_build
- test_tier1_python_arm64_build
- test_tier1_java_arm64_build
- test_tier1_node_arm64_build
- test_tier1_ruby_arm64_build
- test_tier1_provided_arm64_build
- test_tier1_rust_arm64_build
- test_invoke_returncode_is_zero
- test_local_zip_layers
- test_tier1_durable_invoke
- test_calling_proxy_endpoint
- test_invoke_with_data
- test_generate_event_substitution
- test_tier1_callback
- test_tier1_execution
- test_init_command_passes_and_dir_created
- test_default_template_file_choice
- test_deploy_guided_zip
- test_tier1_delete
- test_tier1_package
- test_tier1_sync_infra

5. Test fixes and improvements
- Fixed verify_pulled_image runtime mismatch (python3.12 → python3.11)

6. Infrastructure
- tests/setup_testing_resources.py: credential setup script
- tests/reset_testing_resources.py: account reset + S3 report upload
- tests/setup_finch.sh: Finch installation script
- Updated CONTRIBUTING.md with integration test guidelines

What side effects does this change have?
- Duplicated parameterized cases replaced by test_tier1_* methods (same coverage, no duplication)
- build-arm64 job now sets up QEMU (needed for ARM64 tier1 local invoke)

Mandatory Checklist
PRs will only be reviewed after checklist is complete
- make pr passes
- make update-reproducible-reqs if dependencies were changed

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.