refactor(test): restructure integration tests with tiered CI, parallel jobs, and Finch cross-platform smoke tests by roger-zhangg · Pull Request #8666 · aws/aws-sam-cli

roger-zhangg · 2026-02-19T09:02:36Z

Which issue(s) does this change fix?

Addresses integration test CI performance and reliability issues. No specific issue number.

Why is this change necessary?

The integration test workflow was slow (~2+ hours), flaky due to cross-worker Docker interference, and lacked cross-platform validation. The container_runtime matrix (docker/finch/no-container) created too many combinations, and tests that required AWS credentials were mixed with local-only tests.

How does it address the issue?

1. Workflow restructuring (21 parallel matrix jobs)

- Removed container_runtime matrix dimension (docker/finch/no-container)
- Split large jobs into smaller parallel ones: build-x86-1/2, build-arm64, build-x86-container-1/2, build-arm64-container-1/2, terraform-build/start-api/invoke-start-lambda, package, deploy, sync-code/watch, local-invoke/start-api/start-lambda, other-and-e2e, cloud-based-tests, tier1-finch
- Local-only jobs release test accounts early after ECR login

2. Credential-based test separation

- @pytest.mark.requires_credential marker for tests needing AWS credentials
- Build/local jobs use -m "not requires_credential" to exclude cloud tests
- cloud-based-tests job collects them via -m requires_credential
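
To illustrate how the two marker expressions split the suite, here is a minimal sketch in plain Python. It only mimics the selection and is not pytest's actual implementation. The test names and the `select` helper are hypothetical; only the `requires_credential` marker name comes from this PR.

```python
from typing import Dict, List, Set

# Hypothetical test names mapped to the markers attached to them.
TESTS: Dict[str, Set[str]] = {
    "test_build_python": set(),
    "test_local_invoke": set(),
    "test_deploy_guided": {"requires_credential"},
    "test_layer_build": {"requires_credential"},
}


def select(tests: Dict[str, Set[str]], marker: str, negate: bool = False) -> List[str]:
    """Return test names that carry `marker` (or that lack it when negate=True)."""
    return sorted(name for name, marks in tests.items() if (marker in marks) != negate)


# Roughly what -m "not requires_credential" keeps in build/local jobs.
local_jobs = select(TESTS, "requires_credential", negate=True)
# Roughly what -m requires_credential keeps in the cloud-based-tests job.
cloud_job = select(TESTS, "requires_credential")

print(local_jobs)
print(cloud_job)
```

Because one expression is the negation of the other, every test lands in exactly one of the two groups.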

3. Docker container cleanup isolation

- Scoped container cleanup to per-test-class snapshots
- Added xdist_group markers: durable, docker_images, remote_layers, docker_watcher, lambda_layers
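
The snapshot idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `snapshot_container_ids` and `remove_new_containers` are hypothetical helper names, and the `Fake*` classes stand in for a Docker client so the sketch runs without a daemon. The calls `client.containers.list(all=True)` and `container.remove(force=True)` follow the Docker SDK for Python.

```python
from typing import Dict, List, Set


class FakeContainer:
    """Stand-in for a Docker SDK container object (assumption for this sketch)."""

    def __init__(self, container_id: str, store: Dict[str, "FakeContainer"]) -> None:
        self.id = container_id
        self._store = store

    def remove(self, force: bool = False) -> None:
        self._store.pop(self.id, None)


class FakeContainerCollection:
    def __init__(self) -> None:
        self._store: Dict[str, FakeContainer] = {}

    def start_fake(self, container_id: str) -> None:
        # Hypothetical helper: pretend a container was started.
        self._store[container_id] = FakeContainer(container_id, self._store)

    def list(self, all: bool = False) -> List[FakeContainer]:
        return list(self._store.values())


class FakeDockerClient:
    def __init__(self) -> None:
        self.containers = FakeContainerCollection()


def snapshot_container_ids(client) -> Set[str]:
    """Record the containers that exist before the test class runs."""
    return {c.id for c in client.containers.list(all=True)}


def remove_new_containers(client, pre_existing_ids: Set[str]) -> List[str]:
    """Remove only containers that were not present in the snapshot."""
    removed = []
    for container in client.containers.list(all=True):
        if container.id not in pre_existing_ids:
            try:
                container.remove(force=True)
                removed.append(container.id)
            except Exception:
                pass  # best-effort cleanup; the container may already be gone
    return sorted(removed)


client = FakeDockerClient()
client.containers.start_fake("pre-existing")
snapshot = snapshot_container_ids(client)      # e.g. taken in setUpClass
client.containers.start_fake("created-by-test")
removed = remove_new_containers(client, snapshot)  # e.g. called in tearDownClass
remaining = snapshot_container_ids(client)
print(removed, remaining)
```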

4. Tier 1 cross-platform smoke tests (~60 curated tests)

A curated subset marked with @pytest.mark.tier1 runs on every OS/container-runtime combination via the tier1-finch job. Each test is a dedicated test_tier1_* method calling existing logic with one specific parameter set.

| Category | Test | File |
| --- | --- | --- |
| Build: Python | `test_tier1_python_build` | test_build_cmd_python.py |
| Build: Python (container) | `test_tier1_python_build_in_container` | test_build_cmd_python.py |
| Build: Java | `test_tier1_java_build` | test_build_cmd_java.py |
| Build: Java (container) | `test_tier1_java_build_in_container` | test_build_cmd_java.py |
| Build: Node.js | `test_tier1_node_build` | test_build_cmd_node.py |
| Build: Node.js (container) | `test_tier1_node_build_in_container` | test_build_cmd_node.py |
| Build: .NET | `test_tier1_dotnet_build` | test_build_cmd_dotnet.py |
| Build: .NET (container) | `test_tier1_dotnet_build_in_container` | test_build_cmd_dotnet.py |
| Build: Ruby | `test_building_ruby_3_2` (parameterized) | test_build_cmd.py |
| Build: Rust | `test_tier1_rust_build` | test_build_cmd_rust.py |
| Build: Provided | `test_tier1_provided_build` | test_build_cmd_provided.py |
| Build: Provided (container) | `test_tier1_provided_build_in_container` | test_build_cmd_provided.py |
| Build: Nested stacks | `test_nested_build_invoke_in_container` | test_build_cmd.py |
| Build: Symlink | `TestBuildWithNestedStacks3LevelWithSymlink` | test_build_cmd.py |
| Build: Samconfig | `test_samconfig_parameters_are_overridden` | test_build_samconfig.py |
| Build: Terraform | `test_build_and_invoke_lambda_functions` | test_build_terraform_applications.py |
| Build: Layer | `test_tier1_layer_build` | test_build_cmd.py |
| ARM64: Python | `test_tier1_python_arm64_build` | test_build_cmd_arm64.py |
| ARM64: Java | `test_tier1_java_arm64_build` | test_build_cmd_arm64.py |
| ARM64: Node.js | `test_tier1_node_arm64_build` | test_build_cmd_arm64.py |
| ARM64: Ruby | `test_tier1_ruby_arm64_build` | test_build_cmd_arm64.py |
| ARM64: Provided | `test_tier1_provided_arm64_build` | test_build_cmd_arm64.py |
| ARM64: Rust | `test_tier1_rust_arm64_build` | test_build_cmd_arm64.py |
| local invoke | `test_invoke_returncode_is_zero` | test_integrations_cli.py |
| local invoke (layers) | `test_local_zip_layers` | test_integrations_cli.py |
| local invoke (durable) | `test_tier1_durable_invoke` | test_invoke_durable.py |
| local start-api | `test_calling_proxy_endpoint` | test_start_api.py |
| local start-lambda | `test_invoke_with_data` | test_start_lambda.py |
| local generate-event | `test_generate_event_substitution` | test_cli_integ.py |
| local callback | `test_tier1_callback` | test_callback.py |
| local execution | `test_tier1_execution` | test_execution.py |
| sam init | `test_init_command_passes_and_dir_created` | test_init_command.py |
| sam validate | `test_default_template_file_choice` | test_validate_command.py |
| sam deploy | `test_deploy_guided_zip` | test_deploy_command.py |
| sam delete | `test_tier1_delete` | test_delete_command.py |
| sam package | `test_tier1_package` | test_package_command_zip.py |
| sam sync | `test_tier1_sync_infra` | test_sync_infra.py |
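
The pattern of a dedicated `test_tier1_*` method next to a parameterized test might look roughly like the sketch below. It is an illustration only: `run_build`, the class name, and the runtime values are hypothetical stand-ins, and the real tests in this PR may be organized differently. The `tier1` marker name and the `_in_container` suffix come from the PR.

```python
import pytest


def run_build(runtime: str, use_container: bool) -> dict:
    """Hypothetical stand-in for the shared build-and-verify logic."""
    return {"runtime": runtime, "use_container": use_container, "ok": True}


class TestBuildPython:
    # Full parameterized coverage, run only in the regular build jobs.
    @pytest.mark.parametrize("runtime", ["python3.11", "python3.12"])
    def test_python_build(self, runtime: str) -> None:
        assert run_build(runtime, use_container=False)["ok"]

    # One fixed parameter set, selected on every platform via -m tier1.
    @pytest.mark.tier1
    def test_tier1_python_build(self) -> None:
        assert run_build("python3.12", use_container=False)["ok"]

    # The _in_container suffix lets a -k filter pick container variants.
    @pytest.mark.tier1
    def test_tier1_python_build_in_container(self) -> None:
        assert run_build("python3.12", use_container=True)["ok"]
```

Under these assumptions, `pytest -m tier1` would select the two dedicated methods and skip the parameterized cases. Custom markers such as `tier1` normally need to be registered in the pytest configuration to avoid unknown-marker warnings.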

5. Test fixes and improvements

- Fixed verify_pulled_image runtime mismatch (python3.12 → python3.11)
- Fixed EventBridge schema registry tests with dynamic position lookup
- Updated credential test runtimes (dotnet10, java25, python3.12, ruby3.4, nodejs22.x)
- Made delete test robust against CloudFormation EarlyValidation failures
- Fixed warm container SIGTERM test assertion

6. Infrastructure

- tests/setup_testing_resources.py: credential setup script
- tests/reset_testing_resources.py: account reset + S3 report upload
- tests/setup_finch.sh: Finch installation script
- Updated CONTRIBUTING.md with integration test guidelines

What side effects does this change have?

- Some parameterized test cases moved to dedicated test_tier1_* methods (same coverage, no duplication)
- Finch tests run as a new matrix entry (~15 min, parallel with other jobs)
- build-arm64 job now sets up QEMU (needed for ARM64 tier1 local invoke)

Mandatory Checklist

PRs will only be reviewed after checklist is complete

Review the generative AI contribution guidelines
Add input/output type hints to new functions/methods
Write design document if needed
Write/update unit tests
Write/update integration tests
Write/update functional tests if needed
make pr passes
make update-reproducible-reqs if dependencies were changed
Write documentation

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

… add test tiering, scope Docker cleanup
- Remove container_runtime matrix (docker/finch/no-container) in favor of explicit test_suite entries
- Split build tests into build-x86, build-arm64, build-x86-container, build-arm64-container
- Merge local-start1/local-start2 into local-start with -n 2 parallelism
- Add cloud-based-tests job using @pytest.mark.requires_credential marker
- Skip credential-requiring tests (SAR, layers, STS) in local-only jobs via SAM_CLI_NO_CREDENTIALS
- Scope Docker container/image cleanup to per-test-class snapshots to prevent cross-worker interference
- Extract credential setup/reset into setup_testing_resources.py and reset_testing_resources.py
- Update credential test runtimes: dotnet10, java25, python3.12, ruby3.4, nodejs22.x, provided.al2023 for Go
- Remove Free up disk space step
- Add uv availability check in Makefile init

…dates, schema registry lookup
- Add requires_credential marker to terraform S3 backend and layer tests
- Exclude requires_credential tests from terraform job, run in cloud-based-tests
- Remove AWS credentials from terraform job (no longer needed)
- Add Terraform install to cloud-based-tests job
- Fix java25 dir rename and pom.xml compiler version
- Add go.sum for Go STS credential test
- Remove global Docker image cleanup from WarmContainersRemoteLayers tests
- Add xdist_group markers for durable tests (port 9014) and rapid image tests
- Use --dist loadgroup for local-invoke and local-start jobs
- Fix EventBridge schema registry tests with dynamic position lookup
- Update CONTRIBUTING.md with integration test guidelines

… jobs
- All jobs now get credentials unconditionally (remove conditional OIDC/ECR/reset)
- Use -m "not requires_credential" in build/local jobs instead of SAM_CLI_NO_CREDENTIALS env var
- Remove SKIP_CREDENTIAL_TESTS and skipIf patterns from test files
- Remove requires_credential from terraform tests (run in terraform jobs directly)
- Split terraform into terraform-build (-n 4) and terraform-local (sequential)
- Mark TestBuildCommand_LayerBuilds as requires_credential
- Add xdist groups: durable (callback, execution), remote_layers, docker_watcher
- Add TestSamPython36HelloWorldIntegrationImages to docker_images group
- Update CONTRIBUTING.md with simplified marker-only approach

…st groups
- Fix verify_pulled_image/verify_docker_container_cleanedup to use python3.11 matching the actual template runtimes (was incorrectly hardcoded to python3.12)
- Make test_delete_no_prompts_with_s3_prefix_present_zip robust against deploy failures (CloudFormation EarlyValidation hooks)
- Add xdist groups for TestSamPython36HelloWorldIntegrationImages (docker_images), TestLocalCallback/TestLocalExecution (durable), WarmContainersRemoteLayers* (remote_layers), Watching*Image*/DockerFileLocation* (docker_watcher)
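
As a rough sketch of how an xdist group is attached, the snippet below marks two classes with the same `durable` group. The class names and the group name come from the commit message above; the test bodies are placeholders, and the real classes may apply the marker differently.

```python
import pytest


# Classes sharing a group name are scheduled on the same pytest-xdist worker
# when the run uses --dist loadgroup, so they do not run concurrently.
@pytest.mark.xdist_group(name="durable")
class TestLocalCallback:
    def test_callback(self) -> None:
        assert True  # placeholder body for this sketch


@pytest.mark.xdist_group(name="durable")
class TestLocalExecution:
    def test_execution(self) -> None:
        assert True  # placeholder body for this sketch
```

A run such as `pytest -n 2 --dist loadgroup` would then keep both classes on one worker, which is one way to avoid two tests competing for a shared resource such as a fixed port.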

- Split local-start into local-start-api and local-start-lambda
- Split sync into sync-code (-n 2) and sync-watch (sequential)
- Split package-delete-deploy into package (+ delete) and deploy
- Remove Docker image cleanup from layer tests (images persist, reused)
- Add lambda_layers xdist group to all layer test classes
- Add flaky(reruns=3) at class level for TestLayerVersion
- Update Node.js STS SDK to ^3.700.0 (fix @smithy/protocol-http missing)

- build-x86 -> build-x86-1 (general) + build-x86-2 (language-specific)
- build-x86-container -> build-x86-container-1 + build-x86-container-2
- build-arm64-container -> build-arm64-container-1 (non-java) + build-arm64-container-2 (java)
- terraform-local -> terraform-start-api + terraform-invoke-start-lambda
- Update all setup step conditions for new job names
- Remove samcli image cleanup from layer tests, add lambda_layers xdist group

… test files are auto-included

- Move test_build_cmd_python.py and test_build_cmd_java.py to build-x86-1/container-1
- Move test_sync_build_in_source.py from sync-code to sync-watch

…ts as requires_credential
- Add early reset step after ECR login for build/local jobs to free test account
- Skip final reset for jobs that already released early
- Mark TestSamPython36HelloWorldIntegrationImages and TestDeleteOldRapidImages as requires_credential

…iner assertion
- Merge S3 report upload into reset_testing_resources.py (always uploads, conditionally resets)
- Use SKIP_ACCOUNT_RESET env var for local-only jobs instead of workflow condition
- Delete standalone upload_test_report.py
- Fix TestWarmContainersHandlesSigTermInterrupt to use assertGreaterEqual for container count
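
The "always uploads, conditionally resets" behavior could be sketched like this. It is a hedged illustration, not the contents of reset_testing_resources.py: `finish_job` and the two callbacks are hypothetical, and treating `"1"` or `"true"` as the values that enable the skip is an assumption. Only the `SKIP_ACCOUNT_RESET` variable name comes from the commit message.

```python
from typing import Callable, List, Mapping


def finish_job(
    env: Mapping[str, str],
    upload_report: Callable[[], None],
    reset_account: Callable[[], None],
) -> List[str]:
    """Upload the test report, then reset the account unless told to skip."""
    steps: List[str] = []

    upload_report()  # always runs, for every job
    steps.append("upload")

    # Assumption: "1" or "true" (case-insensitive) means skip the reset.
    skip_reset = env.get("SKIP_ACCOUNT_RESET", "").lower() in ("1", "true")
    if not skip_reset:
        reset_account()
        steps.append("reset")
    return steps


def noop() -> None:
    """Placeholder for the real upload/reset work."""


# A local-only job that already released its account early.
local_steps = finish_job({"SKIP_ACCOUNT_RESET": "true"}, noop, noop)
# A cloud job that still holds its test account.
cloud_steps = finish_job({}, noop, noop)
print(local_steps, cloud_steps)
```

Passing the environment in as a mapping keeps the decision easy to exercise without touching `os.environ`.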

- Add @pytest.mark.tier1 to 23 test classes across all feature areas
- Fix S3 report upload: use configure-aws-credentials role-chaining for OIDC->RoleA->RoleB
- Simplify upload_test_reports to use default credentials (set by workflow)
- Add tier1 markers for: durable, layers, sync, deploy, terraform, regression, callback, execution
- Update TIER1_TESTS.md with complete coverage table

…ions)

…ust import
- Add tier1-finch to matrix with conditional Finch setup, ECR login, and runtime
- Move setup_finch.sh from scripts/ to tests/
- Add tier1-finch to all toolchain setup conditions
- Remove separate tier1-finch job (now part of matrix)
- Fix missing pytest import in test_build_cmd_rust.py

…o conflict

… cases, add missing commands
- Move tier1 from class-level to method-level with dedicated test_tier1_* methods
- Add container + non-container tier1 for each runtime (Python, Java, Node, Dotnet, Rust, Provided)
- Remove duplicated parameterized cases covered by tier1 methods
- Add tier1 for: sam delete, sam package, symlink builds, layer builds, sam sync
- Fix missing pytest imports in delete and package tests
- Revert cargo-lambda to pip install
- Update CONTRIBUTING.md and TIER1_TESTS.md

- Fix tier1 methods that called parameterized methods (inline logic instead)
- Fix dotnet validate_build_command params (mode=None, use_container=True)
- Fix Python tier1 skip condition to check template and codeuri
- Add ARM64 tier1 build tests: Python, Java, Node, Ruby, Provided, Rust
- Update TIER1_TESTS.md with ARM64 section

…ams, skip ARM64 without Docker
- Restore test_tier1_rust_build_in_container (was accidentally deleted)
- Fix dotnet container tier1 to use dotnet8 (dotnet10 container image not available)
- Fix layer tier1 to use correct overrides (LayerContentUri, python3.11)
- Add skipIf(SKIP_DOCKER_TESTS) to ARM64 tier1 tests (need Docker for invoke)
- All container tier1 methods have _in_container suffix for -k filter compatibility

- Dotnet container: use mount_mode=MountMode.WRITE (not use_container=True)
- Layer build: use python3.12 (matching original test_build_single_layer)
- Rust: remove container tier1 (cargo-lambda doesn't support container builds)
- These match the exact parameter combinations that pass in the original tests

…emove temp docs
- Revert ARM64 tier1 names (no _in_container since use_container=False)
- Enable QEMU for build-arm64 job (needed for local invoke)
- Fix mypy import error in setup_testing_resources.py
- Remove Rust container tier1 (cargo-lambda doesn't support container builds)
- Fix dotnet container tier1 to use MountMode.WRITE
- Fix layer tier1 to use python3.12
- Remove CLOUD_VS_LOCAL_REPORT.md and TIER1_TESTS.md

roger-zhangg · 2026-02-19T09:04:33Z

https://github.com/aws/aws-sam-cli/actions/runs/22175040297/job/64121459309

…ogin, add skipIf to dotnet tier1

.github/workflows/integration-tests.yml

tests/integration/buildcmd/test_build_cmd.py

vicheey · 2026-02-19T17:38:48Z

tests/integration/local/invoke/test_integrations_cli.py

@@ -854,6 +834,9 @@ def tearDownClass(cls):
    ],
 )
 @skipIf(SKIP_LAYERS_TESTS, "Skip layers tests in Appveyor only")


Does this remain true?

For all the skipIf conditions, let's review them after we have actually moved out.

…al jobs
- Add tests/free_disk_space.sh: cleanup if <25GB free, reduce swap to 1GB, nohup rm
- Mark git function tests and WarmContainersRemoteLayers as requires_credential
- Skip Get testing resources for non-credential jobs, clear AWS creds after ECR login
- Add skipIf(SKIP_DOCKER_TESTS) to dotnet tier1 non-container test

roger-zhangg · 2026-02-19T18:54:38Z

https://github.com/aws/aws-sam-cli/actions/runs/22206731469
latest run

…iering

valerena · 2026-02-19T22:44:26Z

tests/integration/local/start_api/start_api_integ_base.py

+                if container.id not in cls._pre_existing_container_ids:
+                    try:


I guess we're now deleting all the containers that didn't exist when this class started. But in theory if there are things running in parallel, we could still be deleting a container created by a different class, as long as it was created after this class started, right?

So basically the change is that we're just "not deleting the containers that existed when this class started". I guess that makes a difference?

Yes, that's correct. To make it more bulletproof, we would probably need to know exactly which containers were created, but that does not seem easy.
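
The caveat raised in this thread can be shown with plain set arithmetic. This is a sketch with hypothetical container IDs, not code from the PR: a snapshot protects containers that existed before class A started, but not containers that another class creates afterwards on a different worker.

```python
# Containers present before any test class starts.
running = {"unrelated"}

# Class A (worker 1) takes its snapshot in setUpClass.
snapshot_a = set(running)

# Class A starts a container during its tests.
running.add("a-1")

# Class B (worker 2) starts a container after A's snapshot was taken.
running.add("b-1")

# Class A's teardown removes everything that is not in its snapshot.
removed_by_a = running - snapshot_a

print(sorted(removed_by_a))
```

Class A's cleanup set includes `b-1` even though class B owns it, which matches the concern above: the snapshot narrows the blast radius but does not identify ownership.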

valerena · 2026-02-19T23:03:48Z

.github/workflows/integration-tests.yml

      - name: Initialize project
        run: |
-          export CONTAINER_RUNTIME=${{ matrix.container_runtime }}
+          if [[ "${{ matrix.test_suite }}" == "build-x86-1" || "${{ matrix.test_suite }}" == "build-x86-2" || "${{ matrix.test_suite }}" == "build-arm64" ]]; then


Not too important, but we probably want to be consistent if we're doing contains(fromJSON( in all the other ifs here.

…iering

Re-add the samcli/lambda-* Docker image cleanup in tearDown for TestLayerVersionBase and TestLayerVersionThatDoNotCreateCache. This was removed in PR #8666 to avoid cross-test interference in parallel runs, but these test classes are already serialized via xdist_group markers. Without the cleanup, stale cached images cause layer version tests to use outdated layers.

* Restore layer test tearDown image cleanup removed in #8666
* limit chardet < 6 in cargo lambda
* dep
* reuse cleanup_samcli_images
* nit

roger-zhangg added 27 commits February 16, 2026 14:34

test

6bc6762

skip credential setup

96461ef

test

eb9b00d

test

cd3f513

Fix setup step conditions for split jobs (sync -> sync-code/sync-watch)

c7b17bd

Use --ignore pattern for build-x86-2 and build-x86-container-2 so new…

8d2ae5c

… test files are auto-included

Rebalance build splits, move sync_build_in_source to sync-watch

ca2e194

- Move test_build_cmd_python.py and test_build_cmd_java.py to build-x86-1/container-1
- Move test_sync_build_in_source.py from sync-code to sync-watch

Apply autofix formatting to workflow

a6795d0

Apply autofix formatting

17ce3a4

Fix: remove secrets context from if condition (not allowed in express…

1db5b6a

…ions)

Fix: add missing pytest import in test_build_cmd_rust.py

000ada6

Fix Finch install: remove Docker packages first to avoid containerd.i…

a1a9287

…o conflict

roger-zhangg requested a review from a team as a code owner February 19, 2026 09:02

github-actions bot added the pr/internal label Feb 19, 2026

Merge branch 'develop' into test-tiering

14eadd4

Skip credentials for non-credential jobs, clear AWS creds after ECR l…

63d9889

…ogin, add skipIf to dotnet tier1

vicheey reviewed Feb 19, 2026

roger-zhangg added 7 commits February 19, 2026 10:54

Merge branch 'develop' into test-tiering

cd0caca

Merge branch 'develop' into test-tiering

7c9611e

remove duplicates

a5c5d70

Merge branch 'test-tiering' of github.com:aws/aws-sam-cli into test-t…

5cdd277

…iering

remove duplicate, add arm64 test with container

39c9f8a

remove dup

2133d73

Merge branch 'develop' into test-tiering

d165c1d

vicheey previously approved these changes Feb 19, 2026

valerena reviewed Feb 19, 2026

add paramter back, add tier1 extra

02e2b50

roger-zhangg dismissed vicheey’s stale review via 02e2b50 February 20, 2026 00:43

roger-zhangg added 4 commits February 19, 2026 16:50

extra

9c0b0db

change temp to mnt if exist

720944f

Merge branch 'develop' into test-tiering

6b8cb6a

Merge branch 'test-tiering' of github.com:aws/aws-sam-cli into test-t…

a395a2c

…iering

vicheey approved these changes Feb 20, 2026

valerena approved these changes Feb 20, 2026

roger-zhangg added this pull request to the merge queue Feb 20, 2026

Merged via the queue into develop with commit 8b878af Feb 20, 2026
63 of 64 checks passed

roger-zhangg mentioned this pull request Feb 23, 2026

Restore layer test tearDown image cleanup removed in #8666 #8687

Merged

roger-zhangg deleted the test-tiering branch February 27, 2026 23:24

roger-zhangg merged 42 commits into develop from test-tiering

3 participants