Adding integration test for GKE A4X #4828

Merged
vikramvs-gg merged 1 commit into GoogleCloudPlatform:develop from vikramvs-gg:a4x_intgr_test on Nov 11, 2025

@vikramvs-gg

Contributor

This PR adds a suite of integration tests for the A4X GKE blueprint, including validation of post-deployment functionality. The tests cover critical aspects of A4X cluster operation:

  • SMI Job Post-Deploy Test: Verifies that an SMI job (apparently nvidia-smi, the NVIDIA System Management Interface) runs successfully and behaves as expected on the deployed GKE cluster.
  • NCCL Test: Ensures proper functionality and performance of NCCL (NVIDIA Collective Communications Library) for inter-GPU communication, crucial for high-performance workloads.
  • Kueue Configuration Test: Validates the correct setup and operation of Kueue, ensuring proper workload scheduling and resource management within the A4X environment.

These additions improve the robustness of our A4X GKE deployments by providing automated verification of key features and configurations.
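
For illustration only, a post-deploy check of the kind listed above could come down to inspecting the status of a Kubernetes Job after it runs. The sketch below is not taken from this PR: the function name `job_succeeded` and the job name `smi-job` are hypothetical, and the sample JSON only mimics the shape of `kubectl get job <name> -o json` output. It relies on the standard Job status field `status.conditions`, where a finished Job carries a condition of type `Complete` with status `"True"`.

```python
import json


def job_succeeded(job: dict) -> bool:
    """Return True if a Kubernetes Job object reports a Complete condition."""
    # `conditions` may be absent or null while the Job is still running.
    conditions = job.get("status", {}).get("conditions") or []
    return any(
        c.get("type") == "Complete" and c.get("status") == "True"
        for c in conditions
    )


# Hypothetical sample shaped like `kubectl get job <name> -o json` output.
sample = json.loads("""
{
  "metadata": {"name": "smi-job"},
  "status": {
    "succeeded": 2,
    "conditions": [{"type": "Complete", "status": "True"}]
  }
}
""")

print(job_succeeded(sample))
```

A real test would fetch the Job from the cluster and would likely also assert on the Job's logs; this sketch covers only the completion check.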

vikramvs-gg added the release-improvements label (Added to release notes under the "Improvements" heading) on Nov 10, 2025
vikramvs-gg marked this pull request as ready for review on November 10, 2025 12:55
vikramvs-gg requested review from a team and samskillman as code owners on November 10, 2025 12:55
vikramvs-gg added the test-enhancement label (Tests enhancement or coverage improvement) on Nov 10, 2025

SwarnaBharathiMantena left a comment

Contributor

LGTM. Thanks for highlighting the A4X-specific requirements for Kueue and TAS checks in the PR description.

vikramvs-gg merged commit 92f1d92 into GoogleCloudPlatform:develop on Nov 11, 2025
13 of 67 checks passed
Labels

release-improvements: Added to release notes under the "Improvements" heading.
test-enhancement: Tests enhancement or coverage improvement

2 participants