Skip to content

e2e tests gang scheduling#242

Merged
gflarity merged 15 commits into
ai-dynamo:mainfrom
gflarity:gflarity/e2e_tests_gang_scheduling
Nov 20, 2025
Merged

e2e tests gang scheduling#242
gflarity merged 15 commits into
ai-dynamo:mainfrom
gflarity:gflarity/e2e_tests_gang_scheduling

Conversation

@gflarity

@gflarity gflarity commented Nov 3, 2025

Copy link
Copy Markdown
Contributor

What type of PR is this?

Tests for gang scheduling as per the 'Grove Multi-Node QA Test Plan'.

What this PR does / why we need it:

Implements e2e tests for gang scheduling test scheduling.

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a API change?

NONE

Additional documentation e.g., enhancement proposals, usage docs, etc.:


@gflarity gflarity force-pushed the gflarity/e2e_tests_gang_scheduling branch from b0f7495 to 6ca3861 Compare November 3, 2025 19:17

@shayasoolin shayasoolin left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the infra code (shared_cluster.go and setup.go) changes, and the first new test in gang_scheduling_test.go.
I stopped there since my main comment is that I strongly suggest to invest time on having as many utility test functions as possible, and then reuse them for the test cases themselves.
We'll not just gain reusability and make it easier to maintain, but the test cases themselves will get much clearer as they will show as high-level function calls that look ideally as the comment above the test case: a list of steps.
Since this requires a non-trivial refactoring, I prefer to review the test logic itself after it's done. Thanks!

Comment thread operator/e2e/tests/setup.go Outdated
Comment thread operator/e2e/tests/setup.go Outdated
Comment thread operator/e2e/tests/gang_scheduling_test.go Outdated
Comment thread operator/e2e/tests/gang_scheduling_test.go Outdated
Comment thread operator/e2e/tests/gang_scheduling_test.go Outdated
Comment thread operator/e2e/tests/gang_scheduling_test.go
@gflarity gflarity force-pushed the gflarity/e2e_tests_gang_scheduling branch from 94db608 to dfc20b3 Compare November 4, 2025 21:43
@gflarity

gflarity commented Nov 4, 2025

Copy link
Copy Markdown
Contributor Author

Thanks @shayasoolin. I abstracted as much as I could into utility functions. PTAL.

shayasoolin
shayasoolin previously approved these changes Nov 5, 2025
Comment thread operator/e2e/tests/setup.go Outdated
Comment thread operator/e2e/tests/setup.go Outdated
@gflarity

gflarity commented Nov 6, 2025

Copy link
Copy Markdown
Contributor Author

@shayasoolin @sanjaychatterjee updated to incorporate last suggestions. PTAL.

shayasoolin
shayasoolin previously approved these changes Nov 6, 2025
@gflarity gflarity force-pushed the gflarity/e2e_tests_gang_scheduling branch from ac591b6 to 52e7b92 Compare November 6, 2025 15:08
Comment thread operator/e2e/tests/setup.go Outdated
Comment thread operator/e2e/tests/gang_scheduling_test.go Outdated
julienmancuso
julienmancuso previously approved these changes Nov 18, 2025

@julienmancuso julienmancuso left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 nits, otherwise lgtm. Great job

Signed-off-by: Geoff Flarity <gflarity@nvidia.com>
Signed-off-by: Geoff Flarity <gflarity@nvidia.com>
Signed-off-by: Geoff Flarity <gflarity@nvidia.com>
Signed-off-by: Geoff Flarity <gflarity@nvidia.com>
Signed-off-by: Geoff Flarity <gflarity@nvidia.com>
Signed-off-by: Geoff Flarity <gflarity@nvidia.com>
Signed-off-by: Geoff Flarity <gflarity@nvidia.com>
Signed-off-by: Geoff Flarity <gflarity@nvidia.com>
Signed-off-by: Geoff Flarity <gflarity@nvidia.com>
Signed-off-by: Geoff Flarity <gflarity@nvidia.com>
Signed-off-by: Geoff Flarity <gflarity@nvidia.com>
Signed-off-by: Geoff Flarity <gflarity@nvidia.com>
Signed-off-by: Geoff Flarity <gflarity@nvidia.com>
Signed-off-by: Geoff Flarity <gflarity@nvidia.com>
Signed-off-by: Geoff Flarity <gflarity@nvidia.com>
@gflarity gflarity force-pushed the gflarity/e2e_tests_gang_scheduling branch from 049b9a6 to 4cf27c8 Compare November 19, 2025 20:03
@gflarity gflarity merged commit 0617444 into ai-dynamo:main Nov 20, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants