Replace scheduler backend global mutable state with well defined interface and dependency injection #512

Merged
kangclzjc merged 12 commits into ai-dynamo:main from kangclzjc:remove_global_state
May 14, 2026
@kangclzjc

Contributor

What type of PR is this?

/kind feature

What this PR does / why we need it:

Refactor: Replace Scheduler Backend Global State with Dependency Injection
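
As a rough illustration of the pattern this refactor names, here is a minimal sketch of a registry that is constructed once and handed to its consumer through a constructor, instead of being reached through a package-level variable. Every identifier in it (`Backend`, `BackendRegistry`, `NewBackendRegistry`, `Reconciler`, `ResolveBackend`, `staticBackend`) is hypothetical and chosen for illustration; it is not the code changed in this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// Backend is a hypothetical, minimal scheduler backend interface.
type Backend interface {
	Name() string
}

// ErrBackendNotFound is returned when no backend is registered under a name.
var ErrBackendNotFound = errors.New("scheduler backend not found")

// BackendRegistry owns the set of backends. It is populated once in the
// constructor and never mutated afterwards, so it needs no package-level state.
type BackendRegistry struct {
	backends map[string]Backend
}

// NewBackendRegistry builds a registry and rejects duplicate backend names.
func NewBackendRegistry(backends ...Backend) (*BackendRegistry, error) {
	r := &BackendRegistry{backends: make(map[string]Backend, len(backends))}
	for _, b := range backends {
		if _, dup := r.backends[b.Name()]; dup {
			return nil, fmt.Errorf("duplicate scheduler backend %q", b.Name())
		}
		r.backends[b.Name()] = b
	}
	return r, nil
}

// Get looks up a backend by name.
func (r *BackendRegistry) Get(name string) (Backend, error) {
	b, ok := r.backends[name]
	if !ok {
		return nil, fmt.Errorf("%w: %q", ErrBackendNotFound, name)
	}
	return b, nil
}

// Reconciler is a hypothetical consumer. It receives the registry as a
// constructor argument (dependency injection) rather than reading a global.
type Reconciler struct {
	registry *BackendRegistry
}

// NewReconciler injects the registry into the consumer.
func NewReconciler(registry *BackendRegistry) *Reconciler {
	return &Reconciler{registry: registry}
}

// ResolveBackend returns the name of the backend registered under name.
func (r *Reconciler) ResolveBackend(name string) (string, error) {
	b, err := r.registry.Get(name)
	if err != nil {
		return "", err
	}
	return b.Name(), nil
}

// staticBackend is a trivial Backend used only by this example.
type staticBackend string

func (s staticBackend) Name() string { return string(s) }

func main() {
	registry, err := NewBackendRegistry(staticBackend("backend-a"), staticBackend("backend-b"))
	if err != nil {
		panic(err)
	}
	reconciler := NewReconciler(registry)
	name, err := reconciler.ResolveBackend("backend-a")
	fmt.Println(name, err)
}
```

Because the consumer only sees what it was given, a test can pass in a registry built from stand-in backends without resetting any shared state between cases.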

Which issue(s) this PR fixes:

Fixes #509

Special notes for your reviewer:

Does this PR introduce an API change?


Additional documentation e.g., enhancement proposals, usage docs, etc.:


@kangclzjc marked this pull request as ready for review April 8, 2026 01:50
@unmarshall changed the title from "remove global variables" to "Replace scheduler backend global mutable state with well defined interface and dependency injection" Apr 8, 2026
@unmarshall self-assigned this Apr 8, 2026
@unmarshall added the enhancement, area/quality, and area/test labels Apr 8, 2026
@copy-pr-bot

copy-pr-bot Bot commented Apr 8, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

unmarshall previously approved these changes Apr 10, 2026
unmarshall previously approved these changes May 8, 2026
@unmarshall force-pushed the remove_global_state branch from 220debf to ac77c50 May 8, 2026 06:19
@unmarshall self-requested a review May 9, 2026 16:26
unmarshall previously approved these changes May 9, 2026
@kangclzjc force-pushed the remove_global_state branch from a92a1c9 to 5f31516 May 13, 2026 11:23
kangclzjc and others added 12 commits May 14, 2026 07:59
Signed-off-by: kangclzjc <kangz@nvidia.com>
Signed-off-by: kangclzjc <kangz@nvidia.com>
Signed-off-by: kangclzjc <kangz@nvidia.com>
* Removed nil pointer checks in the scheduler registry code.
* Introduced FakeSchedulerBackend to be used in unit tests.
* Minor changes in scheduler registry implementation.
* Fixed tests after nil pointer checks were removed.

Signed-off-by: Madhav Bhargava <madhav.bhargava@sap.com>
Signed-off-by: kangclzjc <kangz@nvidia.com>
… backoff and reduce RU flakiness

Signed-off-by: Kang Zhang <kangz@nvidia.com>
Signed-off-by: Kang Zhang <kangz@nvidia.com>
…e errors and unify with ERR_REQUEUE_AFTER

Signed-off-by: Kang Zhang <kangz@nvidia.com>
Signed-off-by: kangclzjc <kangz@nvidia.com>
Signed-off-by: Kang Zhang <kangz@nvidia.com>
* Added GetOrDefault and AllTopologyAware to scheduler.Registry type.
* Changed the clustertopology controller to use only TAS-enabled backends.

Signed-off-by: Madhav Bhargava <madhav.bhargava@sap.com>
Signed-off-by: Kang Zhang <kangz@nvidia.com>
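
The commit messages above name `GetOrDefault` and `AllTopologyAware` on `scheduler.Registry`, and a `FakeSchedulerBackend` for unit tests, without showing their signatures. The sketch below is one plausible shape for them, not the merged implementation: the `SchedulerBackend` interface, the `NewRegistry` constructor, the struct fields, and every signature are assumptions. Validating the default backend in the constructor is one way lookups could stay free of nil checks.

```go
package main

import (
	"fmt"
	"sort"
)

// SchedulerBackend is an assumed interface; the real one is not shown in this PR page.
type SchedulerBackend interface {
	Name() string
	TopologyAware() bool
}

// FakeSchedulerBackend is a test double. The name comes from the commit
// message; its fields are assumptions made for this example.
type FakeSchedulerBackend struct {
	BackendName string
	TAS         bool // whether the fake reports topology-aware scheduling support
}

func (f *FakeSchedulerBackend) Name() string        { return f.BackendName }
func (f *FakeSchedulerBackend) TopologyAware() bool { return f.TAS }

// Registry holds backends keyed by name plus the name of the default one.
type Registry struct {
	backends    map[string]SchedulerBackend
	defaultName string
}

// NewRegistry (hypothetical) fails fast if the default backend is not among
// the supplied backends, so GetOrDefault never has to return nil.
func NewRegistry(defaultName string, backends ...SchedulerBackend) (*Registry, error) {
	r := &Registry{
		backends:    make(map[string]SchedulerBackend, len(backends)),
		defaultName: defaultName,
	}
	for _, b := range backends {
		r.backends[b.Name()] = b
	}
	if _, ok := r.backends[defaultName]; !ok {
		return nil, fmt.Errorf("default scheduler backend %q is not registered", defaultName)
	}
	return r, nil
}

// GetOrDefault returns the named backend, or the default when the name is
// empty or unknown (assumed semantics).
func (r *Registry) GetOrDefault(name string) SchedulerBackend {
	if b, ok := r.backends[name]; ok {
		return b
	}
	return r.backends[r.defaultName]
}

// AllTopologyAware returns the backends that report topology awareness,
// sorted by name so the result does not depend on map iteration order.
func (r *Registry) AllTopologyAware() []SchedulerBackend {
	var out []SchedulerBackend
	for _, b := range r.backends {
		if b.TopologyAware() {
			out = append(out, b)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name() < out[j].Name() })
	return out
}

func main() {
	registry, err := NewRegistry("default",
		&FakeSchedulerBackend{BackendName: "default"},
		&FakeSchedulerBackend{BackendName: "tas-backend", TAS: true},
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(registry.GetOrDefault("").Name())
	for _, b := range registry.AllTopologyAware() {
		fmt.Println(b.Name())
	}
}
```

Under these assumptions, a controller that cares only about topology could iterate over `AllTopologyAware()` and ignore every other backend.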
@kangclzjc force-pushed the remove_global_state branch from 36314c4 to 1213809 May 13, 2026 23:59
@kangclzjc merged commit 09463f2 into ai-dynamo:main May 14, 2026
15 checks passed
Labels

area/quality: Output qualification (tests, checks, scans, automation in general, etc.) related
area/test: Issue/PR is for correcting, enhancing test or its related frameworks
enhancement: New feature or request
run-e2e

Development

Successfully merging this pull request may close these issues.

Refine global mutable state in scheduler backend framework

3 participants