fix: propagate PodCliqueSet annotations to PodGang#573
Conversation
ebffd7d to
15f5f1e
Compare
|
this fixes the KAI queue issue, but are we intentionally making “all PCS annotations propagate to PodGang” part of the API contract? for scheduler queue annotations that seems right. my only concern is the broader blast radius: PCS annotations may include user/tooling metadata that another controller or admission webhook did not expect on PodGang. should this be narrowed to known scheduler/backend annotations, or should the PR explicitly document that PodGang inherits PCS annotations and topology-derived annotations win on collisions? |
IMO it should inherit the PCS annotations, cherry-picking annotations/label could never end |
See review comment, I found an issue with lo.Assign, I think mirror is ok but we should protect grove.io/* annotations. |
15f5f1e to
4e99dd4
Compare
buildResource was overwriting PodGang labels and annotations rather than merging into them. Because buildResource runs inside controllerutil.CreateOrPatch, on update paths the PodGang arrives with existing metadata (e.g. set by admission webhooks or other controllers); the previous code wiped these. Use lo.Assign to merge: - pg.Labels gets existing labels merged with the operator-managed labels from getLabels(pcs.Name). - pg.Annotations gets existing annotations merged with pcs.Annotations and the topology annotation, with reconciliation-derived values taking precedence on key collisions. This propagates scheduler-queue annotations such as nvidia.com/kai-scheduler-queue from the PodCliqueSet to the PodGang so kai-scheduler can resolve the queue. Tests cover both create and update paths, with TAS enabled and disabled, and assert exact map equality so stray entries are caught. Fixes ai-dynamo#490 Co-authored-by: Ron Kahn <122778260+Ronkahn21@users.noreply.github.com> Co-authored-by: Gerald Flarity <gflarity@users.noreply.github.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Dan Bar <dan.bar@run.ai>
4e99dd4 to
3117407
Compare
What type of PR is this?
/kind bug
What this PR does / why we need it:
Rebased and review-addressed continuation of #500 (original author: @w1t, with feedback from @Ronkahn21 and @gflarity). #500 went stale after
mainmoved on; opening a fresh PR from a feature branch since the original was filed from the author's fork'smain, which prevented an in-place rebase.Problem
When a
PodCliqueSetcarries scheduler annotations such asnvidia.com/kai-scheduler-queue, the resultingPodGangis created without them. kai-scheduler reads queue assignment from thePodGang, so it raisesQueueDoesNotExisteven though the queue exists and was validated by the operator.Additionally,
buildResourcewas overwritingpg.Labelsandpg.Annotationson every reconcile. BecausebuildResourceruns insidecontrollerutil.CreateOrPatch, on update paths thePodGangarrives with existing metadata (e.g. set by admission webhooks or other controllers), and that metadata was being wiped.Fix
In
buildResource:pg.Labels = lo.Assign(pg.Labels, getLabels(pcs.Name))— preserves existing labels while adding operator-managed labels.pg.Annotations = lo.Assign(pg.Annotations, pcs.Annotations, topologyAnnotations)— preserves existing annotations, propagates PCS annotations, and lets reconciliation-derived topology annotations take precedence on key collisions.Tests
TestBuildResourcecovers four cases with strict map equality (so stray entries are caught):PodGangsurviveWhich issue(s) this PR fixes:
Fixes #490
Supersedes #500.
Does this PR introduce an API change?
NONE