Update README.md#1
Merged
saturley-hall
approved these changes
Mar 4, 2025
kylehh
pushed a commit
to kylehh/dynamo
that referenced
this pull request
Apr 11, 2025
copy-pr-bot Bot
pushed a commit
that referenced
this pull request
Sep 9, 2025
* add metrics to disconnect
* fmt
* fmt
Signed-off-by: michaelfeil <me@michaelfeil.eu>
grahamking
pushed a commit
that referenced
this pull request
Sep 10, 2025
ayushag-nv
pushed a commit
that referenced
this pull request
Sep 15, 2025
zhongdaor-nv
pushed a commit
that referenced
this pull request
Sep 15, 2025
elyasmnvidian
added a commit
that referenced
this pull request
Sep 22, 2025
Signed-off-by: Elyas Mehtabuddin <emehtabuddin@nvidia.com>
3 tasks
copy-pr-bot Bot
pushed a commit
that referenced
this pull request
May 29, 2026
Squashed from 8 development commits. See PR description for full context. Infrastructure-only — builtin plugins + dual-path parity tests land in the follow-up PR. Signed-off-by: Kang Zhang <kangz@nvidia.com>
copy-pr-bot Bot
pushed a commit
that referenced
this pull request
May 31, 2026
Squashed from 8 development commits. See PR description for full context. Infrastructure-only — builtin plugins + dual-path parity tests land in the follow-up PR. Signed-off-by: Kang Zhang <kangz@nvidia.com>
copy-pr-bot Bot
pushed a commit
that referenced
this pull request
Jun 2, 2026
``OrchestratorEngineAdapter.__init__`` hard-coded ``WallClock()``, which left no seam for replay paths to propagate trace time into the plugin layer. Plugin scheduler ``_is_due``, CircuitBreaker cooldown, and HOLD_LAST cache age all read ``self._clock.monotonic()``. Under a fast-forward replay (e.g. 1hr trace in <10s real time), wall-clock barely moves and any plugin with ``execution_interval_seconds`` larger than the real-time duration never re-fires after its first call.
This is invisible in PR #1's current ship surface (PR #1 has no builtin plugins, K8s smoke runs in real time, and PSM-only replay goes through ``_PSMEngineAdapter`` instead) but would block PR #10 (``use_orchestrator=True`` default): once orchestrator becomes the default path, mooncake replay must work.
Fix:
- ``OrchestratorEngineAdapter.__init__`` accepts an optional ``clock: Clock`` kwarg, defaulting to ``WallClock`` so production behaviour is unchanged.
- ``engine_adapter.tick()`` bumps the clock to ``tick_input.now_s`` at the start of every tick when a ``VirtualClock`` is in play (``advance(delta)`` only if ``delta > 0``; backwards trace time is a silent no-op rather than a crash).
- ``ReplayPlannerEngine`` constructs a ``VirtualClock`` and passes it to the adapter on the orchestrator path so plugin scheduler sees trace time.
Regression tests in ``test_engine_adapter.py``:
- ``test_tick_advances_injected_virtual_clock_to_trace_time``: drive two ticks at trace time 180s and 360s, assert clock follows.
- ``test_tick_does_not_advance_clock_backwards``: pre-advance the clock past tick_input.now_s, assert no exception and clock stays put.
- ``test_default_clock_is_wallclock``: lock production default so a future refactor that flips it doesn't silently break K8s.
Full planner suite: 830 passed, 1 skipped, 0 failed.
Signed-off-by: Kang Zhang <kangz@nvidia.com>
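The clock-injection idea in this commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the names ``Clock``, ``WallClock``, ``VirtualClock``, ``monotonic()`` and ``advance(delta)`` come from the message, while ``SketchAdapter`` and the bare ``now_s`` argument to ``tick`` are hypothetical stand-ins for ``OrchestratorEngineAdapter`` and ``tick_input.now_s``.

```python
import time
from typing import Optional, Protocol


class Clock(Protocol):
    def monotonic(self) -> float: ...


class WallClock:
    """Reads real elapsed time; assumed to be the production default."""

    def monotonic(self) -> float:
        return time.monotonic()


class VirtualClock:
    """Moves only when advanced explicitly, e.g. by trace timestamps."""

    def __init__(self, start: float = 0.0) -> None:
        self._now = start

    def monotonic(self) -> float:
        return self._now

    def advance(self, delta: float) -> None:
        if delta < 0:
            raise ValueError("a monotonic clock cannot move backwards")
        self._now += delta


class SketchAdapter:
    """Hypothetical stand-in for the adapter described in the commit."""

    def __init__(self, clock: Optional[Clock] = None) -> None:
        # Default to wall-clock time so non-replay callers are unaffected.
        self._clock: Clock = clock if clock is not None else WallClock()

    def tick(self, now_s: float) -> None:
        # Only a virtual clock is synced to trace time.
        if isinstance(self._clock, VirtualClock):
            delta = now_s - self._clock.monotonic()
            if delta > 0:  # backwards trace time is a silent no-op
                self._clock.advance(delta)
```

With an injected ``VirtualClock``, two ticks at 180s and 360s leave the clock at 360s, and a later tick carrying an earlier timestamp leaves it unchanged, which mirrors the three regression tests listed in the message.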
copy-pr-bot Bot
pushed a commit
that referenced
this pull request
Jun 2, 2026
Squashed from 8 development commits. See PR description for full context. Infrastructure-only — builtin plugins + dual-path parity tests land in the follow-up PR. Signed-off-by: Kang Zhang <kangz@nvidia.com>
krishung5
added a commit
that referenced
this pull request
Jun 2, 2026
…g launch summary
Two changes to examples/backends/sglang/launch/agg_multimodal_router.sh:
1. Replace the custom echo banner block with print_launch_banner per
.ai/bash-launch-guidelines.md (matches sibling agg_router.sh:60).
--no-curl is set because our own wait_ready loop later handles the
smoke test, and --multimodal flags the script's nature.
2. Build a KV_EVENTS_PORTS array alongside WORKER_PORTS and reference
both in the summary section, so when the harness sets DYN_SYSTEM_PORT{i}
the printed URLs match what was actually launched (instead of always
showing the default formula). Previously CI logs lied about ports
under dynamic test-port allocation.
Addresses #9561 review: Devin comments #1 + #2, CodeRabbit nitpick on
agg_multimodal_router.sh:65-68.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
krishung5
added a commit
that referenced
this pull request
Jun 2, 2026
- decode_handler._extract_mm_hashes docstring now states 16-char hex for the SGLang path (was incorrectly claiming the vLLM 64-char shape; Devin #5).
- sglang launch banner drops the "Lightseek" prefix; uses "MM Exact Routing (SGLang)" to match the public name (Ryan suggestion #1).
- ModelRuntimeConfig.backend_framework field gains a doc-block on its motivation: frontend uses it for backend-specific routing hints (Ryan #3).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
copy-pr-bot Bot
pushed a commit
that referenced
this pull request
Jun 3, 2026
Squashed from 8 development commits. See PR description for full context. Infrastructure-only — builtin plugins + dual-path parity tests land in the follow-up PR. Signed-off-by: Kang Zhang <kangz@nvidia.com>
copy-pr-bot Bot
pushed a commit
that referenced
this pull request
Jun 3, 2026
``OrchestratorEngineAdapter.__init__`` hard-coded ``WallClock()``, which left no seam for replay paths to propagate trace time into the plugin layer. Plugin scheduler ``_is_due``, CircuitBreaker cooldown, and HOLD_LAST cache age all read ``self._clock.monotonic()``. Under a fast-forward replay (e.g. 1hr trace in <10s real time), wall-clock barely moves and any plugin with ``execution_interval_seconds`` larger than the real-time duration never re-fires after its first call.
This is invisible in PR #1's current ship surface (PR #1 has no builtin plugins, K8s smoke runs in real time, and PSM-only replay goes through ``_PSMEngineAdapter`` instead) but would block PR #10 (``use_orchestrator=True`` default): once orchestrator becomes the default path, mooncake replay must work.
Fix:
- ``OrchestratorEngineAdapter.__init__`` accepts an optional ``clock: Clock`` kwarg, defaulting to ``WallClock`` so production behaviour is unchanged.
- ``engine_adapter.tick()`` bumps the clock to ``tick_input.now_s`` at the start of every tick when a ``VirtualClock`` is in play (``advance(delta)`` only if ``delta > 0``; backwards trace time is a silent no-op rather than a crash).
- ``ReplayPlannerEngine`` constructs a ``VirtualClock`` and passes it to the adapter on the orchestrator path so plugin scheduler sees trace time.
Regression tests in ``test_engine_adapter.py``:
- ``test_tick_advances_injected_virtual_clock_to_trace_time``: drive two ticks at trace time 180s and 360s, assert clock follows.
- ``test_tick_does_not_advance_clock_backwards``: pre-advance the clock past tick_input.now_s, assert no exception and clock stays put.
- ``test_default_clock_is_wallclock``: lock production default so a future refactor that flips it doesn't silently break K8s.
Full planner suite: 830 passed, 1 skipped, 0 failed.
Signed-off-by: Kang Zhang <kangz@nvidia.com>
jthomson04
added a commit
that referenced
this pull request
Jun 4, 2026
The cache-realloc (#1) and metric-handle (#2/#3) commits left a few lines unformatted — `longest_prefix_match`'s collapsed signature and the `merged` binding in l1.rs, and the per-worker gauge assignment + a test assert in metrics.rs. `cargo fmt --all --check` (run by the rust-tests CI job) flagged them, failing the job. No behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
copy-pr-bot Bot
pushed a commit
that referenced
this pull request
Jun 4, 2026
Squashed from 8 development commits. See PR description for full context. Infrastructure-only — builtin plugins + dual-path parity tests land in the follow-up PR. Signed-off-by: Kang Zhang <kangz@nvidia.com>
copy-pr-bot Bot
pushed a commit
that referenced
this pull request
Jun 4, 2026
``OrchestratorEngineAdapter.__init__`` hard-coded ``WallClock()``, which left no seam for replay paths to propagate trace time into the plugin layer. Plugin scheduler ``_is_due``, CircuitBreaker cooldown, and HOLD_LAST cache age all read ``self._clock.monotonic()``. Under a fast-forward replay (e.g. 1hr trace in <10s real time), wall-clock barely moves and any plugin with ``execution_interval_seconds`` larger than the real-time duration never re-fires after its first call.
This is invisible in PR #1's current ship surface (PR #1 has no builtin plugins, K8s smoke runs in real time, and PSM-only replay goes through ``_PSMEngineAdapter`` instead) but would block PR #10 (``use_orchestrator=True`` default): once orchestrator becomes the default path, mooncake replay must work.
Fix:
- ``OrchestratorEngineAdapter.__init__`` accepts an optional ``clock: Clock`` kwarg, defaulting to ``WallClock`` so production behaviour is unchanged.
- ``engine_adapter.tick()`` bumps the clock to ``tick_input.now_s`` at the start of every tick when a ``VirtualClock`` is in play (``advance(delta)`` only if ``delta > 0``; backwards trace time is a silent no-op rather than a crash).
- ``ReplayPlannerEngine`` constructs a ``VirtualClock`` and passes it to the adapter on the orchestrator path so plugin scheduler sees trace time.
Regression tests in ``test_engine_adapter.py``:
- ``test_tick_advances_injected_virtual_clock_to_trace_time``: drive two ticks at trace time 180s and 360s, assert clock follows.
- ``test_tick_does_not_advance_clock_backwards``: pre-advance the clock past tick_input.now_s, assert no exception and clock stays put.
- ``test_default_clock_is_wallclock``: lock production default so a future refactor that flips it doesn't silently break K8s.
Full planner suite: 830 passed, 1 skipped, 0 failed.
Signed-off-by: Kang Zhang <kangz@nvidia.com>
copy-pr-bot Bot
pushed a commit
that referenced
this pull request
Jun 4, 2026
…oder reuse, doc fixes
Low-risk cleanup batch from the independent review (no decision-path change):
- #4 chain_augment: add ``predicted_kv_hit_rate`` to ``_PREDICTION_FIELDS`` so it participates in first-writer-wins partial merge like the other three predicted_* fields (was silently dropped in any 2+ plugin PREDICT chain, contradicting the proto/Pydantic contract). +2 chain_augment tests.
- #10 engine_adapter: add ``scale_down_capped_by_throughput`` to ``_aggregate_disagg_load_reason`` priority (PSM disagg emits it; placed between scale_up and scale_down to mirror PSM's _PRIORITY).
- #11 dead code: drop ``contributing_plugin_ids`` (built, never read) in pipeline._run_fanout_stage; drop ``_set_enabled`` + ``_plugin_ids`` (no caller in PR #1; would KeyError if reached).
- #18 _encode_fpm: use the canonical ``dynamo.common.forward_pass_metrics.encode`` (shared module-level encoder) instead of allocating a fresh ``msgspec.msgpack.Encoder`` per tick and re-implementing the encoding. Byte-identical wire format; keeps FPM serialization in lock-step with the rest of dynamo.
- #17 transport ABC docstring: timeout is enforced by the transport (``call()`` wraps ``asyncio.wait_for``), not the orchestrator; the pipeline uses a bare gather to avoid double-counting the deadline.
- #20 scheduler docstring: note the heartbeat-eviction monitor is not wired in this PR (last_heartbeat_at is recorded but unread; monitor is follow-up).
- #21 transport contract test: 7 inputs (not 8) → 14 cases (multi_pool fixture was removed with component_name; comments were stale).
- #22 metrics test: remove the dead no-op ``pass`` loop in _sample_value.
828 planner tests pass (was 825; +3 chain-augment / merge tests).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
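The first-writer-wins partial merge mentioned in item #4 might look something like the sketch below. It is an illustration under assumptions, not the code in ``chain_augment``: only ``predicted_kv_hit_rate`` is named in the commit message, so the other three field names are placeholders, and ``merge_first_writer_wins`` is a hypothetical helper name.

```python
from __future__ import annotations

from typing import Optional

# Hypothetical field list. Only predicted_kv_hit_rate appears in the commit
# message; the other three names are placeholders for illustration.
PREDICTION_FIELDS = (
    "predicted_field_a",
    "predicted_field_b",
    "predicted_field_c",
    "predicted_kv_hit_rate",
)


def merge_first_writer_wins(
    partials: list[dict[str, Optional[float]]],
) -> dict[str, Optional[float]]:
    """Merge partial predictions in chain order.

    For each field, the first plugin that supplies a non-None value wins;
    later plugins cannot overwrite it.
    """
    merged: dict[str, Optional[float]] = {name: None for name in PREDICTION_FIELDS}
    for partial in partials:
        for name in PREDICTION_FIELDS:
            value = partial.get(name)
            if merged[name] is None and value is not None:
                merged[name] = value
    return merged
```

A field that is missing from the field tuple is never copied into the merged result, which is one plausible way a value could be silently dropped in a chain of two or more plugins, as the message describes.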
jthomson04
added a commit
that referenced
this pull request
Jun 4, 2026
The cache-realloc (#1) and metric-handle (#2/#3) commits left a few lines unformatted — `longest_prefix_match`'s collapsed signature and the `merged` binding in l1.rs, and the per-worker gauge assignment + a test assert in metrics.rs. `cargo fmt --all --check` (run by the rust-tests CI job) flagged them, failing the job. No behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
kangclzjc
added a commit
that referenced
this pull request
Jun 4, 2026
kaim-eng
added a commit
that referenced
this pull request
Jun 8, 2026
Three Dockerfile bugs combined to make the DCGM-mode image unbuildable on a fresh checkout. Fixing any one in isolation leaves the build broken, so they travel together:
1. DCGM_IMAGE default 'nvcr.io/nvidia/cloud-native/dcgm:4.2.3-2-ubuntu22.04' does not exist on NGC (verified 2026-05-21 via 'docker manifest inspect' → 404). Bump to 4.5.1-1-ubuntu22.04, the only resolvable 4.x tag.
2. DCGM 4.5+ relocated python bindings from /usr/local/dcgm/bindings/python3/ to /usr/share/datacenter-gpu-manager-4/bindings/python3/. The previous COPY would silently copy zero files under the new pin. Switch the source path to the 4.5+ location.
3. NGC's DCGM 4.5+ runtime image ships pydcgm with DcgmGroup.py:20 doing 'import logger', but logger.py lives in DCGM's source tree under testing/python3/ and is NOT packaged. Without a shim every DcgmGroup construction raises ModuleNotFoundError. Add a 10-line stdlib-logging adapter at components/power_agent/logger.py and COPY it into /opt/dcgm/python/logger.py during the runtime stage.
This unblocks 'docker build -f components/power_agent/Dockerfile' on a fresh clone (verified locally via 'docker buildx build --build-arg DCGM_IMAGE=...4.5.1-1-ubuntu22.04' against viking-prod-216 on 2026-05-21, image pushed to ttl.sh/dynamo-pa-kaim-dcgm45-v2:24h and used by the Path-B live test on aks-a100b-22138447-vmss000000).
Refs: PR #9790 review, Power Agent live-test findings #1/#2/#6.
Signed-off-by: Kai Ma <kaim@nvidia.com>
tmonty12
pushed a commit
that referenced
this pull request
Jun 8, 2026
6 tasks
1 task
nnshah1
added a commit
that referenced
this pull request
Jun 10, 2026
Address graham-code-review feedback on PR #10351:
- Drop the secondary `owners` map; store `Owner = (instance_id, lora_slug)` inline with each entry value. One lock, one source of truth, no nested-write-lock hazard, no two-map sync risk.
- `register` takes `&Owner` (one clone inside, not per-file).
- Panic on collision: re-registering the same (slug, suffix, filename) with a different owner is a programming error (two attaches of the same model+suffix in one process would let detach-#1 wipe files detach-#2 still needs). Same-owner re-register is fine and just updates the path.
- Doc + local var naming aligned on `instance_id` to match `local_model.rs`'s existing usage (the value populates `DiscoveryInstance::Model.instance_id`).
- Tests: collision panic + same-owner update path coverage.
Signed-off-by: nnshah1 <neelays@nvidia.com>
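The single-map design described above can be illustrated with a small Python analogue. The real change is in Rust and is not shown in this conversation, so everything here is a sketch under assumptions: ``FileRegistry``, ``path_for`` and ``remove_owner`` are hypothetical names, the field types of ``Owner`` are guessed, and ``RuntimeError`` stands in for the Rust panic.

```python
from __future__ import annotations

import threading
from typing import Dict, NamedTuple, Optional, Tuple


class Owner(NamedTuple):
    instance_id: int  # assumed type
    lora_slug: str    # assumed type


# (slug, suffix, filename), as named in the commit message.
Key = Tuple[str, str, str]


class FileRegistry:
    """One map, one lock: each entry stores its path together with its owner."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._entries: Dict[Key, Tuple[str, Owner]] = {}

    def register(self, key: Key, path: str, owner: Owner) -> None:
        with self._lock:
            existing = self._entries.get(key)
            if existing is not None and existing[1] != owner:
                # Stand-in for the panic: a different owner claiming the
                # same key is treated as a programming error.
                raise RuntimeError(f"{key} already registered by {existing[1]}")
            # New entry, or same-owner re-register that updates the path.
            self._entries[key] = (path, owner)

    def path_for(self, key: Key) -> Optional[str]:
        with self._lock:
            entry = self._entries.get(key)
            return entry[0] if entry is not None else None

    def remove_owner(self, owner: Owner) -> list[str]:
        """Drop every entry held by `owner` and return the removed paths."""
        with self._lock:
            doomed = [k for k, (_, o) in self._entries.items() if o == owner]
            return [self._entries.pop(k)[0] for k in doomed]
```

Because the owner lives in the entry value, removing an owner's files is a single pass over one map under one lock, with no second map to keep in sync.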
Broduker
pushed a commit
to Broduker/dynamo
that referenced
this pull request
Jun 12, 2026
…i-dynamo#10124)
Signed-off-by: Kang Zhang <kangz@nvidia.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: shenls <shenlinshan@kanzhun.com>
No description provided.