feat(self-host): worker advertises non-typed metadata siblings (gh-8749) by nnshah1 · Pull Request #9707 · ai-dynamo/dynamo

nnshah1 · 2026-05-18T19:16:34Z

Overview

Worker side of the multimodal-completeness gap left after #9610 (frontend HF-sibling harvest). When self_host_metadata is enabled, the worker now advertises non-weight, non-typed, non-ignored sibling files (preprocessor_config.json, special_tokens_map.json, added_tokens.json, …) via a new ModelDeploymentCard.extra_files field. Frontend's existing resolve_metadata_files pipeline is generic over slot count, so these resolve, verify-and-cache, and symlink into slug_dir exactly like the typed slots.

Why now

#9610 covers HF-backed models by harvesting siblings from hf_snapshots at the frontend. For custom/private worker dirs without an HF source, the only way to surface those files is for the worker to advertise them. This closes the last hole before flipping DYN_SELF_HOST_METADATA default ON.

Details

ModelDeploymentCard.extra_files: Vec<CheckedFile> — #[serde(default, skip_serializing_if = "Vec::is_empty")], so wire-format roundtrip with older peers is invisible.
iter_metadata_files{,_mut} now yield extra_files after typed slots; downstream resolve / verify / cache loops pick them up with zero changes.
Harvest helper in local_model.rs is non-recursive (no surprise walks into nested checkpoints). "Is this junk?" delegates to modelexpress_common::providers::HuggingFaceProvider::is_ignored so the dotfile/README deny-list lives in one place. "Is this a weight?" uses model_card::is_weight_file — wider than mx's list (covers .pt / .gguf / .onnx / .tflite / .h5 / .pth / .msgpack in addition to .safetensors / .bin) so a stray weight can't ride the metadata HTTP path. .safetensors.index.json is correctly kept (extension is .json).
Called from move_to_self_host before the existing http-rewrite loop, so harvested files get the same /v1/metadata/{slug}/{suffix}/{filename} rewrite + registry registration as typed slots.

Test

One pure unit test on the harvest helper: tmpdir with one typed file, one weight, one dotfile, one README, two genuine extras, one subdir — asserts only the two extras land. Wire-format back-compat is statically guaranteed by #[serde(default)] (no separate roundtrip test needed).

Out of scope

DYN_SELF_HOST_METADATA default flip — still gated on auto-cleanup-on-detach landing (PR4 capstone).
Recursive harvest — current scan is the model dir only.
Per-file size caps for extras — not added; size is captured at CheckedFile::from_disk time but no upper bound enforced. Frontend's existing resolve-side caps still apply.

DEP (light): Worker self-hosted metadata files #8749 — self-hosted metadata stack
fix(self-host): harvest non-weight siblings into slug_dir (gh-8749) #9610 — frontend HF-sibling harvest (covers the HF-backed path; this PR covers the custom-worker path)
feat(self-host): worker hosts metadata files via system_status_server (gh-8749) #8855, feat(self-host): frontend MDC verify-and-cache pipeline (gh-8749) #9057 — earlier merges in the same stack

Summary by CodeRabbit

New Features
- Self-hosting now automatically discovers and manages additional metadata files (such as preprocessor configs and special token mappings) alongside core model components, ensuring complete model configurations are properly cached and deployed.

coderabbitai · 2026-05-18T19:20:32Z

Walkthrough

This PR extends model deployment card handling to harvest and advertise additional metadata files alongside typed slots during self-hosting. It adds an extra_files field to ModelDeploymentCard, integrates it into the metadata iteration APIs, and implements directory scanning in move_to_self_host to discover and populate eligible files while filtering weights, typed names, and ignored entries.

Changes

Extra metadata files harvesting for self-hosted models

Layer / File(s)	Summary
Extra files field and metadata iteration extension `lib/llm/src/model_card.rs`	Added `extra_files: Vec<CheckedFile>` to `ModelDeploymentCard` struct. Extended `iter_metadata_files()` documentation and implementation to yield extra_files after typed slots, and updated `iter_metadata_files_mut()` capacity calculation and append logic to mirror the ordering.
Harvesting helper and self-host integration `lib/llm/src/local_model.rs`	Added `HashSet` and imports for `HuggingFaceProvider` and `CheckedFile`. Implemented `harvest_extra_files()` helper to scan model directory non-recursively, filtering out weights, typed filenames, dotfiles/README, and subdirectories. Integrated harvesting into `move_to_self_host` to compute typed filenames from existing metadata and append discovered extras to `card.extra_files` before registration. Added unit tests validating correct filtering of typed entries, weights, ignored files, and subdirectories.

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically describes the main change: adding worker-side advertisement of non-typed metadata sibling files for self-hosted models.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description check	✅ Passed	The pull request description is comprehensive and well-structured, covering all template sections with clear details about changes, motivation, implementation specifics, testing, and scope boundaries.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Three dynamo-ops review threads on #9707: 1. Name-only LocalModel placeholders have an empty `full_path` and would fail on `read_dir("")` once self-host is enabled. Skip the harvest when `model_dir` isn't a directory. 2. `DirEntry::file_type()` and `DirEntry::metadata()` use lstat on Unix — they do not follow symlinks. HF snapshot dirs are full of symlinks pointing into `blobs/`, so we'd skip the very files we want to harvest. Switch to `Path::is_file()` which calls `fs::metadata` and follows symlinks. 3. `mdcsum()` previously omitted `extra_files`, so two workers with different siblings published the same checksum and the frontend cache served stale metadata. Hash each extra file's checksum in sorted order — sorting eliminates dependence on the harvest's filesystem-defined `read_dir` order. Extends the harvest test fixture with a symlink-to-real-file and adds a `returns_empty_for_missing_dir` test for the name-only path. Signed-off-by: nnshah1 <neelays@nvidia.com>

When `self_host_metadata` is enabled, the worker scans its model directory and pushes non-weight, non-typed, non-ignored siblings (`preprocessor_config.json`, `special_tokens_map.json`, ...) into a new `ModelDeploymentCard.extra_files` field. Frontend pipeline is already generic over slot count, so these ride the same resolve-and-cache path as typed slots and land in `slug_dir`. External preprocessors that load via `from_pretrained(slug_dir)` now see a complete model dir. Weight-file and ignore checks delegate to modelexpress-common's `HuggingFaceProvider::{is_weight_file, is_ignored}` so policy lives in one place across mx download and dynamo's self-host harvest. Signed-off-by: nnshah1 <neelays@nvidia.com>

- Sort harvested extras by filename so `mdcsum()` is reproducible across filesystems whose `read_dir` order differs. Without this, multi-replica deployments with byte-identical model dirs could publish different MDCs and thrash the frontend cache. - Switch `with_context` error formatters from `{:?}` to `Path::display()` for consistency with surrounding code. - Replace magic-5 capacity hint with `Self::TYPED_SLOT_COUNT` const so future slot additions touch one place. Signed-off-by: nnshah1 <neelays@nvidia.com>

Worker harvest was using `HuggingFaceProvider::is_weight_file` from mx, whose deny-list is narrower than the one used by the frontend's gh-8749 hf-sibling harvest (#9610): mx misses `.gguf`, `.onnx`, `.tflite`, `.pt`, `.pth`. That asymmetry meant a stray PyTorch / ONNX / gguf weight in the model dir would slip past the worker filter, get advertised as an `extra_files` entry, and ride the metadata HTTP path to the frontend — turning a multi-GB weight blob into a config-cache resident. Promote `model_card::is_weight_file` to `pub(crate)` and call it from the worker harvest. Single source of truth for "is this a weight?" across both gh-8749 harvests; no new constants. Bumps the worker harvest test to also assert `.pt` is filtered. Signed-off-by: nnshah1 <neelays@nvidia.com>

Three dynamo-ops review threads on #9707: 1. Name-only LocalModel placeholders have an empty `full_path` and would fail on `read_dir("")` once self-host is enabled. Skip the harvest when `model_dir` isn't a directory. 2. `DirEntry::file_type()` and `DirEntry::metadata()` use lstat on Unix — they do not follow symlinks. HF snapshot dirs are full of symlinks pointing into `blobs/`, so we'd skip the very files we want to harvest. Switch to `Path::is_file()` which calls `fs::metadata` and follows symlinks. 3. `mdcsum()` previously omitted `extra_files`, so two workers with different siblings published the same checksum and the frontend cache served stale metadata. Hash each extra file's checksum in sorted order — sorting eliminates dependence on the harvest's filesystem-defined `read_dir` order. Extends the harvest test fixture with a symlink-to-real-file and adds a `returns_empty_for_missing_dir` test for the name-only path. Signed-off-by: nnshah1 <neelays@nvidia.com>

…8749) Extend the mocker self-host test to also `GET /v1/metadata/{slug}/_base/vocab.json` — `vocab.json` is a non-typed sibling in the Qwen3 HF snapshot, so its presence in the route proves: - `harvest_extra_files` picked it up - the runtime registry registered it under `(slug, suffix, filename)` - the system_status_server route serves it Also exercises the symlink-follow fix from `048ba5af0c`: every entry in HF snapshot dirs is a symlink into `blobs/`, so a non-following stat would skip these files. Signed-off-by: nnshah1 <neelays@nvidia.com>

- `move_to_self_host`: drop stale "(mx's deny-list)" attribution; the weight-file check now goes through `model_card::is_weight_file`. - `mdcsum()`: compress the 6-line extras-hashing rationale to 3. No behavior change. Signed-off-by: nnshah1 <neelays@nvidia.com>

Hashing only the content checksum means two workers advertising the same bytes under different filenames produce identical mdcsums, so the frontend cache aliases them and the slug_dir resolved for worker A is reused for worker B (with the wrong sibling names). Fold sorted (basename, checksum) pairs into the mdcsum input. NUL separator between fields prevents a basename-vs-hash boundary collision. Tests: two new unit tests cover (a) basename distinguishes equal-byte extras, (b) read-order independence is preserved after the sort. Signed-off-by: nnshah1 <neelays@nvidia.com>

grahamking · 2026-05-26T18:58:39Z

@nnshah1 What decides if something is extra_files or it has it's own field?

Could we move everything into files? Or should the extra_files have their own fields?

Per graham-code-review: - mdcsum() already sorts extras by (basename, checksum), so harvest_extra_files's filename sort is dead weight — drop the sort and the doc claim it justified. - is_weight_file doc: trim to a one-liner listing callers; design rationale belongs in the PR description. - extra_files field doc: rewrite tighter, drop AI-flavored phrasing. Signed-off-by: nnshah1 <neelays@nvidia.com>

…namogh-8749) (ai-dynamo#9707) Signed-off-by: nnshah1 <neelays@nvidia.com>

) The frontend's `download_config` pipeline already resolves every typed MDC file and harvested sibling into a content-addressed `slug_dir` with blake3-verified copies (#9610 + #9707). The vllm and sglang native preprocessor factories were nonetheless calling `fetch_model` against the worker URI on top of that — a redundant HF download whenever the worker-side path wasn't reachable from the frontend pod. Expose the resolve target on the MDC: * Rust: `ModelDeploymentCard::local_dir() -> PathBuf` — pure path computation matching `mdc_slug_dir(slug, mdcsum)`. * Python binding + .pyi stub: `mdc.local_dir() -> str`. Both engine factories now use `mdc.local_dir()` directly: * `components/src/dynamo/frontend/vllm_processor.py` * `components/src/dynamo/frontend/sglang_processor.py` Drops the unused `fetch_model` import in both. Adds one Rust unit test asserting `local_dir()` produces the same path the resolve pipeline writes into. Signed-off-by: nnshah1 <neelays@nvidia.com>

nnshah1 requested a review from a team as a code owner May 18, 2026 19:16

pull-request-size Bot added the size/L label May 18, 2026

github-actions Bot added the feat label May 18, 2026

dynamo-ops reviewed May 18, 2026

View reviewed changes

Comment thread lib/llm/src/local_model.rs

Comment thread lib/llm/src/local_model.rs Outdated

Comment thread lib/llm/src/local_model.rs Outdated

Comment thread lib/llm/src/model_card.rs

copy-pr-bot Bot temporarily deployed to GITLAB May 18, 2026 20:42 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB May 18, 2026 21:50 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB May 18, 2026 22:43 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB May 18, 2026 23:23 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB May 18, 2026 23:26 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB May 19, 2026 00:49 Inactive

nnshah1 requested review from a team as code owners May 19, 2026 01:38

copy-pr-bot Bot temporarily deployed to GITLAB May 19, 2026 01:38 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB May 19, 2026 01:50 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB May 19, 2026 05:50 Inactive

nnshah1 added 6 commits May 19, 2026 22:09

nnshah1 force-pushed the nnshah1/gh-8749-mdc-extra-files branch from 05c037f to e602df6 Compare May 20, 2026 13:51

copy-pr-bot Bot temporarily deployed to GITLAB May 20, 2026 13:51 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB May 20, 2026 13:52 Inactive

nnshah1 enabled auto-merge (squash) May 20, 2026 14:55

dynamo-ops reviewed May 20, 2026

View reviewed changes

Comment thread lib/llm/src/model_card.rs Outdated

copy-pr-bot Bot temporarily deployed to GITLAB May 21, 2026 14:43 Inactive

grahamking reviewed May 26, 2026

View reviewed changes

Comment thread lib/llm/src/local_model.rs Outdated

grahamking reviewed May 26, 2026

View reviewed changes

Comment thread lib/llm/src/local_model.rs Outdated

grahamking reviewed May 26, 2026

View reviewed changes

Comment thread lib/llm/src/model_card.rs Outdated

grahamking reviewed May 26, 2026

View reviewed changes

Comment thread lib/llm/src/model_card.rs Outdated

grahamking approved these changes May 26, 2026

View reviewed changes

copy-pr-bot Bot temporarily deployed to GITLAB May 27, 2026 00:07 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB May 27, 2026 01:46 Inactive

nnshah1 merged commit 53aa9b7 into main May 27, 2026
101 checks passed

nnshah1 deleted the nnshah1/gh-8749-mdc-extra-files branch May 27, 2026 05:12

nnshah1 mentioned this pull request May 27, 2026

feat(self-host): native preprocessor consumes resolved slug_dir (gh-8749) #10037

Merged

MartinRepo pushed a commit to MartinRepo/dynamo that referenced this pull request May 28, 2026

feat(self-host): worker advertises non-typed metadata siblings (ai-dy…

947bd99

…namogh-8749) (ai-dynamo#9707) Signed-off-by: nnshah1 <neelays@nvidia.com>

nnshah1 mentioned this pull request Jun 5, 2026

feat(self-host): auto-clean metadata-registry on detach (gh-8749) #10351

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(self-host): worker advertises non-typed metadata siblings (gh-8749)#9707

feat(self-host): worker advertises non-typed metadata siblings (gh-8749)#9707
nnshah1 merged 8 commits into
mainfrom
nnshah1/gh-8749-mdc-extra-files

nnshah1 commented May 18, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented May 18, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

grahamking commented May 26, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

nnshah1 commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overview

Why now

Details

Test

Out of scope

Related

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

grahamking commented May 26, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

nnshah1 commented May 18, 2026 •

edited

Loading

coderabbitai Bot commented May 18, 2026 •

edited

Loading