feat(llm): add P2P model config file delivery from workers to frontends by Kaonael · Pull Request #3 · Kaonael/dynamo

Kaonael · 2026-03-29T18:14:24Z

Overview:

Add P2P model config file delivery from workers to frontends. Workers automatically serve config files (config.json, tokenizer.json, etc.) over the request plane. Frontends discover and download them from any available worker before falling back to HuggingFace. Enables private models without shared filesystems.

Details:

Problem: Frontends need model config files for preprocessing. Previously the only sources were HuggingFace and ModelExpress, which breaks private or custom models that are not on HF.

Solution: Workers register a model-config AsyncEngine endpoint alongside inference endpoints. Frontends discover it through standard discovery and download files directly over the current request plane (HTTP, TCP, or NATS).

Download priority chain (new):

1. Local HF cache (instant, no network)
2. ModelExpress server (server-only, no hidden HF fallback)
3. P2P from any available worker (parallel downloads, atomic writes)
4. Direct HuggingFace download (last resort)
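
The chain can be read as a first-success search over an ordered list of sources. Below is a minimal Rust sketch, assuming each step is reduced to a function that returns a directory or nothing. `Source`, `Step`, `resolve_config`, and the stand-in step functions are hypothetical names for illustration, not the PR's `download_config()`.

```rust
use std::path::PathBuf;

/// Hypothetical label for the step that produced the files (not from the PR).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Source {
    LocalCache,
    ModelExpress,
    PeerWorker,
    HuggingFace,
}

/// One step of the chain. `fetch` returns the directory holding the config
/// files, or `None` when this source cannot supply them.
struct Step {
    source: Source,
    fetch: fn(&str) -> Option<PathBuf>,
}

/// Try the steps in order and stop at the first one that succeeds.
fn resolve_config(model: &str, steps: &[Step]) -> Option<(Source, PathBuf)> {
    steps
        .iter()
        .find_map(|step| (step.fetch)(model).map(|dir| (step.source, dir)))
}

// Stand-in steps for the demo; real steps would touch disk or the network.
fn miss(_model: &str) -> Option<PathBuf> {
    None
}

fn peer_hit(model: &str) -> Option<PathBuf> {
    Some(PathBuf::from("/tmp/p2p-cache").join(model))
}

fn main() {
    let steps = [
        Step { source: Source::LocalCache, fetch: miss },
        Step { source: Source::ModelExpress, fetch: miss },
        Step { source: Source::PeerWorker, fetch: peer_hit },
        // Never called here because the P2P step already succeeded.
        Step { source: Source::HuggingFace, fetch: miss },
    ];
    match resolve_config("my-private-model", &steps) {
        Some((source, dir)) => println!("resolved via {:?}: {}", source, dir.display()),
        None => println!("no source could supply the config files"),
    }
}
```

Keeping the order in data rather than in nested conditionals makes it easy to see that later sources are never contacted once an earlier one succeeds.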

Implementation:

config_endpoint.rs (new) — ModelConfigEngine, request/response types, P2pConfigDownloader trait
hub.rs — split from_hf() into get_cached_model_path(), try_model_express_server(), mx_download_direct() for fine-grained fallback control
model_card.rs — download_config() orchestrates the 4-step chain; new config_filenames(), verify_local_checksums(), checked_file() accessors
watcher.rs — WatcherP2pDownloader implements P2pConfigDownloader with parallel downloads and atomic writes (temp file + rename)
local_model.rs — registers model-config endpoint in attach()
checked_file.rs — filename() method, update_dir() refactored to use it
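
The "parallel downloads and atomic writes (temp file + rename)" pattern can be sketched with the standard library alone. This is an illustration under stated assumptions: the PR's downloader runs over the request plane, while this sketch uses scoped threads and in-memory strings as stand-ins for downloaded responses. `write_atomic`, `store_all`, and the temp file naming scheme are hypothetical.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Counter that keeps temp file names unique within this process.
static TMP_SEQ: AtomicU64 = AtomicU64::new(0);

/// Write `contents` to `dir/name` via a temp file in the same directory,
/// then rename it into place so readers do not observe a partial file.
fn write_atomic(dir: &Path, name: &str, contents: &str) -> io::Result<PathBuf> {
    let dest = dir.join(name);
    let seq = TMP_SEQ.fetch_add(1, Ordering::Relaxed);
    // Temp naming scheme is an assumption, not taken from the PR.
    let tmp = dir.join(format!(".{}.{}.{}.tmp", name, std::process::id(), seq));
    let mut file = fs::File::create(&tmp)?;
    file.write_all(contents.as_bytes())?;
    file.sync_all()?;
    // Same directory means same filesystem, so the rename swaps the file in.
    fs::rename(&tmp, &dest)?;
    Ok(dest)
}

/// Store several files in parallel, one scoped thread per file.
/// `files` holds (name, contents) pairs standing in for downloaded responses.
fn store_all(dir: &Path, files: &[(&str, &str)]) -> io::Result<Vec<PathBuf>> {
    thread::scope(|scope| {
        let handles: Vec<_> = files
            .iter()
            .map(|&(name, body)| scope.spawn(move || write_atomic(dir, name, body)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("writer thread panicked"))
            .collect()
    })
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join(format!("p2p-config-demo-{}", std::process::id()));
    fs::create_dir_all(&dir)?;
    let files = [("config.json", "{\"model_type\":\"demo\"}"), ("tokenizer.json", "{}")];
    for path in store_all(&dir, &files)? {
        println!("wrote {}", path.display());
    }
    fs::remove_dir_all(&dir)
}
```

Because each writer uses its own temp file, two concurrent writers of the same destination cannot interleave bytes; whichever rename lands last wins with a complete file.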

Safety:

Blake3 checksum verification after P2P download
Path traversal sanitization in engine (../etc/passwd → passwd)
Atomic file writes prevent corruption from concurrent discovery events
String-based response avoids Vec<u8> JSON serialization blowup (4-7x)
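
Two of these points can be illustrated with the standard library. The sketch below is an assumption about the general approach, not the engine's actual code: `sanitize_filename` keeps only the final path component, and `json_array_len` computes how large a byte buffer becomes when written as a compact JSON array of numbers, which is the overhead a string-based response avoids. Both function names are hypothetical.

```rust
use std::path::Path;

/// Reduce a requested name to its final path component so a request cannot
/// escape the model directory. Returns `None` when nothing usable remains.
fn sanitize_filename(requested: &str) -> Option<&str> {
    Path::new(requested).file_name()?.to_str()
}

/// Length of `bytes` when written as a compact JSON array of numbers,
/// e.g. `[123,34,97]`. Pretty-printed output would be larger still.
fn json_array_len(bytes: &[u8]) -> usize {
    let digits: usize = bytes.iter().map(|b| b.to_string().len()).sum();
    let commas = bytes.len().saturating_sub(1);
    digits + commas + 2 // +2 for the surrounding brackets
}

fn main() {
    for requested in ["config.json", "../etc/passwd", "..", "a/b/tokenizer.json"] {
        println!("{:?} -> {:?}", requested, sanitize_filename(requested));
    }
    let body = r#"{"model_type": "demo", "vocab_size": 32000}"#;
    println!(
        "raw: {} bytes, as JSON number array: {} bytes",
        body.len(),
        json_array_len(body.as_bytes())
    );
}
```

For typical ASCII config text every byte costs two or three digits plus a comma in the array form, so even the compact encoding is more than three times the raw size.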

Docs updated:

frontend/README.md — download fallback chain
frontend/configuration.md — new "Model Config File Delivery" section
discovery-plane.md — new "Model Config Endpoint" section

Tests:

checked_file.rs — filename() from paths/URLs, update_dir() conversion
config_endpoint.rs — engine file serving, path traversal, serde round-trip, cache dir uniqueness
model_card.rs — config_filenames(), verify_local_checksums() on valid and tampered files

Where should the reviewer start?

lib/llm/src/config_endpoint.rs — new file, core types and engine
lib/llm/src/model_card.rs:537-598 — download_config() 4-step fallback chain
lib/llm/src/hub.rs:149-175 — try_model_express_server() using request_model_with_provider (not _and_fallback)
lib/llm/src/discovery/watcher.rs:832-931 — WatcherP2pDownloader with parallel downloads and atomic writes

Workers register a "model-config" AsyncEngine endpoint alongside inference endpoints. Frontends discover it through standard discovery and download config files directly from any available worker over the request plane (HTTP, TCP, or NATS).

Download priority chain:
1. Local HF cache (instant, no network)
2. ModelExpress server (no hidden HF fallback)
3. P2P from any available worker (parallel downloads, atomic writes)
4. Direct HuggingFace download (last resort)

This enables private models that aren't on HuggingFace without requiring shared filesystems or manual file copying.

Key implementation details:
- Split hub::from_hf() into individual steps for fine-grained fallback
- P2pConfigDownloader trait decouples model_card from discovery internals
- Blake3 checksum verification after P2P download
- Atomic writes (temp file + rename) for concurrent safety
- String-based response to avoid Vec<u8> JSON serialization blowup

Signed-off-by: Nikita Sukharev <kaonael@gmail.com>

github-actions · 2026-04-29T11:13:02Z

This PR is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions · 2026-05-04T11:20:28Z

This PR has been closed due to inactivity. If you believe this PR is still relevant, please feel free to reopen it with additional context or information.

github-actions Bot added the feat and documentation labels Mar 29, 2026

Kaonael mentioned this pull request Apr 22, 2026

[CONTRIBUTION]: DEP: P2P model config file delivery from workers to frontends ai-dynamo/dynamo#7672

Open

nnshah1 mentioned this pull request Apr 27, 2026

DEP (light): Worker self-hosted metadata files ai-dynamo/dynamo#8749

Open

github-actions Bot added the Stale label Apr 29, 2026

github-actions Bot closed this May 4, 2026

github-actions Bot deleted the feat-p2p-config-delivery branch May 4, 2026 11:20

Kaonael wants to merge 1 commit into main from feat-p2p-config-delivery
