feat(worker): add LOCALAI_PREFETCH_MODELS for boot-time gallery prefetch by localai-bot · Pull Request #10108 · mudler/LocalAI

localai-bot · 2026-05-31T09:19:59Z

Summary

In LocalAI distributed mode the master streams a model GGUF to a worker on first inference. On bandwidth-constrained cluster networks (libp2p circuit-v2 relays under NAT, double-NAT residential, slow overlays) that transfer can be slow or unreliable, even though each worker's outbound internet is usually fine.

LOCALAI_PREFETCH_MODELS lets the operator name gallery model IDs to download at worker boot, before the worker subscribes to backend.install events. It reuses gallery.InstallModelFromGallery, so the on-disk /models layout matches what the master would have pushed. Prefetch is non-fatal on every error path, so if the gallery is unreachable at boot the master can still push files on demand.

Design notes

LOCALAI_PREFETCH_MODELS (alias PREFETCH_MODELS) is comma-separated gallery IDs, mirroring LOCALAI_MODELS,MODELS on the master.
Added a companion LOCALAI_GALLERIES,GALLERIES field to the worker config (the gallery URL list is needed to resolve IDs; the master already reads it).
Prefetch runs before registration / NATS subscription / the "Worker ready" log, so a still-warming worker isn't announced as ready.
Reuses the master's gallery install entrypoint (gallery.InstallModelFromGallery) so URL resolution, SHA verification, and the idempotent file-exists/SHA-match skip path are shared.
The installer is wrapped in a private function-typed indirection so tests can swap in a fake; production code never reassigns the binding.
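
As a rough sketch of two of the notes above (comma-separated IDs and the function-typed indirection), assuming hypothetical names `parseModelIDs` and `installModel` that do not come from the PR. How the real worker config splits the value is not shown here; it may be handled by the CLI/env layer rather than by hand.

```go
package main

import (
	"fmt"
	"strings"
)

// parseModelIDs splits a comma-separated value, trims surrounding
// whitespace from each entry, and drops empty entries.
func parseModelIDs(raw string) []string {
	var ids []string
	for _, part := range strings.Split(raw, ",") {
		if id := strings.TrimSpace(part); id != "" {
			ids = append(ids, id)
		}
	}
	return ids
}

// installModel is a package-private, function-typed binding (hypothetical).
// Production code would point it at the real gallery installer once and
// never reassign it; only tests replace it.
var installModel = func(id string) error {
	return fmt.Errorf("real installer not wired in this sketch: %s", id)
}

func main() {
	ids := parseModelIDs(" model-a , ,model-b,")
	fmt.Println(len(ids), ids)

	// What a test would do: swap in a fake, then restore the original.
	orig := installModel
	defer func() { installModel = orig }()

	var seen []string
	installModel = func(id string) error {
		seen = append(seen, id)
		return nil
	}
	for _, id := range ids {
		_ = installModel(id)
	}
	fmt.Println(seen)
}
```

Keeping the binding unexported means only code in the same package, including its tests, can replace it.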

Tests

11 new Ginkgo specs in `core/services/worker/prefetch_test.go`:

happy path
idempotent restart (file already on disk)
non-fatal per-model failure
empty config
whitespace trimming
malformed `LOCALAI_GALLERIES`
empty galleries list

Build + vet clean; `go test ./core/services/worker/... ./core/cli/... ./core/gallery/... ./core/startup/...` all pass.

In LocalAI distributed mode the master streams a model GGUF to a worker on first inference. On bandwidth-constrained cluster networks (libp2p circuit-v2 relays under NAT, double-NAT residential, slow overlays) that transfer can be slow or unreliable, even though each worker's outbound internet is usually fine.

LOCALAI_PREFETCH_MODELS lets the operator name gallery model IDs to download at worker boot, BEFORE the worker subscribes to backend.install events. It reuses gallery.InstallModelFromGallery, so the on-disk /models layout matches what the master would have pushed. Prefetch is non-fatal on every error path, so if the gallery is unreachable at boot the master can still push files on demand.

The installer is wrapped in a function-value indirection so tests can swap in a fake without touching the real gallery; production never reassigns the binding.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

mudler force-pushed the feat/worker-prefetch-models branch from 54251c7 to 92be4d6 May 31, 2026 09:32

mudler merged commit 0d57957 into master May 31, 2026
58 checks passed

mudler deleted the feat/worker-prefetch-models branch May 31, 2026 10:22

localai-bot added the enhancement label Jun 10, 2026

BrewTestBot mentioned this pull request Jun 10, 2026

localai 4.4.0 Homebrew/homebrew-core#287347

Merged

mudler merged 1 commit into master from feat/worker-prefetch-models
