fix(agents): route multi-instance Ollama providers by model baseUrl#63218
fix(agents): route multi-instance Ollama providers by model baseUrl#63218matthewpapa07 wants to merge 2 commits into
Conversation
Greptile SummaryThis PR fixes multi-instance Ollama provider routing by replacing a closed-over static URL pattern with a Confidence Score: 5/5Safe to merge; the core routing fix is correct and well-tested, and the only remaining findings are P2 documentation clarity issues. The Ollama boundary-aware stream function reads model.baseUrl at call time, which is the correct fix for the shared-transport closure bug. The dedicated-agent fallback scoping is intentional, tested, and the behavior change is acknowledged in the PR description. All remaining findings (docs gap for the no-global-primary edge case) are P2 and do not affect correctness. docs/gateway/configuration-reference.md and docs/concepts/model-failover.md — both miss the edge case where agents.defaults.model.primary is unset; src/agents/agent-scope.ts lines 213-214 for the corresponding code condition.
|
| if (!defaultPrimary) { | ||
| return resolveAgentModelFallbackValues(cfg.agents?.defaults?.model).length > 0; |
There was a problem hiding this comment.
Dedicated-lane trigger covers undocumented case
When agents.defaults.model has no primary but does have fallbacks, any agent with an explicit primary is treated as a dedicated lane and receives [] fallbacks. This is broader than what both doc updates describe ("primary that differs from agents.defaults.model.primary"). The test for cfgInheritDefaults (no global primary, global fallbacks set, linus has own primary) confirms this path is hit, but users whose config matches that pattern will see a silent behavior change with no guidance in the docs.
Consider either narrowing the condition to require a global primary to exist, or adding a sentence in the updated doc sections noting that this also applies when agents.defaults.model.primary is unset.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/agent-scope.ts
Line: 213-214
Comment:
**Dedicated-lane trigger covers undocumented case**
When `agents.defaults.model` has no `primary` but does have `fallbacks`, any agent with an explicit `primary` is treated as a dedicated lane and receives `[]` fallbacks. This is broader than what both doc updates describe ("primary that differs from `agents.defaults.model.primary`"). The test for `cfgInheritDefaults` (no global primary, global fallbacks set, `linus` has own primary) confirms this path is hit, but users whose config matches that pattern will see a silent behavior change with no guidance in the docs.
Consider either narrowing the condition to require a global primary to exist, or adding a sentence in the updated doc sections noting that this also applies when `agents.defaults.model.primary` is unset.
How can I resolve this? If you propose a fix, please make it concise.| - `id`: stable agent id (required). | ||
| - `default`: when multiple are set, first wins (warning logged). If none set, first list entry is default. | ||
| - `model`: string form overrides `primary` only; object form `{ primary, fallbacks }` overrides both (`[]` disables global fallbacks). Cron jobs that only override `primary` still inherit default fallbacks unless you set `fallbacks: []`. | ||
| - `model`: string form overrides `primary` only; object form `{ primary, fallbacks }` overrides both (`[]` disables global fallbacks). Agents with a dedicated `primary` that differs from `agents.defaults.model.primary` no longer implicitly fall back to the global primary; set `fallbacks` explicitly if you want that behavior. Cron jobs that only override `primary` still inherit default fallbacks unless you set `fallbacks: []`. |
There was a problem hiding this comment.
Docs miss the no-global-primary edge case
The new sentence says "Agents with a dedicated primary that differs from agents.defaults.model.primary…" but hasDedicatedAgentPrimaryWithoutFallbackOverride also triggers — and suppresses global fallbacks — when agents.defaults.model.primary is absent altogether, as long as agents.defaults.model.fallbacks is non-empty. The same gap exists in docs/concepts/model-failover.md. A user who relies on a config like defaults: { model: { fallbacks: ["..."] } } plus an agent-level primary will see their fallbacks silently drop after this change without a clear pointer in the docs.
Prompt To Fix With AI
This is a comment left during a code review.
Path: docs/gateway/configuration-reference.md
Line: 1565
Comment:
**Docs miss the no-global-primary edge case**
The new sentence says "Agents with a dedicated `primary` that differs from `agents.defaults.model.primary`…" but `hasDedicatedAgentPrimaryWithoutFallbackOverride` also triggers — and suppresses global fallbacks — when `agents.defaults.model.primary` is absent altogether, as long as `agents.defaults.model.fallbacks` is non-empty. The same gap exists in `docs/concepts/model-failover.md`. A user who relies on a config like `defaults: { model: { fallbacks: ["..."] } }` plus an agent-level `primary` will see their fallbacks silently drop after this change without a clear pointer in the docs.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 68531b3d13
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| normalizeLowercaseStringOrEmpty(explicitPrimary) !== | ||
| normalizeLowercaseStringOrEmpty(defaultPrimary) |
There was a problem hiding this comment.
Compare canonical model refs before marking dedicated lanes
hasDedicatedAgentPrimaryWithoutFallbackOverride currently decides dedicated-lane behavior by lowercasing raw primary strings, so equivalent model refs written in different valid forms (for example "gpt-5.4" vs "openai/gpt-5.4") are treated as different. In that scenario resolveEffectiveModelFallbacks now returns [], which silently disables inherited fallback candidates for agents that did not actually change models—only notation—leading to unexpected fallback failures after this commit.
Useful? React with 👍 / 👎.
|
Thanks for the detailed repro and test plan. This is now superseded by current The multi-instance/custom-provider Ollama routing bug is fixed on
Focused verification passed: Closing this PR as superseded. The final mainline fix is narrower than this branch and avoids carrying the unrelated fallback-scope/docs changes from this PR. |
Summary
baseUrlat request time, and dedicated-agent fallback resolution is kept scoped to the child agent instead of bleeding into the parent session context.Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Root Cause (if applicable)
api: "ollama"could fall through a shared transport registry path that effectively closed over the wrong OllamabaseUrl, so provider/model selection looked correct while the request still went to the wrong backend.Regression Test Plan (if applicable)
src/agents/provider-stream.test.tssrc/agents/provider-transport-stream.test.tssrc/agents/pi-embedded-runner/stream-resolution.test.tssrc/agents/agent-scope.test.tsapi: "ollama"but point at differentbaseUrlvalues, the selected provider/model must resolve to the correct transport destination at run time; additionally, a dedicated child agent without configured fallbacks must not reuse parent fallback/provider context.User-visible / Behavior Changes
baseUrlinstead of potentially landing on the first registered Ollama backend.Diagram (if applicable)
Security Impact (required)
Yes/No) NoYes/No) NoYes/No) NoYes/No) NoYes/No) NoYes, explain risk + mitigation: N/ARepro + Verification
Environment
v2026.4.7; self-hosted Ollama on a separate hostollama->http://127.0.0.1:11434ollama-5090/ollama_shard->http://127.0.0.1:11435qwen3.5:27bollama/qwen3.5:27bnexus/shardonollama-5090/qwen3.5:27bapi: "ollama"but use differentbaseUrlvalues (and different pinned GPUs)Steps
api: "ollama"but point to differentbaseUrlvalues.11434) and a subagent likenexusto provider B (11435)./api/psactivity on both instances.Expected
ollama-5090/...should send its request to11435and only touch the shard GPU/instance.Actual
11434, unloading/reloading the wrong model on the wrong GPU even though the selected provider appeared to beollama-5090.Evidence
Attach at least one:
Human Verification (required)
nexusrun recordedprovider: "ollama-5090"/api/psactivity advanced on11435while11434stayed unchangedapi: "ollama"providers with differentbaseUrlvaluesReview Conversations
Compatibility / Migration
Yes/No) YesYes/No) NoYes/No) NoRisks and Mitigations
Risk: Shared Ollama transport resolution now depends on the active model/provider boundary at call time and could affect other custom Ollama provider combinations.
Risk: Child-agent fallback scoping may change behavior for setups that were unintentionally relying on parent fallback/provider bleed.