Skip to content

fix(agents): route multi-instance Ollama providers by model baseUrl#63218

Closed
matthewpapa07 wants to merge 2 commits into
openclaw:mainfrom
matthewpapa07:matt/ollama-shared-api-routing
Closed

fix(agents): route multi-instance Ollama providers by model baseUrl#63218
matthewpapa07 wants to merge 2 commits into
openclaw:mainfrom
matthewpapa07:matt/ollama-shared-api-routing

Conversation

@matthewpapa07

Copy link
Copy Markdown

Summary

  • Problem: With at least two separate Ollama instances it seems that Openclaw forces all routing to the first instance it finds.
  • Why it matters: In multi-GPU/ollama setups this can unload/reload the wrong model on the first Ollama instance, break provider isolation, and make subagents silently run on the parent GPU instead of the configured shard GPU. Generally speaking this breaks a ton of use cases for OpenClaw
  • What changed: Shared Ollama transport resolution now uses the active model/provider boundary to choose the correct baseUrl at request time, and dedicated-agent fallback resolution is kept scoped to the child agent instead of bleeding into the parent session context.
  • What did NOT change (scope boundary): This PR does not change Ollama model loading/warmup behavior, timeouts, or first-request latency; it only fixes provider routing and child fallback scoping.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes # (none)
  • Related # (none)
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: Custom providers sharing api: "ollama" could fall through a shared transport registry path that effectively closed over the wrong Ollama baseUrl, so provider/model selection looked correct while the request still went to the wrong backend.
  • Missing detection / guardrail: There was no regression coverage for multiple custom Ollama providers sharing the same API family but requiring different transport destinations, and no guardrail covering dedicated child agents inheriting parent fallback/provider context.
  • Contributing context (if known): The bug is easiest to see when different Ollama instances are pinned to different GPUs or host different model inventories, because misrouting causes visible model unload/reload behavior on the wrong instance.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file:
    • src/agents/provider-stream.test.ts
    • src/agents/provider-transport-stream.test.ts
    • src/agents/pi-embedded-runner/stream-resolution.test.ts
    • src/agents/agent-scope.test.ts
  • Scenario the test should lock in: When two custom providers share api: "ollama" but point at different baseUrl values, the selected provider/model must resolve to the correct transport destination at run time; additionally, a dedicated child agent without configured fallbacks must not reuse parent fallback/provider context.
  • Why this is the smallest reliable guardrail: The bug lives at the provider stream resolution boundary and child agent scope resolution boundary, so targeted stream/scope tests catch it without requiring a full live Ollama environment.
  • Existing test that already covers this (if any): None before this PR.
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

  • Multi-instance Ollama providers now route by the configured provider/model baseUrl instead of potentially landing on the first registered Ollama backend.
  • Dedicated agents without configured fallbacks no longer accidentally inherit parent provider/model fallback state.
  • Docs now explicitly describe the multi-instance Ollama behavior and routing expectation.

Diagram (if applicable)

Before:
[subagent selects ollama-5090/model] -> [shared ollama transport closes over first provider] -> [request hits 11434] -> [wrong GPU/model reload]

After:
[subagent selects ollama-5090/model] -> [transport resolves model/provider boundary at call time] -> [request hits 11435] -> [correct GPU/provider]

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: Linux
  • Runtime/container: OpenClaw Gateway v2026.4.7; self-hosted Ollama on a separate host
  • Model/provider:
    • ollama -> http://127.0.0.1:11434
    • ollama-5090 / ollama_shard -> http://127.0.0.1:11435
    • Repro model: qwen3.5:27b
  • Integration/channel (if any): Local gateway / agent runtime
  • Relevant config (redacted):
    • main agent on ollama/qwen3.5:27b
    • nexus / shard on ollama-5090/qwen3.5:27b
    • dual Ollama providers share api: "ollama" but use different baseUrl values (and different pinned GPUs)

Steps

  1. Configure two custom Ollama providers that both use api: "ollama" but point to different baseUrl values.
  2. Route the main agent to provider A (11434) and a subagent like nexus to provider B (11435).
  3. Trigger a subagent run and inspect the provider/model metadata plus Ollama /api/ps activity on both instances.

Expected

  • A subagent configured for ollama-5090/... should send its request to 11435 and only touch the shard GPU/instance.

Actual

  • Before the fix, the run could still hit 11434, unloading/reloading the wrong model on the wrong GPU even though the selected provider appeared to be ollama-5090.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios:
    • reproduced the original dual-Ollama misrouting locally before the fix
    • verified after the fix that a nexus run recorded provider: "ollama-5090"
    • verified that /api/ps activity advanced on 11435 while 11434 stayed unchanged
    • verified dedicated-agent fallback scoping with targeted tests
  • Edge cases checked:
    • shared api: "ollama" providers with different baseUrl values
    • child agent with no explicit fallbacks
  • What you did not verify:
    • full end-to-end coverage across every provider family that uses shared transport registration
    • separate Ollama model warmup/load timeout behavior (observed locally, but out of scope for this PR)

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: Shared Ollama transport resolution now depends on the active model/provider boundary at call time and could affect other custom Ollama provider combinations.

    • Mitigation: Added targeted regression tests for shared-provider stream resolution and documented the intended multi-instance behavior.
  • Risk: Child-agent fallback scoping may change behavior for setups that were unintentionally relying on parent fallback/provider bleed.

    • Mitigation: The change is limited to dedicated child-agent scope resolution and aligns behavior with explicit per-agent config rather than accidental inheritance.

@openclaw-barnacle openclaw-barnacle Bot added docs Improvements or additions to documentation gateway Gateway runtime agents Agent runtime and tooling size: M labels Apr 8, 2026
@greptile-apps

greptile-apps Bot commented Apr 8, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes multi-instance Ollama provider routing by replacing a closed-over static URL pattern with a createBoundaryAwareOllamaStreamFn that reads model.baseUrl dynamically at call time. It also introduces hasDedicatedAgentPrimaryWithoutFallbackOverride to prevent dedicated child agents from inheriting global fallback state. The Ollama transport fix is clean and correct; the fallback scoping change is intentional and well-tested.

Confidence Score: 5/5

Safe to merge; the core routing fix is correct and well-tested, and the only remaining findings are P2 documentation clarity issues.

The Ollama boundary-aware stream function reads model.baseUrl at call time, which is the correct fix for the shared-transport closure bug. The dedicated-agent fallback scoping is intentional, tested, and the behavior change is acknowledged in the PR description. All remaining findings (docs gap for the no-global-primary edge case) are P2 and do not affect correctness.

docs/gateway/configuration-reference.md and docs/concepts/model-failover.md — both miss the edge case where agents.defaults.model.primary is unset; src/agents/agent-scope.ts lines 213-214 for the corresponding code condition.

Vulnerabilities

No security concerns identified. The change only affects internal provider routing and fallback resolution logic; no new network surfaces, token handling, or privilege boundaries are introduced.

Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/agents/agent-scope.ts
Line: 213-214

Comment:
**Dedicated-lane trigger covers undocumented case**

When `agents.defaults.model` has no `primary` but does have `fallbacks`, any agent with an explicit `primary` is treated as a dedicated lane and receives `[]` fallbacks. This is broader than what both doc updates describe ("primary that differs from `agents.defaults.model.primary`"). The test for `cfgInheritDefaults` (no global primary, global fallbacks set, `linus` has own primary) confirms this path is hit, but users whose config matches that pattern will see a silent behavior change with no guidance in the docs.

Consider either narrowing the condition to require a global primary to exist, or adding a sentence in the updated doc sections noting that this also applies when `agents.defaults.model.primary` is unset.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: docs/gateway/configuration-reference.md
Line: 1565

Comment:
**Docs miss the no-global-primary edge case**

The new sentence says "Agents with a dedicated `primary` that differs from `agents.defaults.model.primary`…" but `hasDedicatedAgentPrimaryWithoutFallbackOverride` also triggers — and suppresses global fallbacks — when `agents.defaults.model.primary` is absent altogether, as long as `agents.defaults.model.fallbacks` is non-empty. The same gap exists in `docs/concepts/model-failover.md`. A user who relies on a config like `defaults: { model: { fallbacks: ["..."] } }` plus an agent-level `primary` will see their fallbacks silently drop after this change without a clear pointer in the docs.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix(agents): route multi-instance Ollama..." | Re-trigger Greptile

Comment thread src/agents/agent-scope.ts
Comment on lines +213 to +214
if (!defaultPrimary) {
return resolveAgentModelFallbackValues(cfg.agents?.defaults?.model).length > 0;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Dedicated-lane trigger covers undocumented case

When agents.defaults.model has no primary but does have fallbacks, any agent with an explicit primary is treated as a dedicated lane and receives [] fallbacks. This is broader than what both doc updates describe ("primary that differs from agents.defaults.model.primary"). The test for cfgInheritDefaults (no global primary, global fallbacks set, linus has own primary) confirms this path is hit, but users whose config matches that pattern will see a silent behavior change with no guidance in the docs.

Consider either narrowing the condition to require a global primary to exist, or adding a sentence in the updated doc sections noting that this also applies when agents.defaults.model.primary is unset.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/agent-scope.ts
Line: 213-214

Comment:
**Dedicated-lane trigger covers undocumented case**

When `agents.defaults.model` has no `primary` but does have `fallbacks`, any agent with an explicit `primary` is treated as a dedicated lane and receives `[]` fallbacks. This is broader than what both doc updates describe ("primary that differs from `agents.defaults.model.primary`"). The test for `cfgInheritDefaults` (no global primary, global fallbacks set, `linus` has own primary) confirms this path is hit, but users whose config matches that pattern will see a silent behavior change with no guidance in the docs.

Consider either narrowing the condition to require a global primary to exist, or adding a sentence in the updated doc sections noting that this also applies when `agents.defaults.model.primary` is unset.

How can I resolve this? If you propose a fix, please make it concise.

Comment thread docs/gateway/configuration-reference.md Outdated
- `id`: stable agent id (required).
- `default`: when multiple are set, first wins (warning logged). If none set, first list entry is default.
- `model`: string form overrides `primary` only; object form `{ primary, fallbacks }` overrides both (`[]` disables global fallbacks). Cron jobs that only override `primary` still inherit default fallbacks unless you set `fallbacks: []`.
- `model`: string form overrides `primary` only; object form `{ primary, fallbacks }` overrides both (`[]` disables global fallbacks). Agents with a dedicated `primary` that differs from `agents.defaults.model.primary` no longer implicitly fall back to the global primary; set `fallbacks` explicitly if you want that behavior. Cron jobs that only override `primary` still inherit default fallbacks unless you set `fallbacks: []`.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Docs miss the no-global-primary edge case

The new sentence says "Agents with a dedicated primary that differs from agents.defaults.model.primary…" but hasDedicatedAgentPrimaryWithoutFallbackOverride also triggers — and suppresses global fallbacks — when agents.defaults.model.primary is absent altogether, as long as agents.defaults.model.fallbacks is non-empty. The same gap exists in docs/concepts/model-failover.md. A user who relies on a config like defaults: { model: { fallbacks: ["..."] } } plus an agent-level primary will see their fallbacks silently drop after this change without a clear pointer in the docs.

Prompt To Fix With AI
This is a comment left during a code review.
Path: docs/gateway/configuration-reference.md
Line: 1565

Comment:
**Docs miss the no-global-primary edge case**

The new sentence says "Agents with a dedicated `primary` that differs from `agents.defaults.model.primary`…" but `hasDedicatedAgentPrimaryWithoutFallbackOverride` also triggers — and suppresses global fallbacks — when `agents.defaults.model.primary` is absent altogether, as long as `agents.defaults.model.fallbacks` is non-empty. The same gap exists in `docs/concepts/model-failover.md`. A user who relies on a config like `defaults: { model: { fallbacks: ["..."] } }` plus an agent-level `primary` will see their fallbacks silently drop after this change without a clear pointer in the docs.

How can I resolve this? If you propose a fix, please make it concise.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 68531b3d13

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/agents/agent-scope.ts
Comment on lines +217 to +218
normalizeLowercaseStringOrEmpty(explicitPrimary) !==
normalizeLowercaseStringOrEmpty(defaultPrimary)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Compare canonical model refs before marking dedicated lanes

hasDedicatedAgentPrimaryWithoutFallbackOverride currently decides dedicated-lane behavior by lowercasing raw primary strings, so equivalent model refs written in different valid forms (for example "gpt-5.4" vs "openai/gpt-5.4") are treated as different. In that scenario resolveEffectiveModelFallbacks now returns [], which silently disables inherited fallback candidates for agents that did not actually change models—only notation—leading to unexpected fallback failures after this commit.

Useful? React with 👍 / 👎.

@steipete

Copy link
Copy Markdown
Contributor

Thanks for the detailed repro and test plan. This is now superseded by current main.

The multi-instance/custom-provider Ollama routing bug is fixed on main through the provider API-owner path plus provider-specific Ollama stream config:

Focused verification passed: pnpm test src/plugins/provider-runtime.test.ts extensions/ollama/index.test.ts.

Closing this PR as superseded. The final mainline fix is narrower than this branch and avoids carrying the unrelated fallback-scope/docs changes from this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling docs Improvements or additions to documentation gateway Gateway runtime size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants