fix(inference): set compat.supportsUsageInStreaming for ollama-local (#2747)#3683
Conversation
…dboxInferenceConfig (#2747) Follow-up to #3678. That PR added a Python-side conditional in `generate-openclaw-config.py` that toggled `supportsUsageInStreaming` when `NEMOCLAW_PROVIDER_KEY` was `"ollama"` or `"ollama-local"`. However, in real onboard runs `getSandboxInferenceConfig("model", "ollama-local")` hits the default switch arm and returns `providerKey = "inference"` — the same value used by all managed-inference providers — so the build-time `NEMOCLAW_PROVIDER_KEY` is never `"ollama"` and the conditional was dead code. The earlier regression tests in `test/generate-openclaw-config.test.ts` passed only because they set `NEMOCLAW_PROVIDER_KEY=ollama` directly, which never happens in production. Verified locally with the python script and the real env values NemoClaw passes through at docker build time (`NEMOCLAW_PROVIDER_KEY=inference`, `NEMOCLAW_INFERENCE_BASE_URL=https://inference.local/v1`, `NEMOCLAW_INFERENCE_COMPAT_B64=e30=` for an empty compat): the generated `openclaw.json` has `model.compat = null`, confirming the flag never landed. Move the wiring to the proper layer: add an explicit `"ollama-local"` case in `getSandboxInferenceConfig` that sets `inferenceCompat = { supportsUsageInStreaming: true }`. The existing encode-into-`NEMOCLAW_INFERENCE_COMPAT_B64` path then carries the flag into the python script's `inference_compat` dict, which writes it onto `model.compat` exactly as before — except now it actually fires for ollama-local builds. Cloud providers and other local backends keep their existing `inferenceCompat` (or absence thereof). Drop the now-redundant python conditional, drop the three unrealistic-fixture vitest cases from `test/generate-openclaw-config.test.ts`, and add two new ones in `src/lib/inference/config.test.ts` that exercise `getSandboxInferenceConfig` with realistic provider names. Test plan - `npx vitest run src/lib/inference/config.test.ts` — 24/24 pass, including the two new ones (`forces supportsUsageInStreaming for ollama-local`, `does not force supportsUsageInStreaming for non-ollama providers`) - `npx vitest run test/generate-openclaw-config.test.ts` — 67/67 pass (was 70/70 with the three deleted unrealistic cases) - Direct script run with prod-shaped env values: `NEMOCLAW_INFERENCE_COMPAT_B64={"supportsUsageInStreaming":true}` (the shape TypeScript will emit for ollama-local) → `model.compat = {'supportsUsageInStreaming': True}` in generated openclaw.json ✓ Signed-off-by: Shawn Xie <shaxie@nvidia.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughRelocates Ollama/ ChangesOllama supportsUsageInStreaming migration
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
E2E Advisor RecommendationRequired E2E: None Full advisor summaryE2E Recommendation AdvisorBase: Required E2E
Optional E2E
New E2E recommendations
|
|
Additional empirical evidence — confirmed by running a fresh ollama-local onboard on the current main (post-#3678) and pulling ``` So on main today ( Onboard command used: Environment: Ubuntu 24.04, x86_64, A100x8 + Docker 29.4.2 + Node 22.22.2 + Ollama 0.24.0 (loopback), OpenShell 0.0.39, NemoClaw v0.0.44+ from current main. After landing this PR, the same onboard should produce |
Resolve conflicts from the generate-openclaw-config .py->.mts migration (landed on main in 9641ce0): - scripts/generate-openclaw-config.py: deleted on main. This PR had only removed the dead `provider_key in {ollama,ollama-local}` block from it, so accept the deletion — the file is gone. - test/generate-openclaw-config.test.ts: rewritten on main. This PR only deleted the old .py-era tests here (+0/-64), which no longer exist in the rewritten file, so take main's version (keeps main's new tests). The real fix is unchanged and lives in src/lib/inference/config.ts: getSandboxInferenceConfig("model","ollama-local") sets inferenceCompat.supportsUsageInStreaming=true, which flows to NEMOCLAW_INFERENCE_COMPAT_B64 (dockerfile-patch.ts) and into openclaw.json. Note main's migration re-ported the same provider-key -gated block into the .mts script; that block is inert in production (NEMOCLAW_PROVIDER_KEY resolves to "inference" for ollama-local), so the two coexist and config.ts is the layer that actually fires. Verified: build:cli clean; 130/130 tests pass across src/lib/inference/config.test.ts and test/generate-openclaw-config.test.ts. Signed-off-by: Shawn Xie <shaxie@nvidia.com> Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
E2E Scenario Advisor RecommendationRequired scenario E2E: None Full scenario advisor summaryE2E Scenario AdvisorBase: Required scenario E2E
Optional scenario E2E
Relevant changed files
|
PR Review AdvisorFindings: 0 needs attention, 0 worth checking, 0 nice ideas This is an automated advisory review. A human maintainer must make the final merge decision. |
|
Addressed the latest advisory feedback:
Validation run locally:
|
Summary
getSandboxInferenceConfig(\"model\", \"ollama-local\")falls through to the default switch arm and returnsproviderKey = \"inference\"(the same value all managed-inference providers use), so the build-timeNEMOCLAW_PROVIDER_KEYis never\"ollama\"/\"ollama-local\"and the conditional never fired. The earlier vitest cases only passed because they setNEMOCLAW_PROVIDER_KEY=ollamadirectly — an env shape no real onboard produces.case \"ollama-local\":ingetSandboxInferenceConfigthat setsinferenceCompat = { supportsUsageInStreaming: true }. The existingNEMOCLAW_INFERENCE_COMPAT_B64round-trip then carries the flag intomodel.compatexactly as it does forkimi-k2.6-managed-inference.jsonandcompatible-endpoint.scripts/generate-openclaw-config.py, drop the three unrealistic-fixture vitest cases intest/generate-openclaw-config.test.ts, and add two new ones insrc/lib/inference/config.test.tsthat exercise the real provider name.Evidence the previous fix was inert
Direct run of
scripts/generate-openclaw-config.pywith the env values NemoClaw actually passes at docker build time for an ollama-local onboard (post-#3678 on main):```
NEMOCLAW_PROVIDER_KEY=inference
NEMOCLAW_PRIMARY_MODEL_REF=inference/qwen2.5:0.5b
NEMOCLAW_INFERENCE_BASE_URL=https://inference.local/v1
NEMOCLAW_INFERENCE_API=openai-completions
NEMOCLAW_INFERENCE_COMPAT_B64=$(echo -n '{}' | base64)
```
generates an
openclaw.jsonwithprovider.models[0].compat = null. Theif provider_key in {\"ollama\", \"ollama-local\"}:conditional never matched, so thesupportsUsageInStreamingflag was not set. TUI token counter would stay?onnemoclaw onboard --provider ollamabuilds in main today.After this PR (same env, but TypeScript now also encodes
inferenceCompat = {\"supportsUsageInStreaming\": true}intoNEMOCLAW_INFERENCE_COMPAT_B64for ollama-local):```
NEMOCLAW_INFERENCE_COMPAT_B64=$(echo -n '{"supportsUsageInStreaming":true}' | base64)
```
openclaw.jsonnow hasprovider.models[0].compat = {\"supportsUsageInStreaming\": true}✓.Test plan
Re-opens — and properly fixes — #2747.
Signed-off-by: Shawn Xie shaxie@nvidia.com
Summary by CodeRabbit
Refactor
Tests