test(agents): include Ollama in small live model matrix #87838
Codex review: needs maintainer review before merge. Reviewed May 31, 2026, 9:27 PM ET / 01:27 UTC.
Summary
PR surface: Source +1, Tests +817, Docs +1. Total +819 across 4 files.
Reproducibility: not applicable; this is a test matrix expansion rather than a bug report. The PR body supplies after-patch live Ollama Cloud, focused test, Crabbox, and CI proof instead of a current-main failure reproduction.
Review metrics: 2 noteworthy metrics.
Merge readiness: Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
Review details
Best possible solution: Land this only if maintainers want the curated small direct live matrix to exercise Ollama; keep gateway-level Ollama smoke as a separate lane or follow-up if needed.
Do we have a high-confidence way to reproduce the issue? Not applicable; this is a test matrix expansion rather than a bug report. The PR body supplies after-patch live Ollama Cloud, focused test, Crabbox, and CI proof instead of a current-main failure reproduction.
Is this the best way to solve the issue? Yes, with maintainer acceptance of the automation scope. The branch keeps Ollama-specific handling in the direct live-test harness, uses the plugin runtime-api barrel, and leaves gateway-level Ollama smoke as a reasonable separate follow-up.
AGENTS.md: found and applied where relevant.
Codex review notes: model gpt-5.5, reasoning high; reviewed against 52c809a759f3.
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.
f142472 to 23909b4
23909b4 to 2b4fb34
2b4fb34 to 9920539
9920539 to 815cccb
142530e to 82cfd3e
49ec1a3 to 14c0ace
52edd1b to 47e3da2
432c802 to c3abae1
c3abae1 to 919413c
Maintainer verification for head
Landing with maintainer override on exact-head local proof because the change is test harness/docs scoped and the queued CI has not produced a code failure.
* test(agents): include Ollama in small live model matrix
* test: avoid Ollama cloud key in local live runs
* test: recognize Ollama env secret refs
* test: type Ollama live key fixtures
* test: prevent Ollama cloud auth in local live probes
* test: preserve equivalent Ollama live credentials
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Summary
OLLAMA_API_KEY, and use remote credentials only for remote/Ollama Cloud endpoints.
Verification
git diff --check origin/main...HEAD - passed on head 14c0ace5b76.
node_modules/.bin/oxfmt --check --threads=1 docs/help/testing-live.md src/agents/live-model-filter.ts src/agents/model-compat.test.ts src/agents/models.profiles.live.test.ts - passed.
node scripts/run-oxlint.mjs src/agents/live-model-filter.ts src/agents/model-compat.test.ts src/agents/models.profiles.live.test.ts - passed.
node scripts/check-docs-mdx.mjs docs/help/testing-live.md - passed.
node scripts/run-vitest.mjs src/agents/model-compat.test.ts src/agents/models.profiles.live.test.ts --reporter=dot - passed; the non-live agent config picked up model-compat.test.ts, 57 tests.
node scripts/test-live.mjs --quiet -- src/agents/models.profiles.live.test.ts --reporter=dot - passed, 33 tests; provider sweep intentionally skipped without OPENCLAW_LIVE_MODELS.
OPENCLAW_LIVE_MODELS=small OPENCLAW_LIVE_PROVIDERS=ollama OPENCLAW_LIVE_OLLAMA_BASE_URL=https://ollama.com OPENCLAW_LIVE_MAX_MODELS=1 OPENCLAW_LIVE_MODEL_CONCURRENCY=1 OPENCLAW_LIVE_MODEL_TIMEOUT_MS=45000 OPENCLAW_LIVE_TEST_TIMEOUT_MS=120000 node scripts/test-live.mjs --quiet -- src/agents/models.profiles.live.test.ts --reporter=dot - passed, 33 tests; ollama/gemma3:4b completed prompt, file-read, and image probes.
ollama/gemma3:4b completed prompt, file-read, and image probes again.
node scripts/run-vitest.mjs src/gateway/server-runtime-state.test.ts src/gateway/server-startup-session-migration.test.ts src/gateway/server-startup-web-fetch-bind.test.ts src/gateway/server.lazy.test.ts --reporter=dot - passed, 4 files / 7 tests.
checks-node-agentic-control-plane-startup-runtime - passed in 42s: https://github.com/openclaw/openclaw/actions/runs/26721310412/job/78759461978.
aws, lease cbx_0b77e578cbc5, slug coral-lobster, run run_3aae1ee99dfa, machine c7a.8xlarge, exit 0, leaseStopped=true; pnpm check:changed passed lanes core, coreTests, and docs.
aws, lease cbx_f3263a9c4953, slug coral-lobster, run run_f36bcaa35ea6, machine c7a.8xlarge, exit 0, leaseStopped=true.
.agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main - first run found the legacy openai-codex/* doc example; fixed. Final rerun was clean, no accepted/actionable findings, overall: patch is correct (0.84).
Real behavior proof
Behavior addressed: the direct live model matrix now covers a constrained Ollama small-model route and can prove prompt, file-read, and image behavior against Ollama Cloud or a local Ollama endpoint without treating local endpoints as secret-bearing cloud calls.
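As a rough illustration of the local-versus-cloud credential rule described above, the following TypeScript sketch decides whether an Ollama base URL is local and attaches OLLAMA_API_KEY only for remote endpoints. The helper names (isLocalOllamaEndpoint, resolveOllamaApiKey) and the set of hostnames treated as local are assumptions made for this example; they are not taken from the PR's source files, and the real harness may classify endpoints differently.

```typescript
// Hypothetical sketch: these names do not come from the PR's source files.
type Env = Record<string, string | undefined>;

// Hostnames treated as "local" for this illustration (an assumption).
const LOCAL_HOSTNAMES = new Set(["localhost", "127.0.0.1", "[::1]"]);

// Returns true when the base URL points at a loopback host.
function isLocalOllamaEndpoint(baseUrl: string): boolean {
  const { hostname } = new URL(baseUrl);
  return LOCAL_HOSTNAMES.has(hostname);
}

// Returns the API key only for remote endpoints, so a local probe
// never carries the cloud credential.
function resolveOllamaApiKey(baseUrl: string, env: Env): string | undefined {
  if (isLocalOllamaEndpoint(baseUrl)) return undefined;
  const key = env.OLLAMA_API_KEY?.trim();
  return key ? key : undefined;
}
```

Under these assumptions, a call with http://127.0.0.1:11434 yields no key even when OLLAMA_API_KEY is set, while https://ollama.com yields the configured key.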
Real environment tested: local linked OpenClaw worktree for source, live setup, and Ollama Cloud proof; AWS Crabbox Linux for check:changed on the amended PR head; GitHub Actions rerun for the startup-runtime shard after a transient hang.
Exact steps or command run after this patch: node scripts/run-vitest.mjs src/agents/model-compat.test.ts src/agents/models.profiles.live.test.ts --reporter=dot; node scripts/test-live.mjs --quiet -- src/agents/models.profiles.live.test.ts --reporter=dot; OPENCLAW_LIVE_MODELS=small OPENCLAW_LIVE_PROVIDERS=ollama OPENCLAW_LIVE_OLLAMA_BASE_URL=https://ollama.com OPENCLAW_LIVE_MAX_MODELS=1 OPENCLAW_LIVE_MODEL_CONCURRENCY=1 OPENCLAW_LIVE_MODEL_TIMEOUT_MS=45000 OPENCLAW_LIVE_TEST_TIMEOUT_MS=120000 node scripts/test-live.mjs --quiet -- src/agents/models.profiles.live.test.ts --reporter=dot; AWS Crabbox run run_3aae1ee99dfa with pnpm check:changed; GitHub Actions job 78759461978.
Evidence after fix: repeated local Ollama Cloud live runs selected ollama/gemma3:4b and completed prompt, file-read, and image probes; live-suite setup tests passed 33/33; compatibility tests passed 57/57; AWS Crabbox check:changed passed on head 14c0ace5b76; rerun startup-runtime CI shard passed.
Observed result after fix: OPENCLAW_LIVE_MODELS=small OPENCLAW_LIVE_PROVIDERS=ollama selects the curated Ollama model, hydrates the right live provider config, registers the Ollama runtime stream in-process, and completes the small-model probes.
What was not tested: gateway-level live agent smoke for Ollama was not run in this PR; this PR extends the direct live model matrix only.
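For readers unfamiliar with how the environment variables in the live run fit together, here is a minimal TypeScript sketch of reading them into a sweep configuration. The variable names are the ones used in the commands above; everything else (the LiveSweepConfig shape, the readLiveSweepConfig and parsePositiveInt helpers, and the comma-separated provider parsing) is assumed for illustration and may not match the actual harness. The only behavior taken from the PR is that the provider sweep is skipped when OPENCLAW_LIVE_MODELS is unset.

```typescript
// Hypothetical sketch: the config shape and helper names are assumptions.
type Env = Record<string, string | undefined>;

interface LiveSweepConfig {
  models: string;          // e.g. "small"
  providers: string[];     // e.g. ["ollama"]; comma-separated parsing is assumed
  maxModels?: number;
  concurrency?: number;
  modelTimeoutMs?: number;
  testTimeoutMs?: number;
}

// Parses a positive integer, returning undefined for missing or invalid input.
function parsePositiveInt(raw: string | undefined): number | undefined {
  if (raw === undefined) return undefined;
  const n = Number.parseInt(raw, 10);
  return Number.isFinite(n) && n > 0 ? n : undefined;
}

// Returns undefined when OPENCLAW_LIVE_MODELS is unset, mirroring the
// "provider sweep intentionally skipped" behavior noted in the PR.
function readLiveSweepConfig(env: Env): LiveSweepConfig | undefined {
  const models = env.OPENCLAW_LIVE_MODELS?.trim();
  if (!models) return undefined;
  const providers = (env.OPENCLAW_LIVE_PROVIDERS ?? "")
    .split(",")
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
  return {
    models,
    providers,
    maxModels: parsePositiveInt(env.OPENCLAW_LIVE_MAX_MODELS),
    concurrency: parsePositiveInt(env.OPENCLAW_LIVE_MODEL_CONCURRENCY),
    modelTimeoutMs: parsePositiveInt(env.OPENCLAW_LIVE_MODEL_TIMEOUT_MS),
    testTimeoutMs: parsePositiveInt(env.OPENCLAW_LIVE_TEST_TIMEOUT_MS),
  };
}
```

Passing the values from the live command (small, ollama, 1, 1, 45000, 120000) produces a config restricted to one Ollama model with a 45-second per-model timeout; passing an empty environment returns undefined, which a caller could treat as "skip the sweep".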