fix(inference): sync anthropic runtime routes #4847
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
PR Review Advisor
Findings: 1 needs attention, 7 worth checking, 0 nice ideas
This is an automated advisory review. A human maintainer must make the final merge decision.
Selective E2E Results: ❌ Some jobs failed. Run: 27011133903
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Selective E2E Results: ❌ Some jobs failed. Run: 27011476971
Selective E2E Results: ✅ All requested jobs passed. Run: 27011207086
Selective E2E Results: ❌ Some jobs failed. Run: 27011569263
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Selective E2E Results: ✅ All requested jobs passed. Run: 27011853362
Selective E2E Results: ✅ All requested jobs passed. Run: 27011599401
Selective E2E Results: ❌ Some jobs failed. Run: 27011860480
📝 Walkthrough
Adds runtime inference API types and a resolver, maps the resolved API to Hermes/OpenClaw sandbox patches and session metadata, extends unit tests, adds E2E Anthropic mock/provider helpers and script changes, updates transient retry detection, and wires two nightly CI jobs for Anthropic inference-switch coverage.
Changes: Anthropic Inference API Switching Feature
Sequence Diagram(s)
sequenceDiagram
participant runInferenceSet
participant resolveRuntimeInferenceApi
participant patchHermesInferenceConfig
participant patchOpenClawInferenceConfig
participant updateMatchingOnboardSession
runInferenceSet->>resolveRuntimeInferenceApi: compute preferredInferenceApi(agentName, entry, session, provider)
resolveRuntimeInferenceApi-->>runInferenceSet: InferenceApi | null
alt agent is Hermes
runInferenceSet->>patchHermesInferenceConfig: apply preferredInferenceApi
else agent is OpenClaw
runInferenceSet->>patchOpenClawInferenceConfig: apply preferredInferenceApi
end
runInferenceSet->>updateMatchingOnboardSession: persist patched.route.inferenceApi
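The flow in the diagram, together with the Hermes `api_mode` handling described in the PR summary, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `src/lib/actions/inference-set.ts`: the `InferenceApi` string values, the resolver's precedence (registry entry first, then session), the simplified function signatures, and the `openai_responses` value are all hypothetical. Only `api_mode: anthropic_messages` and the function names come from the PR text.

```typescript
// Hypothetical labels for the three API families named in this PR; the real
// InferenceApi union may use different strings.
type InferenceApi = "anthropic-messages" | "openai-responses" | "openai-completions";
type AgentName = "hermes" | "openclaw";

interface HermesModelConfig {
  model: string;
  api_mode?: string; // left unset for OpenAI-style chat completions
}

interface SwitchResult {
  patchedAgent: AgentName;
  sessionInferenceApi: InferenceApi | null; // value that would be persisted to the session
  hermesConfig?: HermesModelConfig;
}

// Assumed precedence: an explicit registry entry wins, then the recorded session.
function resolveRuntimeInferenceApi(
  entryApi: InferenceApi | null,
  sessionApi: InferenceApi | null,
): InferenceApi | null {
  return entryApi ?? sessionApi;
}

// Sets api_mode for Anthropic Messages and OpenAI Responses routes and drops
// any stale api_mode for every other case.
function patchHermesInferenceConfig(
  current: HermesModelConfig,
  api: InferenceApi | null,
): HermesModelConfig {
  const next: HermesModelConfig = { ...current };
  delete next.api_mode;
  if (api === "anthropic-messages") {
    next.api_mode = "anthropic_messages";
  } else if (api === "openai-responses") {
    next.api_mode = "openai_responses"; // placeholder value, not confirmed by the PR
  }
  return next;
}

function runInferenceSet(
  agent: AgentName,
  entryApi: InferenceApi | null,
  sessionApi: InferenceApi | null,
  hermesConfig: HermesModelConfig,
): SwitchResult {
  const preferred = resolveRuntimeInferenceApi(entryApi, sessionApi);
  const result: SwitchResult = { patchedAgent: agent, sessionInferenceApi: preferred };
  if (agent === "hermes") {
    result.hermesConfig = patchHermesInferenceConfig(hermesConfig, preferred);
  }
  // An OpenClaw branch would apply the same preferred API to its own config here.
  return result;
}

const switched = runInferenceSet("hermes", "anthropic-messages", null, { model: "example-model" });
console.log(switched.hermesConfig);
```

Resolving the API family once and passing the same value to both the sandbox patch and the session update keeps the two from drifting apart, which is the point of the "sync" in the PR title.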
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/nightly-e2e.yaml:
- Around line 1332-1344: The openclaw-anthropic-inference-switch-e2e job's
env_json is missing the NEMOCLAW_AGENT key; update the env_json value for the
openclaw-anthropic-inference-switch-e2e job to include
"NEMOCLAW_AGENT":"openclaw" (mirroring how the hermes job sets "hermes"), making
sure to insert the new key/value into the existing JSON string in the env_json
field and preserve proper commas/quoting so the resulting string remains valid
JSON.
In `@test/e2e/lib/anthropic-switch-provider.sh`:
- Around line 135-145: The script currently runs "openshell provider update -g
nemoclaw compatible-anthropic-endpoint ..." or "openshell provider create -g
nemoclaw --name compatible-anthropic-endpoint ..." but doesn't check their exit
status before reporting success; update the block so that immediately after each
openshell provider update and openshell provider create command you check their
exit code (e.g., via "$?" or using a conditional) and on non-zero print a clear
error to stderr and exit with a non-zero status (or return non-zero) so failures
in openshell provider update/create are propagated instead of being masked by
the later success message.
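The first finding above asks that `env_json` remain valid JSON once `NEMOCLAW_AGENT` is added. A minimal TypeScript sketch of that sanity check follows. `parseEnvJson`, `hasAgent`, and `EXAMPLE_EXISTING_KEY` are hypothetical names used only for illustration; the job's real `env_json` contents are not shown in this review.

```typescript
// Parse an env_json string and require a plain JSON object at the top level.
function parseEnvJson(raw: string): Record<string, unknown> {
  // JSON.parse throws a SyntaxError on a stray comma or an unbalanced quote.
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("env_json must be a JSON object");
  }
  return parsed as Record<string, unknown>;
}

// True when the parsed object carries the expected agent name.
function hasAgent(raw: string, agent: string): boolean {
  return parseEnvJson(raw)["NEMOCLAW_AGENT"] === agent;
}

// EXAMPLE_EXISTING_KEY stands in for whatever keys the job already sets.
const openclawEnvJson = '{"EXAMPLE_EXISTING_KEY":"1","NEMOCLAW_AGENT":"openclaw"}';
console.log(hasAgent(openclawEnvJson, "openclaw"));
```

A check like this catches both failure modes the comment warns about: a missing key returns `false`, and a broken comma or quote makes `JSON.parse` throw.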
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 73ed0c81-401e-46b2-b20e-5518c3565aa0
📒 Files selected for processing (7)
- .github/workflows/nightly-e2e.yaml
- src/lib/actions/inference-set.test.ts
- src/lib/actions/inference-set.ts
- test/e2e/lib/anthropic-switch-provider.sh
- test/e2e/lib/inference-switch-retry.sh
- test/e2e/test-hermes-inference-switch.sh
- test/e2e/test-openclaw-inference-switch.sh
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Selective E2E Results: ✅ All requested jobs passed. Run: 27013038114
Selective E2E Results: ❌ Some jobs failed. Run: 27013044040
Selective E2E Results: ✅ All requested jobs passed. Run: 27013103549
Selective E2E Results: ❌ Some jobs failed. Run: 27014726579
Selective E2E Results: ✅ All requested jobs passed. Run: 27014755537
Selective E2E Results: ✅ All requested jobs passed. Run: 27014838796
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
test/e2e/lib/anthropic-switch-provider.sh (1)
170-172: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Error message references wrong variable name.
The error message mentions NEMOCLAW_SWITCH_ENDPOINT_URL but the code checks SWITCH_ENDPOINT_URL. This mismatch could confuse developers debugging test failures.
Suggested fix
 if [ -z "${SWITCH_ENDPOINT_URL:-}" ]; then
-  fail "NEMOCLAW_SWITCH_ENDPOINT_URL is required for compatible Anthropic inference switches"
+  fail "SWITCH_ENDPOINT_URL is required for compatible Anthropic inference switches"
   return 1
 fi
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@test/e2e/lib/anthropic-switch-provider.sh`:
- Around line 170-172: The error message refers to the wrong environment
variable name; update the fail call that currently mentions
NEMOCLAW_SWITCH_ENDPOINT_URL to reference SWITCH_ENDPOINT_URL (or make both
check and message use the canonical env var you intend) so the message matches
the condition checking the SWITCH_ENDPOINT_URL variable in the script.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 735dcc8e-33a1-462d-b54d-a12f8e4cafc0
📒 Files selected for processing (5)
- .github/workflows/nightly-e2e.yaml
- src/lib/actions/inference-route-api.test.ts
- src/lib/actions/inference-route-api.ts
- src/lib/actions/inference-set.ts
- test/e2e/lib/anthropic-switch-provider.sh
🚧 Files skipped from review as they are similar to previous changes (1)
- .github/workflows/nightly-e2e.yaml
Selective E2E Results: ✅ All requested jobs passed. Run: 27023295295
Selective E2E Results: ✅ All requested jobs passed. Run: 27023413644
Selective E2E Results: ✅ All requested jobs passed. Run: 27041002798
## Summary
- Adds the `v0.0.60` section to `docs/about/release-notes.mdx` using the dev announcement from discussion #4877.
- Fills the source-doc gaps found during release-prep review across inference, policy tiers, command behavior, security boundaries, Hermes dashboard/tooling, runtime context, and troubleshooting.
- Refreshes generated agent skills under `.agents/skills/` from the current Fern docs output and upgrades Fern from `5.44.3` to `5.45.0`.

## Source summary
- #4037 -> `docs/reference/architecture.mdx`, `docs/about/how-it-works.mdx`, `docs/about/release-notes.mdx`: Documents system-only runtime context that stays out of visible chat.
- #4875 -> `docs/reference/architecture.mdx`, `docs/about/how-it-works.mdx`, `docs/about/release-notes.mdx`: Documents try-first sandbox network/filesystem guidance and clearer failure classification.
- #4788 -> `docs/security/best-practices.mdx`, `docs/about/release-notes.mdx`: Documents shared OpenClaw device-approval policy for startup and connect.
- #4768 -> `docs/reference/network-policies.mdx`, `docs/network-policy/integration-policy-examples.mdx`, `docs/get-started/quickstart.mdx`, `docs/get-started/quickstart-hermes.mdx`, `docs/reference/commands.mdx`: Documents `weather`, `public-reference`, and Hermes managed-tool gateway preset behavior.
- #3788 and #4864 -> `docs/reference/network-policies.mdx`, `docs/reference/commands.mdx`: Documents non-interactive policy-tier fail-fast behavior and interactive prompt fallback.
- #4756 and #4866 -> `docs/reference/commands.mdx`: Documents env-aware default sandbox resolution for `list`, `status`, and `tunnel` commands.
- #4320 -> `docs/reference/commands.mdx`: Documents `$$nemoclaw tunnel status` behavior.
- #4328 -> `docs/reference/commands.mdx`: Documents line-scoped policy preset descriptions in `policy-list`.
- #4580 and #4748 -> `docs/reference/architecture.mdx`: Documents package-managed OpenShell gateway service and Docker-driver gateway-marker behavior.
- #4598 -> `docs/manage-sandboxes/lifecycle.mdx`: Documents concurrent gateway/dashboard cleanup isolation by sandbox name and port.
- #4777 -> `docs/reference/troubleshooting.mdx`: Documents Docker GPU patch rollback behavior.
- #4610 -> `docs/reference/troubleshooting.mdx`, `docs/reference/commands.mdx`: Keeps mutable OpenClaw config permission guidance aligned and removes skipped experimental wording.
- #4868 -> `docs/reference/commands.mdx`: Keeps `.dockerignore` handling for custom `onboard --from <Dockerfile>` contexts in generated skills.
- #4870 -> `docs/reference/commands.mdx`, `docs/manage-sandboxes/runtime-controls.mdx`: Documents `NEMOCLAW_MINIMAL_BOOTSTRAP` and generated skill coverage.
- #4641 -> `docs/inference/inference-options.mdx`, `docs/reference/troubleshooting.mdx`: Documents local NVIDIA NIM platform-digest pulls and served-model id adoption.
- #4810 and #4867 -> `docs/inference/inference-options.mdx`: Documents stable NGC managed-vLLM image lineage and DGX Station DeepSeek V4 Flash coverage.
- #4852 -> `docs/inference/use-local-inference.mdx`, `docs/reference/troubleshooting.mdx`: Documents Ollama model fit filtering, 16K context floor, cold-load retry, and failed-model exclusion.
- #4847 -> `docs/inference/switch-inference-providers.mdx`: Documents API-family sync, Hermes `api_mode`, and Bedrock Runtime exception.
- #4800 -> `docs/inference/tool-calling-reliability.mdx`: Documents Nemotron managed-inference native tool-search fallback.
- #4333 -> `docs/inference/switch-inference-providers.mdx`: Documents interactive multimodal input prompting.
- #4086 -> `docs/reference/troubleshooting.mdx`: Keeps proxy bypass normalization in generated troubleshooting coverage.
- #4811 and #4855 -> `docs/get-started/quickstart-hermes.mdx`: Documents prebuilt Hermes dashboard assets and TUI recovery without runtime rebuilds.
- #4854 -> `docs/inference/switch-inference-providers.mdx`, `docs/reference/commands.mdx`: Documents Hermes proxy API-key placeholder preservation during inference switches.
- #4248 -> `docs/manage-sandboxes/messaging-channels.mdx`, `.agents/skills/`: Keeps messaging enrollment behavior aligned with manifest-hook implementation.
- #4771 -> `docs/security/best-practices.mdx`, `docs/security/credential-storage.mdx`: Documents Hermes placeholder-only secret boundary for sandbox-visible runtime files.
- #4787 -> `docs/security/best-practices.mdx`, `docs/about/release-notes.mdx`: Documents expanded memory scanner examples for OpenAI project keys and Slack app-level tokens.
- #4848 -> `docs/reference/commands.mdx`: Documents OpenClaw skill install mirroring into the agent home directory.
- #4790 -> `docs/about/release-notes.mdx`: Uses the prior release-prep structure and generated `.agents/skills/` refresh as the template for this release.

## Verification
- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --doc-platform fern-mdx`
- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ skills/ --prefix nemoclaw-user --doc-platform fern-mdx --dry-run`
- `npm run docs`
- `git diff --check`
- skip-term scan across `docs/`, `.agents/skills/`, and `skills/`
- `npm run build:cli`
- `npm run typecheck:cli`
- Commit and pre-push hook suites, including markdownlint, gitleaks, env-var docs gate, docs-to-skills verification, and skills YAML tests

## Summary by CodeRabbit

## Release Notes
* **New Features**
  * DeepSeek-V4-Flash now available as default inference model for DGX Station.
  * Hermes dashboard improved with dedicated port and OAuth-authenticated tool gateway selection.
  * Added weather and public-reference policy presets for expanded agent capabilities.
  * Enhanced Ollama model selection with GPU memory filtering and automatic retry for timeouts.
* **Bug Fixes**
  * Improved policy tier validation to prevent invalid configurations.
  * Better sandbox cleanup scoping by port to prevent conflicts across deployments.
  * Added GPU patch failure recovery with automatic rollback.
* **Documentation**
  * Expanded troubleshooting guides for inference, security, and sandbox lifecycle.
  * Added .dockerignore best practices for custom deployments.

---------

Co-authored-by: Carlos Villela <cvillela@nvidia.com>
Summary
- […] `nemoclaw inference set`/`nemohermes inference set` before patching in-sandbox config.
- […] `model.api_mode` for Anthropic Messages and OpenAI Responses routes, and clear stale `api_mode` when switching back to OpenAI-style chat completions.
- […] `/v1/chat/completions`.
- […] `/v1/messages` through `inference.local`, then exercise the agent path after the switch.
Why
#4809 reported a `403 connection not allowed by policy` while the agent was calling `https://inference.local`, so the right fix is not to open direct sandbox egress to the upstream inference host. #4402 fixed fresh Hermes onboarding by allowing managed `/v1/messages` and baking in `api_mode: anthropic_messages`. This PR covers the remaining runtime-switch path so both Hermes and OpenClaw keep using OpenShell-managed inference correctly after `inference set`.
References
Validation
- `npx vitest run src/lib/actions/inference-set.test.ts`
- `npx vitest run src/lib/actions/inference-set.test.ts src/lib/inference/config.test.ts test/generate-hermes-config.test.ts test/generate-openclaw-config.test.ts` (initial combined run hit two existing 5s per-test timeouts in `test/generate-openclaw-config.test.ts`; rerun below passed with a larger timeout)
- `npx vitest run test/generate-openclaw-config.test.ts --testTimeout 20000`
- `npx vitest run test/validate-e2e-coverage.test.ts`
- `shellcheck test/e2e/test-hermes-inference-switch.sh test/e2e/test-openclaw-inference-switch.sh test/e2e/lib/anthropic-switch-provider.sh test/e2e/lib/inference-switch-retry.sh`
- `bash -n test/e2e/test-hermes-inference-switch.sh test/e2e/test-openclaw-inference-switch.sh test/e2e/lib/anthropic-switch-provider.sh test/e2e/lib/inference-switch-retry.sh`
- `npx biome check src/lib/actions/inference-set.ts src/lib/actions/inference-set.test.ts`
- `npm run build:cli`
- `npm run validate:configs`
- `git diff --check`
- […] `e21952d57e8ef23caa266d6862e7367ec3bd3814`, including commit-lint and DCO.
- […] `27014755537` passed both new agent-path proofs:
  - `hermes-anthropic-inference-switch-e2e / run`
  - `openclaw-anthropic-inference-switch-e2e / run`
Final confidence pass
- […] `5c09efefe6e30e2fe3708dfa3b864d3cd3ece95a`: `ubuntu-repo-cloud-openclaw-double-provider-switch`, `ubuntu-repo-cloud-openclaw-double-same-provider` (run `27023280926`).
- […] `5c09efefe6e30e2fe3708dfa3b864d3cd3ece95a`: `hermes-inference-switch-e2e`, `openclaw-inference-switch-e2e` (run `27023413644`).
- `hermes-e2e` passed on head `5c09efefe6e30e2fe3708dfa3b864d3cd3ece95a`: run `27041002798`.
- […] `5c09efefe6e30e2fe3708dfa3b864d3cd3ece95a`: `bedrock-runtime-compatible-anthropic-e2e`, `inference-routing-e2e` (run `27023295295`).