feat(inference): support Hermes runtime model switches#3354
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (3)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughThis PR extends ChangesHermes Inference Set Implementation
🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
docs/inference/inference-options.md (1)
72-72: ⚡ Quick winSplit the two-sentence table description into one sentence per source line.
Line 72 contains two sentences in the same Markdown source line inside the table cell, which breaks the docs formatting rule.
As per coding guidelines, "One sentence per line in source (makes diffs readable). Flag paragraphs where multiple sentences appear on the same line."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/inference/inference-options.md` at line 72, The table cell for "Hermes Provider" contains two sentences on a single Markdown source line; edit the cell in the inference-options.md table so each sentence is on its own source line (i.e., break the cell content after "NemoClaw." so "Available only when onboarding Hermes Agent." and the curated models sentence are on separate lines) to satisfy the one-sentence-per-line docs rule.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/lib/sandbox-config.ts`:
- Around line 429-442: In buildRecomputeSandboxConfigHashScript, avoid
truncating HERMES_STRICT_HASH_FILE directly; instead compute checksums into a
temp file (e.g., HERMES_STRICT_HASH_FILE.tmp in the same dir), verify sha256sum
writes to that temp, set ownership and permissions on the temp, then atomically
mv the temp to HERMES_STRICT_HASH_FILE and only then cp or mv to the
compatibilityHash (also using a temp for the compatibility file if desired) so
that creation, chown, chmod and final replacement are all atomic and cannot
leave a partial/empty strict or compatibility hash; update the sequence that
currently uses `> ${HERMES_STRICT_HASH_FILE}` to use temp files and mv
operations while keeping the existing mkdir, chown and chmod steps.
---
Nitpick comments:
In `@docs/inference/inference-options.md`:
- Line 72: The table cell for "Hermes Provider" contains two sentences on a
single Markdown source line; edit the cell in the inference-options.md table so
each sentence is on its own source line (i.e., break the cell content after
"NemoClaw." so "Available only when onboarding Hermes Agent." and the curated
models sentence are on separate lines) to satisfy the one-sentence-per-line docs
rule.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 34e480bd-45af-434b-94d8-f859f4603e84
📒 Files selected for processing (13)
docs/get-started/quickstart-hermes.mddocs/inference/inference-options.mddocs/inference/switch-inference-providers.mddocs/reference/cli-selection-guide.mddocs/reference/commands.mdsrc/commands/inference/set.tssrc/lib/actions/inference-set.test.tssrc/lib/actions/inference-set.tssrc/lib/commands/inference/set.tssrc/lib/onboard.tssrc/lib/sandbox-config.tstest/cli.test.tstest/config-set.test.ts
There was a problem hiding this comment.
🧹 Nitpick comments (2)
test/e2e/test-openclaw-inference-switch.sh (1)
17-17: 💤 Low valueMissing
-eflag insetmay be intentional but warrants a note.The script uses
set -uo pipefailwithout-e(errexit). This appears intentional since the script relies on manualexit 1calls afterfail()for critical failures. However, any unhandled command failures between checks will not abort the script.If the design is to continue collecting all pass/fail results before exiting, the current approach is correct. Otherwise, consider adding
-eto fail fast on unexpected errors.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e/test-openclaw-inference-switch.sh` at line 17, The script's shell flags use "set -uo pipefail" without "-e", which can allow unexpected command failures to continue; either add "-e" (change to "set -euo pipefail") to enable fail-fast behavior, or explicitly document the intentional omission by adding a comment near the top explaining that fail() and manual exit handling are used to collect all results; reference the existing "set -uo pipefail" line and the fail() invocation so the reviewer can choose the correct fix.test/e2e/test-hermes-inference-switch.sh (1)
171-209: 💤 Low valueRegex-based YAML parsing is acceptable but fragile.
The Python code uses regex to parse
config.yamlrather than a YAML library. This works for the controlled test scenario but could break if the YAML structure changes (e.g., nested maps, multiline values, or different key ordering).For E2E tests where PyYAML availability is uncertain, this tradeoff is reasonable. Consider adding a brief comment noting this limitation.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e/test-hermes-inference-switch.sh` around lines 171 - 209, The inline Python probe in test/e2e/test-hermes-inference-switch.sh uses regex-based YAML parsing (see the Python block that builds the model dict and the regex matches) which is fragile for nested/multiline YAML; add a brief comment immediately above that Python heredoc explaining that this is a deliberate lightweight approach for the controlled E2E test (PyYAML may not be available), call out its limitations (won't handle nested maps or multiline values), and note that if the config parsing needs to be robust in future the code should switch to a proper YAML parser like PyYAML.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@test/e2e/test-hermes-inference-switch.sh`:
- Around line 171-209: The inline Python probe in
test/e2e/test-hermes-inference-switch.sh uses regex-based YAML parsing (see the
Python block that builds the model dict and the regex matches) which is fragile
for nested/multiline YAML; add a brief comment immediately above that Python
heredoc explaining that this is a deliberate lightweight approach for the
controlled E2E test (PyYAML may not be available), call out its limitations
(won't handle nested maps or multiline values), and note that if the config
parsing needs to be robust in future the code should switch to a proper YAML
parser like PyYAML.
In `@test/e2e/test-openclaw-inference-switch.sh`:
- Line 17: The script's shell flags use "set -uo pipefail" without "-e", which
can allow unexpected command failures to continue; either add "-e" (change to
"set -euo pipefail") to enable fail-fast behavior, or explicitly document the
intentional omission by adding a comment near the top explaining that fail() and
manual exit handling are used to collect all results; reference the existing
"set -uo pipefail" line and the fail() invocation so the reviewer can choose the
correct fix.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: e79de72b-7649-4413-a970-33a34b9548b7
📒 Files selected for processing (4)
.coderabbit.yaml.github/workflows/nightly-e2e.yamltest/e2e/test-hermes-inference-switch.shtest/e2e/test-openclaw-inference-switch.sh
…ence-set # Conflicts: # .github/workflows/nightly-e2e.yaml
Selective E2E Results — ❌ Some jobs failedRun: 25685065343
|
Selective E2E Results — ❌ Some jobs failedRun: 25686307945
|
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/e2e/test-hermes-inference-switch.sh`:
- Around line 495-505: The registry removal check should be skipped when the
sandbox is intentionally kept (NEMOCLAW_E2E_KEEP_SANDBOX=1); modify the
post-destroy verification so that if the environment variable
NEMOCLAW_E2E_KEEP_SANDBOX is set to "1" the script does not check registry_file
(the "${HOME}/.nemoclaw/sandboxes.json" lookup for "${SANDBOX_NAME}") and does
not call fail/pass for removal—wrap the registry_file existence and grep check
(and the subsequent fail "Sandbox ${SANDBOX_NAME} still in registry after
destroy" / pass "Sandbox ${SANDBOX_NAME} removed") in an if that only runs when
NEMOCLAW_E2E_KEEP_SANDBOX != "1".
- Around line 133-141: The current guard "if session:" allows empty dicts/lists
from deserializing onboard-session.json to pass; change the checks to explicitly
validate that session is a non-empty mapping (e.g., ensure session is a dict and
has keys) before skipping and add an explicit error when it's empty or not a
dict. In other words, replace or augment the "if session:" guard with a check
like "if not isinstance(session, dict) or not session: errors.append('session is
empty or invalid')" and then perform the existing field assertions against
session.get('sandboxName'), session.get('agent'), session.get('provider'), and
session.get('model') (using the variables name, provider, model) so an empty
onboard-session.json fails the test.
In `@test/e2e/test-openclaw-inference-switch.sh`:
- Around line 428-438: The registry existence check assumes the sandbox was
destroyed; skip this check when NEMOCLAW_E2E_KEEP_SANDBOX is set by guarding the
registry_file lookup and the fail/pass logic with the same condition used for
destroy (NEMOCLAW_E2E_KEEP_SANDBOX), i.e., if NEMOCLAW_E2E_KEEP_SANDBOX==1 then
report or skip verification for SANDBOX_NAME instead of running grep on
registry_file; update the block that references registry_file, SANDBOX_NAME,
fail and pass so it only verifies removal when the sandbox was actually
destroyed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: b335ab9c-36ca-49f8-8c27-9e6bb20d9a5b
📒 Files selected for processing (7)
.coderabbit.yaml.github/workflows/nightly-e2e.yamldocs/reference/commands.mdsrc/lib/onboard.tstest/cli.test.tstest/e2e/test-hermes-inference-switch.shtest/e2e/test-openclaw-inference-switch.sh
✅ Files skipped from review due to trivial changes (3)
- src/lib/onboard.ts
- docs/reference/commands.md
- .coderabbit.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- .github/workflows/nightly-e2e.yaml
Selective E2E Results — ✅ All requested jobs passedRun: 25686994377
|
Selective E2E Results — ✅ All requested jobs passedRun: 25687562452
|
Selective E2E Results — ✅ All requested jobs passedRun: 25688652356
|
## Summary Refreshes the release-prep docs for v0.0.39 based on changes merged since the Friday 4pm doc refresh. Updates the source docs, bumps the docs version metadata, and regenerates the NemoClaw user skills from the refreshed docs. ## Changes - #3314 -> `docs/get-started/prerequisites.md`, `docs/get-started/quickstart.md`, `docs/reference/troubleshooting.md`: Documents installer Docker setup, Docker group activation, and retry guidance. - #3317 -> `docs/get-started/quickstart.md`, `docs/reference/commands.md`: Documents the DGX Spark and DGX Station express install prompt and `NEMOCLAW_NO_EXPRESS`. - #3328 and #3329 -> `docs/security/best-practices.md`, `docs/deployment/sandbox-hardening.md`: Updates sandbox capability hardening docs for the stricter bounding-set and `setpriv` step-down behavior. - #3330, #3335, and #3346 -> `docs/inference/use-local-inference.md`: Documents Windows-host Ollama relaunch behavior, NIM key passthrough, early health-fail diagnostics, and mixed-GPU preflight detail. - #2406, #2883, #3001, #3244, #3267, #3318, #3320, and #3354 -> `docs/about/release-notes.md`: Adds the v0.0.39 release-prep section while keeping the v0.0.38 release notes intact. - Advances the release-prep docs metadata from v0.0.38 to v0.0.39. - Regenerates `.agents/skills/nemoclaw-user-*` from the updated source docs. ## Type of Change - [ ] Code change (feature, bug fix, or refactor) - [ ] Code change with doc updates - [ ] Doc only (prose changes, no code sample modifications) - [x] Doc only (includes code sample changes) ## Verification - [x] `npx prek run --all-files` passes - [ ] `npm test` passes - [ ] Tests added or updated for new or changed behavior - [x] No secrets, API keys, or credentials committed - [x] Docs updated for user-facing behavior changes - [x] `make docs` builds without warnings (doc changes only) - [x] Doc pages follow the [style guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md) (doc changes only) - [ ] New doc pages include SPDX header and frontmatter (new pages only) --- Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes v0.0.39 * **New Features** * Host alias management commands for easier configuration * Sandbox GPU control options during onboarding * Update command with check and confirmation modes * **Documentation** * Enhanced Linux installer guidance with Docker and group membership handling * Expanded troubleshooting for permission and connectivity issues * Improved capability-dropping security documentation * Updated inference model switching commands * Brev environment-specific troubleshooting * **Improvements** * DGX Spark/Station express install flow * Windows Ollama relay and health-check enhancements * NVIDIA NIM preflight GPU reporting [](https://app.coderabbit.ai/change-stack/NVIDIA/NemoClaw/pull/3375) <!-- end of auto-generated comment: release notes by coderabbit.ai -->
Summary
nemoclaw inference setto dispatch by sandbox agent and support Hermes sandboxes/sandbox/.hermes/config.yamlmodel fields, and recompute Hermes hash files without restarting the gatewaynemohermeshelp/docs sohermes-providerand the no-rebuild model switch path are documentedValidation
npm run build:clinpm run typecheck:clinpm run checksshellcheck test/e2e/test-openclaw-inference-switch.sh test/e2e/test-hermes-inference-switch.shbash -n test/e2e/test-openclaw-inference-switch.sh test/e2e/test-hermes-inference-switch.sh && git diff --checknpx vitest run test/validate-e2e-coverage.test.ts --testTimeout 60000npx vitest run src/lib/actions/inference-set.test.ts test/config-set.test.ts test/validate-e2e-coverage.test.ts --testTimeout 60000npx vitest run test/cli.test.ts -t "nemohermes inference set" --testTimeout 60000npx vitest run src/lib/actions/inference-set.test.ts test/config-set.test.ts test/hermes-provider-foundation.test.ts test/generate-hermes-config.test.ts test/no-direct-credential-env.test.ts test/check-env-var-docs.test.ts src/lib/inference/model-prompts.test.ts --testTimeout 60000Notes
Summary by CodeRabbit
New Features
Documentation
Tests / CI