fix(onboard): offer running vLLM in provider menu#3417
Conversation
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThis PR refines vLLM inference option availability: onboarding now always shows a detected running local vLLM, while managed vLLM install/start remains gated by ChangesvLLM Detection and Experimental Flag Refinement
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
E2E Advisor RecommendationRequired E2E: None Dispatch hint: Full advisor summaryPi Semantic E2E AdvisorBase: Required E2E
Optional E2E
New E2E recommendations
Dispatch hint
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
docs/inference/use-local-inference.md (1)
238-245: ⚡ Quick winFormat CLI command mentions as inline code in prose.
Line 238 and Line 245 reference the onboarding command in plain text (
onboard/rerun onboard). Please format command references as inline code for consistency.Suggested edit
-For an already-running vLLM server, run onboard and select **Local vLLM [experimental]** from the provider list. +For an already-running vLLM server, run `nemoclaw onboard` and select **Local vLLM [experimental]** from the provider list. ... -If vLLM is not running and your host matches a managed profile, set `NEMOCLAW_EXPERIMENTAL=1`, rerun onboard, and select the **Install vLLM** or **Start vLLM** entry. +If vLLM is not running and your host matches a managed profile, set `NEMOCLAW_EXPERIMENTAL=1`, rerun `nemoclaw onboard`, and select the **Install vLLM** or **Start vLLM** entry.As per coding guidelines: "CLI commands, file paths, flags, parameter names, and values must use inline
codeformatting."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/inference/use-local-inference.md` around lines 238 - 245, The prose references CLI commands and an env var without inline code formatting; update the sentences that mention the onboarding command and the rerun instruction so `nemoclaw onboard` and `rerun onboard` are formatted as inline code, and format the environment variable `NEMOCLAW_EXPERIMENTAL=1` as inline code as well; locate the occurrences around the vLLM onboarding text (mentions of "Local vLLM [experimental]", "nemoclaw onboard", "rerun onboard", and "NEMOCLAW_EXPERIMENTAL=1") and replace the plain text command/env references with inline code snippets to match the project's CLI/flag formatting guidelines.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@docs/inference/use-local-inference.md`:
- Around line 238-245: The prose references CLI commands and an env var without
inline code formatting; update the sentences that mention the onboarding command
and the rerun instruction so `nemoclaw onboard` and `rerun onboard` are
formatted as inline code, and format the environment variable
`NEMOCLAW_EXPERIMENTAL=1` as inline code as well; locate the occurrences around
the vLLM onboarding text (mentions of "Local vLLM [experimental]", "nemoclaw
onboard", "rerun onboard", and "NEMOCLAW_EXPERIMENTAL=1") and replace the plain
text command/env references with inline code snippets to match the project's
CLI/flag formatting guidelines.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 5f1a8f21-1fdb-4c9c-8690-aff17ea4c3b6
📒 Files selected for processing (4)
docs/inference/inference-options.mddocs/inference/use-local-inference.mdsrc/lib/onboard.tstest/onboard-selection.test.ts
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
test/onboard-selection.test.ts (1)
236-242: ⚡ Quick winAvoid hard-coding provider menu index in this new vLLM-running test.
Using
"7"makes this test fragile if provider ordering changes. You’re already asserting menu output with a dynamic^\s*\d+\)regex, so the selection should be dynamic too.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/onboard-selection.test.ts` around lines 236 - 242, The test hard-codes the provider selection as "7" via answers and credentials.prompt, which is fragile; instead read the provider menu text already captured in messages (from credentials.prompt), find the line matching /^\s*\d+\)/ to extract the shown index for the desired provider, and push that extracted index (as a string) into answers so the prompt selection is derived from the test output; update references to answers/credentials.prompt in the test (e.g., the existing answers array and credentials.prompt mock) to use the dynamically extracted index before returning from the mock.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/onboard-selection.test.ts`:
- Around line 314-339: The test script must explicitly stub the interactive
credential helpers so the non-interactive vLLM test fails fast if a prompt is
attempted: require the credentials module used by setupNim (e.g. const
credentials = require(${credentialsPath});) and set credentials.prompt = () => {
throw new Error("Unexpected prompt in non-interactive test"); } and
credentials.ensureApiKey = async () => { throw new Error("Unexpected
ensureApiKey call in non-interactive test"); }; this ensures any call to
credentials.prompt or credentials.ensureApiKey during setupNim will immediately
throw instead of hanging.
---
Nitpick comments:
In `@test/onboard-selection.test.ts`:
- Around line 236-242: The test hard-codes the provider selection as "7" via
answers and credentials.prompt, which is fragile; instead read the provider menu
text already captured in messages (from credentials.prompt), find the line
matching /^\s*\d+\)/ to extract the shown index for the desired provider, and
push that extracted index (as a string) into answers so the prompt selection is
derived from the test output; update references to answers/credentials.prompt in
the test (e.g., the existing answers array and credentials.prompt mock) to use
the dynamically extracted index before returning from the mock.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 1c72d588-b783-43c6-b8fb-a174418ba551
📒 Files selected for processing (2)
src/lib/onboard.tstest/onboard-selection.test.ts
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
## Summary - Add v0.0.40 release notes and update docs version metadata. - Document release-prep behavior changes around onboarding, local inference, policy preset filtering, and config recovery. - Refresh generated `nemoclaw-user-*` skills from the source docs. ## Source summary - #3383 -> `docs/about/release-notes.md`, `docs/reference/commands.md`, `docs/manage-sandboxes/lifecycle.md`: Reflect macOS Docker-driver OpenShell gateway onboarding and upgrade behavior. - #3378 -> `docs/about/release-notes.md`: Capture the Docker-driver gateway TCP readiness fix and clearer startup failures. - #3338 -> `docs/about/release-notes.md`, `docs/inference/use-local-inference.md`: Reflect the Ollama auth proxy token requirement on native API routes. - #3420 -> `docs/about/release-notes.md`, `docs/get-started/prerequisites.md`, `docs/inference/use-local-inference.md`: Document the Linux Ollama `zstd` preflight and sudo messaging. - #3417 -> `docs/about/release-notes.md`, `docs/inference/inference-options.md`, `docs/inference/use-local-inference.md`: Reflect detected running vLLM provider selection. - #3223 -> `docs/about/release-notes.md`, `docs/reference/commands.md`, `docs/reference/network-policies.md`, `docs/get-started/quickstart.md`: Document agent-aware policy preset filtering. - #3385 -> `docs/about/release-notes.md`: Capture the dashboard forward TCP reachability check. - #3160 -> `docs/about/release-notes.md`, `docs/reference/troubleshooting.md`: Document empty `openclaw.json` baseline recovery. - #3367 -> `docs/about/release-notes.md`: Capture OpenClaw plugin compatibility metadata. ## Test plan - [x] `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user` - [x] `make docs` - [x] `git diff --check` - [x] Skip-term scan for `docs/.docs-skip` terms <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit # Release Notes v0.0.40 * **New Features** * Sandbox configuration recovery when inference changes cause data loss * Policy presets now intelligently filter based on agent capabilities * Enhanced gateway health checks and upgrade reliability * **Documentation** * Improved local inference setup instructions with clearer dependency requirements * Clarified vLLM experimental feature availability and prerequisites * Reorganized architecture documentation for enhanced clarity * Refined security and hardening guidance [](https://app.coderabbit.ai/change-stack/NVIDIA/NemoClaw/pull/3427) <!-- end of auto-generated comment: release notes by coderabbit.ai -->
Summary
Fixes the onboarding UX when vLLM is already running locally. The provider menu now offers detected local vLLM directly instead of telling users to restart onboarding with extra environment variables; managed vLLM install/start remains behind the experimental opt-in because it pulls images and starts containers.
Changes
Local vLLM [experimental] (localhost:8000) — running (suggested)whenever the local/v1/modelsprobe succeeds.Install vLLM/Start vLLMgated behindNEMOCLAW_EXPERIMENTAL=1or an explicit provider env override.Type of Change
Verification
npx prek run --all-filespassesnpm testpassesmake docsbuilds without warnings (doc changes only)Additional validation run:
npm run build:clinpx vitest run test/onboard-selection.test.ts --testNamePattern "running vLLM|forces openai-completions for vLLM"npx vitest run test/onboard-selection.test.ts --testTimeout 60000npm run typecheck:clinpm run docs:strictnpm run source-shape:checkgit diff --checkSigned-off-by: Aaron Erickson aerickson@nvidia.com
Summary by CodeRabbit