Skip to content

fix(installer): move Model Router setup out of install#3199

Merged
cv merged 10 commits into
mainfrom
fix/remove-router-installer-pip
May 8, 2026
Merged

fix(installer): move Model Router setup out of install#3199
cv merged 10 commits into
mainfrom
fix/remove-router-installer-pip

Conversation

@ericksoa

@ericksoa ericksoa commented May 7, 2026

Copy link
Copy Markdown
Contributor

Summary

  • remove Model Router submodule initialization and Python dependency setup from scripts/install.sh
  • keep NEMOCLAW_PROVIDER=routed accepted in the installer/provider env path
  • make routed onboarding show as Model Router (experimental) in the menu and docs
  • make routed onboarding prefer an executable NemoClaw-managed model-router venv command, then fall back to PATH, then prepare one in ~/.nemoclaw/model-router-venv
  • use one routed credential default, NVIDIA_API_KEY, across onboarding selection, router startup, and provider registration
  • start the router proxy from that resolved command path, without pip3 --user, global Python mutation, or installer-side shims

Validation

  • npm run build:cli
  • npx vitest run test/install-preflight.test.ts src/lib/domain/installer/provider.test.ts
  • npx vitest run test/onboard.test.ts --testNamePattern "Model Router"
  • npm run check:credential-env

Summary by CodeRabbit

  • Chores

    • Installer no longer performs an inline Python model-router installation; router bootstrap is deferred to host-side setup.
  • New Features

    • CLI help now lists the "routed" provider.
    • Onboard flow can provision a host-side virtual environment and manage/install/run the Model Router when applicable.
  • Documentation

    • Onboarding and quickstart label Model Router as "Experimental".
  • Tests

    • Installation and onboarding tests expanded to cover routed-provider and managed-router scenarios.

@coderabbitai

coderabbitai Bot commented May 7, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a4518812-f3e6-4d27-9b46-d95522ee124b

📥 Commits

Reviewing files that changed from the base of the PR and between 74d7ccd and b60c3df.

📒 Files selected for processing (1)
  • test/onboard.test.ts

📝 Walkthrough

Walkthrough

The installer no longer installs the Model Router Python package via pip3. The pip3 installation logic and invocations were removed from both source-checkout and GitHub-clone installer paths; provider help text and tests were updated, and onboard gained host-side router venv/setup utilities that ensure a resolved router executable is used when starting the proxy.

Changes

LLM Router Installation Removal

Layer / File(s) Summary
Installer Provider Values & Help
src/lib/domain/installer/provider.ts, src/lib/domain/installer/provider.test.ts
Adds "routed" to installer provider values and updates tests/help text to include routed.
Installer Script: Usage & Removal
scripts/install.sh
Removes pip3-driven llm-router bootstrap from source-checkout and GitHub-clone flows; usage() provider list now includes routed.
Onboard Host Utilities & Startup
src/lib/onboard.ts
Adds constants and host-side helpers to locate/init the router submodule, create a dedicated venv and pip-install router extras, and resolves/uses the model-router executable when generating proxy config and spawning the proxy.
Tests – install-preflight
test/install-preflight.test.ts
Adds python3/pip3 stub executables logging to PYTHON_LOG_PATH, creates fake nemoclaw-blueprint/router/llm-router/pyproject.toml, wires PYTHON_LOG_PATH into the installer run, and asserts no pythonLog file is created.
Tests – onboard routed setup
test/onboard.test.ts
Adds integration-style tests that simulate missing model-router command and verify venv creation and pip install of router, preference for managed router, and refresh on stale fingerprint; ensures routed provider creation and inference configuration commands run.
Docs / README
README.md, docs/get-started/quickstart.md
Relabels "Model Router" wording to "Model Router (experimental)" in onboarding/documentation text.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

I'm a rabbit who hops past pip tonight,
No inline installs beneath the moonlight.
Tests wait for logs that never appear,
A quiet venv hums the router near.
Hop on the routed path — joy is here! 🐇✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 7.14% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: moving Model Router setup from the install script to the onboarding phase, which is the primary architectural shift across the changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/remove-router-installer-pip

Comment @coderabbitai help to get the list of available commands and usage tips.

@ericksoa ericksoa changed the title Remove model router pip install from installer Move Model Router setup out of the installer May 7, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/lib/onboard.ts (1)

6889-6900: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use one default credential env for routed flow.

This branch falls back to OPENAI_API_KEY, but the later routed startup paths fall back to NVIDIA_API_KEY. If blueprint.yaml omits credential_env, onboarding can save/prompt one env var here and then fail when reconcileModelRouter() looks up the other one.

Suggested fix
+const DEFAULT_MODEL_ROUTER_CREDENTIAL_ENV = "NVIDIA_API_KEY";
+
 ...
-        const routerCredentialEnv = bp.router?.credential_env || bp.credential_env || "OPENAI_API_KEY";
+        const routerCredentialEnv =
+          bp.router?.credential_env ||
+          bp.credential_env ||
+          DEFAULT_MODEL_ROUTER_CREDENTIAL_ENV;
// Apply the same constant in the other routed branches too:
const routerCredentialEnv =
  bp.router.credential_env || bp.credential_env || DEFAULT_MODEL_ROUTER_CREDENTIAL_ENV;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard.ts` around lines 6889 - 6900, The routed credential fallback
should use a single shared default constant instead of hardcoding
"OPENAI_API_KEY"; replace the literal fallback in the onboard initialization
(where routerCredentialEnv is computed) to use
DEFAULT_MODEL_ROUTER_CREDENTIAL_ENV so the same default is used by
hydrateCredentialEnv/saveCredential/resolveProviderCredential and matches the
other routed startup paths (e.g., reconcileModelRouter); update the computation
of routerCredentialEnv in the block that sets credentialEnv, routedCredential,
and the provider hint handling to reference DEFAULT_MODEL_ROUTER_CREDENTIAL_ENV.
🧹 Nitpick comments (1)
src/lib/onboard.ts (1)

829-1074: Run the onboarding E2E matrix on this path before merge.

This PR moved router provisioning into src/lib/onboard.ts, so the current unit/preflight coverage still leaves the full routed sandbox flow, resume path, and host-side router startup largely unexercised.

As per coding guidelines, src/lib/onboard.ts "contains core onboarding logic" and changes here "affect the full sandbox creation and configuration flow."

Also applies to: 6882-7206

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard.ts` around lines 829 - 1074, The onboarding router
provisioning code was moved into src/lib/onboard.ts and needs full E2E/preflight
exercise before merge; run the onboarding E2E matrix exercising the routed
sandbox flow, resume path, and host-side router startup to validate
startModelRouter, installModelRouterCommand, ensureModelRouterCommand,
initializeModelRouterSubmodule, isRouterHealthy and stopModelRouterProcess paths
(including submodule initialization, venv creation, pip install, proxy-config
generation, detached spawn and health polling) and fix any failures found
(missing env handling, permission/path issues, timeouts, or crash diagnostics)
before landing.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 955-957: ensureModelRouterCommand currently calls
resolveHostCommandPath("model-router") first which lets a PATH binary shadow the
repo-managed virtualenv and forces repeated installs; change it to prefer the
managed router binary (the repo-managed ~/.nemoclaw/.../bin/model-router path
used by installModelRouterCommand) by checking that managed path exists and is
executable first, then fall back to resolveHostCommandPath("model-router"), and
only call installModelRouterCommand() if neither the managed path nor a PATH
binary is found; use the same helper functions (ensureModelRouterCommand,
installModelRouterCommand, resolveHostCommandPath) and perform a file-exists +
executable permission check on the managed binary before returning it.

---

Outside diff comments:
In `@src/lib/onboard.ts`:
- Around line 6889-6900: The routed credential fallback should use a single
shared default constant instead of hardcoding "OPENAI_API_KEY"; replace the
literal fallback in the onboard initialization (where routerCredentialEnv is
computed) to use DEFAULT_MODEL_ROUTER_CREDENTIAL_ENV so the same default is used
by hydrateCredentialEnv/saveCredential/resolveProviderCredential and matches the
other routed startup paths (e.g., reconcileModelRouter); update the computation
of routerCredentialEnv in the block that sets credentialEnv, routedCredential,
and the provider hint handling to reference DEFAULT_MODEL_ROUTER_CREDENTIAL_ENV.

---

Nitpick comments:
In `@src/lib/onboard.ts`:
- Around line 829-1074: The onboarding router provisioning code was moved into
src/lib/onboard.ts and needs full E2E/preflight exercise before merge; run the
onboarding E2E matrix exercising the routed sandbox flow, resume path, and
host-side router startup to validate startModelRouter,
installModelRouterCommand, ensureModelRouterCommand,
initializeModelRouterSubmodule, isRouterHealthy and stopModelRouterProcess paths
(including submodule initialization, venv creation, pip install, proxy-config
generation, detached spawn and health polling) and fix any failures found
(missing env handling, permission/path issues, timeouts, or crash diagnostics)
before landing.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e710ad46-5866-4e04-9045-456f95336d7b

📥 Commits

Reviewing files that changed from the base of the PR and between a0a57a4 and c230fac.

📒 Files selected for processing (6)
  • scripts/install.sh
  • src/lib/domain/installer/provider.test.ts
  • src/lib/domain/installer/provider.ts
  • src/lib/onboard.ts
  • test/install-preflight.test.ts
  • test/onboard.test.ts

Comment thread src/lib/onboard.ts
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@ericksoa ericksoa added bug Something fails against expected or documented behavior fix labels May 7, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
src/lib/onboard.ts (1)

925-957: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Prefer the managed router binary before consulting PATH.

Line 955 still lets any PATH model-router shadow ~/.nemoclaw/model-router-venv/bin/model-router, and it never reuses that managed binary when PATH is empty. The result is repeated routed onboard runs redoing the venv/pip bootstrap unnecessarily, and an unrelated system install can win over the NemoClaw-owned environment.

Suggested fix
 function ensureModelRouterCommand(): string {
-  return resolveHostCommandPath("model-router") || installModelRouterCommand();
+  const venvDir = process.env.NEMOCLAW_MODEL_ROUTER_VENV || MODEL_ROUTER_VENV_DIR;
+  const managedCommand = path.join(venvDir, "bin", "model-router");
+  try {
+    fs.accessSync(managedCommand, fs.constants.X_OK);
+    return managedCommand;
+  } catch {
+    // fall through
+  }
+  return resolveHostCommandPath("model-router") || installModelRouterCommand();
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard.ts` around lines 925 - 957, ensureModelRouterCommand
currently prefers a PATH-installed "model-router" and only installs the managed
venv when PATH lacks it; change it to prefer the managed, NemoClaw-owned binary
first: compute the managed command path (the same logic used in
installModelRouterCommand: venvDir, routerCommand) and if
fs.existsSync(routerCommand) return that; otherwise fall back to
resolveHostCommandPath("model-router") and if that returns a path return it;
finally call installModelRouterCommand() only if neither exists. Update
ensureModelRouterCommand to reference venvDir/routerCommand,
resolveHostCommandPath, and installModelRouterCommand accordingly so the managed
venv is reused and not shadowed by PATH.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@src/lib/onboard.ts`:
- Around line 925-957: ensureModelRouterCommand currently prefers a
PATH-installed "model-router" and only installs the managed venv when PATH lacks
it; change it to prefer the managed, NemoClaw-owned binary first: compute the
managed command path (the same logic used in installModelRouterCommand: venvDir,
routerCommand) and if fs.existsSync(routerCommand) return that; otherwise fall
back to resolveHostCommandPath("model-router") and if that returns a path return
it; finally call installModelRouterCommand() only if neither exists. Update
ensureModelRouterCommand to reference venvDir/routerCommand,
resolveHostCommandPath, and installModelRouterCommand accordingly so the managed
venv is reused and not shadowed by PATH.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c3c018ee-7ce6-4e31-aec4-70b0ba65c701

📥 Commits

Reviewing files that changed from the base of the PR and between c230fac and 49bbd4b.

📒 Files selected for processing (3)
  • README.md
  • docs/get-started/quickstart.md
  • src/lib/onboard.ts
✅ Files skipped from review due to trivial changes (1)
  • docs/get-started/quickstart.md

@ericksoa ericksoa changed the title Move Model Router setup out of the installer fix(installer): move Model Router setup out of install May 7, 2026
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@ericksoa

ericksoa commented May 7, 2026

Copy link
Copy Markdown
Contributor Author

Addressed the remaining CodeRabbit feedback in 6d655a34d.

  • Routed credential default: added DEFAULT_MODEL_ROUTER_CREDENTIAL_ENV = "NVIDIA_API_KEY" and used it consistently across routed onboarding selection, router reconciliation/startup, and routed provider registration. This removes the OPENAI_API_KEY fallback mismatch when the blueprint omits credential_env.
  • Managed router command: ensureModelRouterCommand() now prefers an executable NemoClaw-managed venv command before PATH, then falls back to install only if neither exists. Added regression coverage for managed-command reuse when a broken PATH model-router is present.
  • E2E/validation nit: before this follow-up, the live PR checks had passed across checks, macos-e2e, wsl-e2e, build-sandbox-images, build-sandbox-images-arm64, test-e2e-port-overrides, test-e2e-sandbox, and test-e2e-gateway-isolation. For this follow-up commit, local validation passed:
    • npm run build:cli
    • npx vitest run test/onboard.test.ts --testNamePattern "Model Router"
    • npx vitest run test/install-preflight.test.ts src/lib/domain/installer/provider.test.ts
    • npm run check:credential-env

The docstring coverage item is a CodeRabbit generated warning, not a repo gate; this file does not use docstrings for private onboarding helpers.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/onboard.test.ts`:
- Around line 2584-2773: The test creates a tmpDir and writes executables but
never removes it; wrap the test body for the "prepares Model Router dependencies
during routed inference setup when command is absent" it block (the function
containing tmpDir, fakeBin, scriptPath, fakeRouterSource, setupLog, venvDir,
etc.) in a try/finally and in the finally call fs.rmSync(tmpDir, { recursive:
true, force: true }) to clean up; do the same for the other newly added routed
Model Router test (the other it block in this diff range) so both tests remove
their tmpDir even on failure.
- Around line 2699-2707: The test runner currently lets git calls pass through
to originalRun (runner.run -> originalRun) when cmd.startsWith("python3 -m
venv") || cmd.includes("/bin/python -m pip") || cmd.includes("git -C"), which
allows real git to execute; change runner.run so any git invocation (e.g.,
cmd.includes("git -C") or cmd.startsWith("git ")) is handled deterministically
in the harness instead of calling originalRun — either stub the git call by
pushing the command into commands and returning a deterministic {status, stdout,
stderr} or explicitly throw/return a failure for unexpected git invocations;
update the condition around originalRun and the branch that checks for "provider
get" to ensure git never reaches originalRun.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 0bc63ac3-7b5c-444f-b0f4-4f598d87e36a

📥 Commits

Reviewing files that changed from the base of the PR and between 49bbd4b and 6d655a3.

📒 Files selected for processing (2)
  • src/lib/onboard.ts
  • test/onboard.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/lib/onboard.ts

Comment thread test/onboard.test.ts Outdated
Comment thread test/onboard.test.ts
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Comment thread test/onboard.test.ts
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
src/lib/onboard.ts (1)

829-1287: Run the onboarding E2E slices on this path before merge.

This moves host-side Model Router bootstrap into src/lib/onboard.ts, so the unit coverage here is helpful but not enough to exercise the full gateway/sandbox/rebuild flow that now owns router startup.

As per coding guidelines src/lib/onboard.ts: "This file contains core onboarding logic. Changes here affect the full sandbox creation and configuration flow."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard.ts` around lines 829 - 1287, The change moved host-side Model
Router bootstrap into onboard.ts but the full gateway/sandbox/rebuild flow
(Model Router startup) hasn't been exercised by unit tests; run the onboarding
end-to-end slices that exercise reconcileModelRouter, startModelRouter,
ensureModelRouterCommand, installModelRouterCommand and isRouterHealthy to
validate venv creation, fingerprinting, proxy-config/proxy startup, credential
wiring, and stop/restart behavior before merging, and add or update E2E tests
that cover those flows if any slice fails or is missing.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 1089-1108: ensureModelRouterCommand currently reinstalls the
router (via installModelRouterCommand) before checking for existing executables
or PATH binaries, causing failures when python3 is missing even though a host or
managed executable exists; modify ensureModelRouterCommand so that after
computing routerDir, venvDir and managedCommand it checks
isExecutableFile(managedCommand) and resolveHostCommandPath("model-router")
before calling installModelRouterCommand, and only call
initializeModelRouterSubmodule() if !isModelRouterPackageReady(routerDir) while
preserving the existing checks isModelRouterPackageReady(routerDir) and
isManagedModelRouterCurrent(routerDir, venvDir) to return managedCommand early —
use the existing helpers modelRouterPackageDir, modelRouterVenvDir,
modelRouterCommandPath, isModelRouterPackageReady, isManagedModelRouterCurrent,
initializeModelRouterSubmodule, installModelRouterCommand, isExecutableFile, and
resolveHostCommandPath to implement the new resolution order.

In `@test/onboard.test.ts`:
- Around line 2725-2736: The test harness' runner.run stub currently treats
unknown commands as success which masks regressions like a stray "pip3"
invocation; update the runner.run implementation(s) (the override around
runner.run in test/onboard.test.ts and the duplicate at the later block) to
detect "pip3" invocations (e.g., commands starting with or containing "pip3 " or
"/bin/pip3") and fail deterministically instead of returning a fake success —
either throw an Error or return a nonzero status for "pip3" so the test fails,
while keeping the existing special-cases for "python3 -m venv" and "/bin/python
-m pip".

---

Nitpick comments:
In `@src/lib/onboard.ts`:
- Around line 829-1287: The change moved host-side Model Router bootstrap into
onboard.ts but the full gateway/sandbox/rebuild flow (Model Router startup)
hasn't been exercised by unit tests; run the onboarding end-to-end slices that
exercise reconcileModelRouter, startModelRouter, ensureModelRouterCommand,
installModelRouterCommand and isRouterHealthy to validate venv creation,
fingerprinting, proxy-config/proxy startup, credential wiring, and stop/restart
behavior before merging, and add or update E2E tests that cover those flows if
any slice fails or is missing.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: de366cf9-3556-4e68-92b2-f3dc6ca1adbe

📥 Commits

Reviewing files that changed from the base of the PR and between 41bb6da and 17af345.

📒 Files selected for processing (2)
  • src/lib/onboard.ts
  • test/onboard.test.ts

Comment thread src/lib/onboard.ts
Comment thread test/onboard.test.ts
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
test/onboard.test.ts (1)

2731-2741: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail unexpected pip3 invocations in these router harnesses.

Both stubs still return fake success for any unhandled command. If setupInference() regresses to pip3 ..., the fake pip3 binary never runs, router-setup.log stays clean, and these tests can still pass.

Suggested minimal guard
 runner.run = (command, opts = {}) => {
   const cmd = _n(command);
   if (cmd.startsWith("python3 -m venv") || cmd.includes("/bin/python -m pip")) {
     return originalRun(command, opts);
   }
+  if (/(^|[\/\s])pip3(?:\s|$)/.test(cmd)) {
+    throw new Error("unexpected pip3 invocation in test harness: " + cmd);
+  }
   if (cmd.includes("git -C") || /^git(?:\s|$)/.test(cmd)) {
     throw new Error("unexpected git invocation in test harness: " + cmd);
   }
   commands.push({ command: cmd, env: opts.env || null });
   if (cmd.includes("provider get")) return { status: 1, stdout: "", stderr: "" };
   return { status: 0, stdout: "", stderr: "" };
 };

Also applies to: 3115-3125

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/onboard.test.ts` around lines 2731 - 2741, Tests currently treat
unhandled commands as success, so add a guard in runner.run to fail fast on
unexpected pip invocations: inside the runner.run implementation (the function
that wraps originalRun and pushes into commands, using _n and originalRun),
detect commands invoking pip3 (e.g. cmd.includes("pip3") or
/^pip3(?:\s|$)/.test(cmd) or cmd.includes("/bin/pip3")) and throw an Error like
the existing git check; mirror the same check in the other similar harness block
referenced (lines ~3115-3125) so any unexpected pip3 invocation fails the test
instead of silently succeeding.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@test/onboard.test.ts`:
- Around line 2731-2741: Tests currently treat unhandled commands as success, so
add a guard in runner.run to fail fast on unexpected pip invocations: inside the
runner.run implementation (the function that wraps originalRun and pushes into
commands, using _n and originalRun), detect commands invoking pip3 (e.g.
cmd.includes("pip3") or /^pip3(?:\s|$)/.test(cmd) or cmd.includes("/bin/pip3"))
and throw an Error like the existing git check; mirror the same check in the
other similar harness block referenced (lines ~3115-3125) so any unexpected pip3
invocation fails the test instead of silently succeeding.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 66bbdf91-8690-4e9d-a352-8069190391c6

📥 Commits

Reviewing files that changed from the base of the PR and between 17af345 and 20cbc78.

📒 Files selected for processing (1)
  • test/onboard.test.ts

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/lib/onboard.ts (1)

7367-7375: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Normalize the routed credential env from the blueprint before provider upsert.

This branch now falls back to credentialEnv || DEFAULT_MODEL_ROUTER_CREDENTIAL_ENV, but reconcileModelRouter() already resolves routed credentials via bp.router.credential_env || bp.credential_env || DEFAULT_MODEL_ROUTER_CREDENTIAL_ENV. On resume, an older session can still carry null or OPENAI_API_KEY here, so the gateway provider gets registered against the wrong env var even though the router process was reconciled with the blueprint-resolved one.

Suggested fix
-    const resolvedCredentialEnv = credentialEnv || DEFAULT_MODEL_ROUTER_CREDENTIAL_ENV;
+    const routedProfile = getRoutedProfile();
+    const resolvedCredentialEnv =
+      credentialEnv ||
+      routedProfile.router.credential_env ||
+      routedProfile.credential_env ||
+      DEFAULT_MODEL_ROUTER_CREDENTIAL_ENV;
     const credentialValue = hydrateCredentialEnv(resolvedCredentialEnv);
     const env = credentialValue ? { [resolvedCredentialEnv]: credentialValue } : {};
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard.ts` around lines 7367 - 7375, The code uses credentialEnv ||
DEFAULT_MODEL_ROUTER_CREDENTIAL_ENV when calling upsertProvider which can
register the provider against an out-of-date env var; instead obtain the
normalized credential env resolved by reconcileModelRouter (the
bp.router.credential_env || bp.credential_env ||
DEFAULT_MODEL_ROUTER_CREDENTIAL_ENV logic) and use that normalized value when
calling hydrateCredentialEnv and passing the env into upsertProvider; update the
call sites (referencing reconcileModelRouter, resolvedCredentialEnv,
credentialEnv, hydrateCredentialEnv, upsertProvider, and
DEFAULT_MODEL_ROUTER_CREDENTIAL_ENV) so the provider registration always uses
the blueprint-reconciled credential_env rather than a stale session value.
♻️ Duplicate comments (2)
src/lib/onboard.ts (1)

1102-1108: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Defer the reinstall until after existing router executables are checked.

Lines 1102-1104 still return installModelRouterCommand(routerDir) before Lines 1107-1108 ever consult the managed executable or PATH. That reintroduces the exact failure mode this PR is trying to remove: routed onboarding now requires python3 again whenever the source checkout is present but the managed venv is stale/missing, even if model-router is already usable on the host.

Suggested fix
 function ensureModelRouterCommand(): string {
   const routerDir = modelRouterPackageDir();
   const venvDir = modelRouterVenvDir();
   const managedCommand = modelRouterCommandPath(venvDir);
+  const pathCommand = resolveHostCommandPath("model-router");
 
   if (isModelRouterPackageReady(routerDir) && isManagedModelRouterCurrent(routerDir, venvDir)) {
     return managedCommand;
   }
 
   if (!isModelRouterPackageReady(routerDir)) {
     initializeModelRouterSubmodule(routerDir);
   }
 
-  if (isModelRouterPackageReady(routerDir)) {
-    if (isManagedModelRouterCurrent(routerDir, venvDir)) return managedCommand;
-    return installModelRouterCommand(routerDir);
-  }
-
   if (isExecutableFile(managedCommand)) return managedCommand;
-  return resolveHostCommandPath("model-router") || installModelRouterCommand();
+  if (pathCommand) return pathCommand;
+  if (isModelRouterPackageReady(routerDir)) {
+    return installModelRouterCommand(routerDir);
+  }
+  return installModelRouterCommand();
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard.ts` around lines 1102 - 1108, The current logic returns
installModelRouterCommand(routerDir) as soon as
isModelRouterPackageReady(routerDir) is true but
isManagedModelRouterCurrent(...) is false, which forces a reinstall even if a
usable router binary exists; change the control flow in the block that checks
isModelRouterPackageReady to defer calling installModelRouterCommand(routerDir)
— instead, if isModelRouterPackageReady(routerDir) is true and
isManagedModelRouterCurrent(routerDir, venvDir) is false, first check if
isExecutableFile(managedCommand) or resolveHostCommandPath("model-router")
yields a usable command and return that, and only call
installModelRouterCommand(routerDir) as the last fallback.
test/onboard.test.ts (1)

2731-2741: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail fast on unexpected pip3 calls in both routed harnesses.

Both stubs still treat an unrecognized pip3 ... invocation as success, so a regression back to global pip usage can slip through while the later assert.doesNotMatch(log, /pip3 /) still passes. Add an explicit pip3 guard before the success fallthrough in both blocks.

Suggested minimal fix
 runner.run = (command, opts = {}) => {
   const cmd = _n(command);
   if (cmd.startsWith("python3 -m venv") || cmd.includes("/bin/python -m pip")) {
     return originalRun(command, opts);
   }
+  if (/\bpip3(?:\s|$)/.test(cmd)) {
+    throw new Error("unexpected pip3 invocation in test harness: " + cmd);
+  }
   if (cmd.includes("git -C") || /^git(?:\s|$)/.test(cmd)) {
     throw new Error("unexpected git invocation in test harness: " + cmd);
   }
   commands.push({ command: cmd, env: opts.env || null });
   if (cmd.includes("provider get")) return { status: 1, stdout: "", stderr: "" };

Also applies to: 3115-3125

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/onboard.test.ts` around lines 2731 - 2741, The test harness runner.run
currently treats unrecognized "pip3 ..." invocations as success; add an explicit
guard in the runner.run stub (the block that references originalRun, commands,
and returns {status:0,...}) to fail fast on any cmd that includes "pip3" or
starts with "pip3 " (either by returning a non-zero status result or throwing an
Error), and apply the same check to the other routed harness runner.run stub
(the similar block elsewhere that records commands and falls through to success)
so unexpected pip3 usage is detected immediately.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/lib/onboard.ts`:
- Around line 7367-7375: The code uses credentialEnv ||
DEFAULT_MODEL_ROUTER_CREDENTIAL_ENV when calling upsertProvider which can
register the provider against an out-of-date env var; instead obtain the
normalized credential env resolved by reconcileModelRouter (the
bp.router.credential_env || bp.credential_env ||
DEFAULT_MODEL_ROUTER_CREDENTIAL_ENV logic) and use that normalized value when
calling hydrateCredentialEnv and passing the env into upsertProvider; update the
call sites (referencing reconcileModelRouter, resolvedCredentialEnv,
credentialEnv, hydrateCredentialEnv, upsertProvider, and
DEFAULT_MODEL_ROUTER_CREDENTIAL_ENV) so the provider registration always uses
the blueprint-reconciled credential_env rather than a stale session value.

---

Duplicate comments:
In `@src/lib/onboard.ts`:
- Around line 1102-1108: The current logic returns
installModelRouterCommand(routerDir) as soon as
isModelRouterPackageReady(routerDir) is true but
isManagedModelRouterCurrent(...) is false, which forces a reinstall even if a
usable router binary exists; change the control flow in the block that checks
isModelRouterPackageReady to defer calling installModelRouterCommand(routerDir)
— instead, if isModelRouterPackageReady(routerDir) is true and
isManagedModelRouterCurrent(routerDir, venvDir) is false, first check if
isExecutableFile(managedCommand) or resolveHostCommandPath("model-router")
yields a usable command and return that, and only call
installModelRouterCommand(routerDir) as the last fallback.

In `@test/onboard.test.ts`:
- Around line 2731-2741: The test harness runner.run currently treats
unrecognized "pip3 ..." invocations as success; add an explicit guard in the
runner.run stub (the block that references originalRun, commands, and returns
{status:0,...}) to fail fast on any cmd that includes "pip3" or starts with
"pip3 " (either by returning a non-zero status result or throwing an Error), and
apply the same check to the other routed harness runner.run stub (the similar
block elsewhere that records commands and falls through to success) so
unexpected pip3 usage is detected immediately.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 87ffe379-7a1d-4906-9635-4b182fe630f5

📥 Commits

Reviewing files that changed from the base of the PR and between 20cbc78 and 308ef71.

📒 Files selected for processing (2)
  • src/lib/onboard.ts
  • test/onboard.test.ts

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/onboard.test.ts`:
- Around line 2915-2920: The test harness currently lets any unrecognized
command succeed in runner.run, masking regressions that re-run installs; update
the runner.run implementation so that it only returns success for explicitly
allowed commands and fails fast for any other/unhandled command (e.g., change
the default branch to return a non-zero status or throw), using the existing
command normalization (_n) and the commands array push to detect the command;
keep the special-case check for "provider get" (cmd.includes("provider get"))
but ensure all other commands (including pip, python3 -m venv, git, etc.)
produce a failure status so unexpected reinstall work will break the test.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5fcf81a4-ef80-4641-9534-991b5d5da4d7

📥 Commits

Reviewing files that changed from the base of the PR and between 308ef71 and 74d7ccd.

📒 Files selected for processing (1)
  • test/onboard.test.ts

Comment thread test/onboard.test.ts
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@ericksoa ericksoa requested a review from cv May 8, 2026 00:02
@cv cv merged commit 75d7289 into main May 8, 2026
22 checks passed
miyoungc added a commit that referenced this pull request May 9, 2026
## Summary
- Bump the docs release metadata to `0.0.38`.
- Document release-prep updates for status policy versions, Local Ollama
validation and cleanup, blueprint policy additions, rebuild backup
handling, and NemoHermes uninstall branding.
- Refresh generated `nemoclaw-user-*` skills from the updated docs.

## Source summary
- #3185 -> `docs/reference/commands.md`: Documents that `nemoclaw <name>
status` displays the gateway active policy version when OpenShell
reports one.
- #3167 -> `docs/reference/commands.md`,
`docs/inference/use-local-inference.md`: Documents uninstall cleanup for
matching Local Ollama auth proxy processes.
- #2737 -> `docs/inference/use-local-inference.md`,
`docs/network-policy/customize-network-policy.md`,
`docs/manage-sandboxes/lifecycle.md`, `docs/reference/commands.md`:
Documents stricter Local Ollama tool-call validation, blueprint policy
additions, and partial rebuild backup handling.
- #3220 -> `docs/reference/commands.md`: Documents NemoHermes-specific
uninstall progress and completion text.
- #3158 -> `.agents/skills/nemoclaw-user-configure-inference/*`:
Refreshes generated user skills from existing
`docs/inference/switch-inference-providers.md` heartbeat documentation.
- #3199 -> `.agents/skills/nemoclaw-user-get-started/SKILL.md`:
Refreshes generated user skills from existing
`docs/get-started/quickstart.md` Model Router wording.

## Skipped
- #3272 and #3268 were already documented by their merged docs updates
on `main`.
- #3154, #3216, #3166, and #3195 have no additional user-facing docs
impact for this release-prep pass.
- No commits matched `docs/.docs-skip`.

## Test plan
- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix
nemoclaw-user`
- `make docs`
- `npm run build:cli`
- Commit and pre-push hooks: markdownlint, docs-to-skills verification,
gitleaks, commitlint, skills YAML tests, CLI typecheck


Made with [Cursor](https://cursor.com)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Behavior Changes**
* Rebuild now safely handles partial backups, preserving successfully
captured entries while reporting only unarchived paths
* Uninstall for Local Ollama setups now stops proxy processes before
cleanup
* Local Ollama models require stricter tool-call response validation
during onboarding
* Blueprint policy additions enable custom network policy extensions via
`components.policy.additions`
* New `NEMOCLAW_AGENT_HEARTBEAT_EVERY` configuration controls agent
periodic task frequency
  * Status display now shows active policy version when available

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Cursor <cursoragent@cursor.com>
@wscurran wscurran added the bug-fix PR fixes a bug or regression label Jun 3, 2026
@wscurran wscurran removed fix bug Something fails against expected or documented behavior labels Jun 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug-fix PR fixes a bug or regression

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants