Skip to content

feat(router): add model router integration with complexity-based routing#2202

Merged
ericksoa merged 18 commits into
NVIDIA:mainfrom
vinaybagade:feat/model-router-integration
May 6, 2026
Merged

feat(router): add model router integration with complexity-based routing#2202
ericksoa merged 18 commits into
NVIDIA:mainfrom
vinaybagade:feat/model-router-integration

Conversation

@vinaybagade

@vinaybagade vinaybagade commented Apr 21, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Add NVIDIA LLM Router v3 (prefill-based) as an inference provider, enabling complexity-based routing that automatically picks the most efficient model for each query
  • Add Model Router as a provider option in the onboard wizard, following the same pattern as Ollama/NIM (onboard starts the service, configures the provider)
  • Router runs on the host; sandbox reaches it through the OpenShell gateway L7 proxy

Changes

  • Add llm-router v3 as a git submodule at nemoclaw-blueprint/router/llm-router/
  • Add pool-config.yaml defining the model pool and routing parameters
  • Add routed inference profile to blueprint.yaml with model name nvidia-routed
  • Add router startup logic to onboard wizard (startModelRouter in onboard.ts)
  • Add router schema to schemas/blueprint.schema.json
  • Remove router lifecycle code from blueprint runner (onboard owns service startup)
  • Remove model-router-toolkit install from Dockerfile (router is host-only)
  • Add model-router-toolkit install to scripts/install.sh
  • Update README with Model Router documentation and architecture

Verification

  • npx prek run --all-files passes (hadolint and CLI test failures are pre-existing)
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes

Test plan

  • Unit tests pass (322 plugin tests, 10 files)
  • nemoclaw onboard --non-interactive with NEMOCLAW_PROVIDER=routed completes all 8 steps
  • Router starts on port 4000, both pool models healthy
  • Inference works end-to-end: sandbox to OpenShell gateway to router to NVIDIA API
  • nvidia-routed model name resolves via model_group_alias in LiteLLM proxy config
  • Routing strategy injects eagerly at startup (no lazy first-request 400 error)

Made with Cursor
Signed-off-by: Vinay Bagade vbagade@nvidia.com

Summary by CodeRabbit

  • New Features

    • Optional Model Router: complexity-based routed inference selectable during onboarding; host router runs on port 4000 with configurable model pool and provider mapping; onboarding now starts and monitors the router and persists its process ID.
    • Installer detects and installs the Model Router when present.
  • Documentation

    • README updated with Model Router architecture, enablement, and pool configuration guidance.
  • Tests

    • New tests covering onboarding and routed/blueprint scenarios.

Add support for the NVIDIA LLM Router v3 (prefill-based) as an inference
provider in NemoClaw. The router automatically selects the most efficient
model for each query based on a learned complexity signal.

Architecture:
- Router runs on the host (started by onboard, not inside the sandbox)
- Sandbox reaches it through the OpenShell gateway's L7 proxy
- Credentials flow through the OpenShell provider system

Changes:
- Add llm-router v3 as a git submodule at nemoclaw-blueprint/router/llm-router/
- Add pool-config.yaml defining the model pool and routing parameters
- Add "Model Router" as a provider option in the onboard wizard
- Add routed inference profile to blueprint.yaml (model: nvidia-routed)
- Remove router lifecycle code from blueprint runner (onboard owns it)
- Remove model-router-toolkit install from Dockerfile (host-only)
- Add model-router-toolkit install to scripts/install.sh
- Update README with Model Router documentation

Made-with: Cursor
@coderabbitai

coderabbitai Bot commented Apr 21, 2026

Copy link
Copy Markdown
Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a host-side complexity-based Model Router (blueprint, schema, pool config, submodule, installer, onboarding, runner, tests, and docs) and a new runtime-ready NemoClaw GitHub triage skill instruction document.

Changes

Model Router Feature

Layer / File(s) Summary
Schema Definition
schemas/blueprint.schema.json
Adds optional router object under components.inference with enabled (boolean), port (integer 1–65535), and pool_config_path (string).
Blueprint Configuration
nemoclaw-blueprint/blueprint.yaml, nemoclaw-blueprint/router/pool-config.yaml
Adds routed profile to top-level profiles and components.inference.profiles.routed pointing at http://localhost:4000/v1; adds components.router block; introduces router/pool-config.yaml with routing method, checkpoint, tolerance, encoder, and two-model pool entries.
Submodule & Pointer
.gitmodules, nemoclaw-blueprint/router/llm-router
Adds llm-router submodule stanza and updates its pointer/branch (v3 commit).
Installer Wiring
scripts/install.sh
Adds init_nemoclaw_submodules(), is_routed_provider_requested(), install_model_router_if_present() and wires submodule init and optional pip install into source-checkout and GitHub-clone flows.
Runner Core Logic
nemoclaw/src/blueprint/runner.ts, nemoclaw/src/blueprint/runner.test.ts
Adds RouterConfig type, DEFAULT_ROUTER_PORT = 4000, isOptionalBoolean helper, validates router fields, threads routerCfg through resolveRunConfig, extends RunPlan to include router with enabled, port, and pool_config_path; tests for plan/router behavior and endpoint passthrough.
Onboarding Integration
src/lib/onboard.ts, src/lib/onboard-providers.ts, test/onboard.test.ts
Adds loadBlueprintProfile, router health polling constants, startModelRouter lifecycle to spawn/probe the model-router, maps nvidia-routerrouted in provider selection, reconciles router and wires routed provider endpoint/credentials into gateway routing; tests for onboarding/provider propagation.
Session Model
src/lib/onboard-session.ts
Adds routerPid to Session/SessionUpdates, implements readPositiveInteger, wires routerPid through createSession, normalizeSession, and safe updates.
Documentation & Project Layout
README.md
Adds "Model Router (Complexity-Based Routing)" section describing routed mode, enablement options (onboard or env), host port (4000), pool-config usage, architecture flow, and updates Project Structure to include nemoclaw-blueprint/router/.

Triage Skill Documentation

Layer / File(s) Summary
Skill Instructions
.agents/skills/nemoclaw-user-triage-instructions/SKILL.md
New runtime-ready skill instruction file providing NemoClaw GitHub issue/PR label triage rules: strict JSON single-array output, 1–5 exact labels, skip rules, label guide, comment tiers, tone constraints, and related skills metadata.

Sequence Diagram

sequenceDiagram
  participant Sandbox
  participant OpenShell_Gateway
  participant Model_Router
  participant Provider_API

  Sandbox->>OpenShell_Gateway: inference request
  OpenShell_Gateway->>Model_Router: forward request (http://localhost:4000/v1)
  Model_Router->>Model_Router: encode complexity & select model from pool
  Model_Router->>Provider_API: proxy request to chosen provider (api_base + model)
  Provider_API-->>Model_Router: model response
  Model_Router-->>OpenShell_Gateway: return routed response
  OpenShell_Gateway-->>Sandbox: relay final response
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

"I hopped through YAML and code,
spawned routers light and bold,
I nibbled at configs tight,
and piped credentials right,
now models choose with thrift and gold." 🐇✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 32.14% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely summarizes the main change: adding NVIDIA LLM Router v3 integration with complexity-based routing as a new inference provider feature.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Comment @coderabbitai help to get the list of available commands and usage tips.

Keep both our model router additions (loadBlueprintProfile, startModelRouter)
and upstream's new pullAndResolveBaseImageDigest function in onboard.ts.
Take upstream Dockerfile as-is (router installs on host, not in sandbox).

Made-with: Cursor
# Conflicts:
#	nemoclaw/src/blueprint/runner.ts
#	scripts/install.sh
#	src/lib/onboard.ts
@copy-pr-bot

copy-pr-bot Bot commented May 6, 2026

Copy link
Copy Markdown

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Nitpick comments (2)
nemoclaw-blueprint/router/llm-router (2)

1-1: Establish a submodule update and governance strategy.

Consider documenting:

  • Update policy: Who is responsible for tracking upstream llm-router releases and determining when to update the submodule reference?
  • Version alignment: Should the submodule track tagged releases (e.g., v3.x.x) rather than branch HEAD to ensure predictable behavior?
  • Security monitoring: How will security advisories for the llm-router dependency be tracked and addressed?
  • CI/CD considerations: Ensure CI workflows include git submodule update --init --recursive to avoid build failures when submodules are not initialized.

The current pinning to a specific commit provides stability, but a defined process for evaluating and applying upstream updates will help maintain security and feature parity.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@nemoclaw-blueprint/router/llm-router` at line 1, Add a short governance and
update policy for the llm-router submodule: document an owner responsible for
tracking upstream llm-router releases, require pinning to tagged releases (e.g.,
v3.x.x) rather than branch HEAD for predictable behavior, and define a
cadence/process for evaluating and applying upstream commits; include a security
monitoring step (subscribe to upstream advisories or use Dependabot/Snyk) and a
CI change to run git submodule update --init --recursive (ensure workflows that
build or test reference the llm-router submodule perform this initialization) so
builds won’t fail when submodules are uninitialized.

1-1: Consider pinning to a tagged release instead of an arbitrary commit.

The submodule commit 2bd8dfaa751efb60aa4e7e49b270490dfbc0a68a exists in the upstream NVIDIA-AI-Blueprints/llm-router repository on branch v3. However, this commit is not associated with any release tag. Pinning to tagged releases (e.g., v3.0.0, v3.1.0) provides clearer version boundaries, improves maintainability, and makes dependency updates more traceable. If upstream repository publishes release tags for the v3 branch, update .gitmodules to reference the latest stable tag instead of this untagged commit.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@nemoclaw-blueprint/router/llm-router` at line 1, The submodule is pinned to
an untagged commit (2bd8dfaa751efb60aa4e7e49b270490dfbc0a68a) from the
NVIDIA-AI-Blueprints/llm-router v3 branch; update .gitmodules to point to a
released tag (e.g., v3.0.0 or v3.1.0) instead of the commit SHA, and update any
corresponding submodule init/update steps (git submodule set-url or the
submodule "branch"/"tag" reference) so the repo uses the stable release tag for
llm-router rather than the arbitrary commit.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@nemoclaw-blueprint/router/pool-config.yaml`:
- Around line 24-29: The pool entry currently has a mismatched identity: the
fields name: gpt-5.2 and litellm_model:
"openai/nvidia/nvidia/Nemotron-3-Nano-30B-A3B" must be aligned; either change
the name to match the routed model (e.g., name: nemotron-3-nano) or change
litellm_model to the correct GPT-5.2 model identifier so that the router’s
checkpoint (which uses the name field) maps accuracy predictions to the actual
model; update the first pool entry’s name or litellm_model to mirror the pattern
used in the working entry (nemotron-3-super → nemotron-3-super-v3).

In `@nemoclaw/src/blueprint/runner.ts`:
- Around line 149-161: The router validation currently allows components.router
= { enabled: true } without a pool_config_path; update the validation in
runner.ts (the block handling components.router in isBlueprint / validation) so
that when router.enabled === true you require a present, non-empty
pool_config_path (use the existing string validation helper rather than
isOptionalString) and keep the existing port checks; ensure the condition
rejects router objects with enabled true and pool_config_path missing/invalid so
broken routed blueprints fail validation instead of at startup.

In `@README.md`:
- Line 145: Replace the GitHub repo link in the sentence that mentions "NVIDIA
LLM Router v3" with either the official documentation URL for the NVIDIA LLM
Router (if available) or remove the hyperlink and keep the plain text name
"NVIDIA LLM Router v3"; ensure the README.md line referencing the router does
not link to a third-party code repo per guidelines and that the rest of the
sentence ("prefill routing engine" / "runs on the host as a LiteLLM proxy" /
"sandbox reaches it through the OpenShell gateway") remains unchanged.

In `@scripts/install.sh`:
- Around line 1341-1343: The routed install path silently degrades: instead of
no-op when pip3 is missing or swallowing failures with "|| true", make routed
installs fail fast by checking command_exists pip3 and testing the pip3 install
return code (the spin "Installing model router" pip3 install ... call) and
exiting non‑zero with an error message if pip3 is missing or the install fails;
likewise remove or replace any "|| true" used around submodule fetches for the
llm-router submodule so git fetch/checkout failures are checked and cause an
immediate error/exit; apply the same fail-fast checks to the other occurrences
around the llm-router installs mentioned (the blocks referencing
NEMOCLAW_SOURCE_ROOT/nemoclaw-blueprint/router/llm-router and any uses of
command_exists pip3, spin, and NEMOCLAW_PROVIDER=routed).

In `@src/lib/onboard.ts`:
- Around line 6620-6673: The provider recovery logic needs a reverse mapping for
the routed provider: update the providerNameToOptionKey function so that when
name === "nvidia-router" it returns "routed" (respecting the existing signature
providerNameToOptionKey(name, opts) and any opts.hasNimContainer logic); this
ensures the persisted provider set in onboard.ts (provider = bp.provider_name ||
"nvidia-router") is correctly mapped back to the "routed" selection during
non-interactive recovery rather than falling through to "build".
- Around line 854-887: The current health-check loop can mistakenly accept a
pre-existing /health on the same port while the spawned child has already
exited; capture the spawned child's lifecycle and only treat /health as success
if the child is still running. After calling spawn("model-router", ... ) and
before returning pid, attach an 'exit' handler or poll child.exitCode (or use
process.kill(pid, 0) safely) to detect early termination and save that state; in
the health loop, when a healthy HTTP response is seen, verify the child has not
exited (child.exitCode === null/undefined and not killed) and only then return
pid; if the child exited with a non-zero code or is not alive, throw an error
and continue retries. Ensure to reference the existing symbols: spawn call,
child, child.pid, child.on('exit') / child.exitCode, ROUTER_HEALTH_RETRIES,
ROUTER_HEALTH_INTERVAL_MS, litellmConfigPath, poolConfigPath, port,
blueprintDir, and buildSubprocessEnv when implementing this check.
- Around line 6627-6644: Resolve the routed credential by calling
hydrateCredentialEnv(routerCredentialEnv) (or equivalent) before the
non-interactive/interactive gate and use its returned value to populate
process.env[routerCredentialEnv]; ensure that hydrateCredentialEnv performs
legacy staging fallback so saved/staged credentials count in non-interactive
mode. Update the conditional checks that reference
process.env[routerCredentialEnv] (and bp.credential_default) to instead use the
hydrated result so callers like startModelRouter() / upsertProvider() receive
the credential, and keep the ensureNamedCredential(routerCredentialEnv, "Model
Router API key", null) call only when the hydrated value is absent and
interactive.

---

Nitpick comments:
In `@nemoclaw-blueprint/router/llm-router`:
- Line 1: Add a short governance and update policy for the llm-router submodule:
document an owner responsible for tracking upstream llm-router releases, require
pinning to tagged releases (e.g., v3.x.x) rather than branch HEAD for
predictable behavior, and define a cadence/process for evaluating and applying
upstream commits; include a security monitoring step (subscribe to upstream
advisories or use Dependabot/Snyk) and a CI change to run git submodule update
--init --recursive (ensure workflows that build or test reference the llm-router
submodule perform this initialization) so builds won’t fail when submodules are
uninitialized.
- Line 1: The submodule is pinned to an untagged commit
(2bd8dfaa751efb60aa4e7e49b270490dfbc0a68a) from the
NVIDIA-AI-Blueprints/llm-router v3 branch; update .gitmodules to point to a
released tag (e.g., v3.0.0 or v3.1.0) instead of the commit SHA, and update any
corresponding submodule init/update steps (git submodule set-url or the
submodule "branch"/"tag" reference) so the repo uses the stable release tag for
llm-router rather than the arbitrary commit.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c5c311e1-1fc1-433f-832e-4571b576e60e

📥 Commits

Reviewing files that changed from the base of the PR and between f5b8144 and e2e36d9.

📒 Files selected for processing (15)
  • .agents/skills/nemoclaw-user-triage-instructions/SKILL.md
  • .claude/skills
  • .gitmodules
  • CLAUDE.md
  • CLAUDE.md
  • README.md
  • nemoclaw-blueprint/blueprint.yaml
  • nemoclaw-blueprint/router/llm-router
  • nemoclaw-blueprint/router/pool-config.yaml
  • nemoclaw/src/blueprint/runner.test.ts
  • nemoclaw/src/blueprint/runner.ts
  • schemas/blueprint.schema.json
  • scripts/install.sh
  • src/lib/onboard-providers.ts
  • src/lib/onboard.ts
💤 Files with no reviewable changes (1)
  • .claude/skills

Comment thread nemoclaw-blueprint/router/pool-config.yaml Outdated
Comment on lines +149 to +161
const router = components.router;
if (router !== undefined) {
if (!isObjectLike(router)) {
return false;
}
if (
!isOptionalBoolean(router.enabled) ||
!(router.port === undefined || isValidPort(router.port)) ||
!isOptionalString(router.pool_config_path)
) {
return false;
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Require pool_config_path when the router is enabled.

isBlueprint() currently accepts components.router = { enabled: true }, and actionPlan() will later emit router.enabled: true with pool_config_path: undefined. That defers a broken routed blueprint until startup instead of rejecting it during validation.

🛡️ Suggested validation
     if (
       !isOptionalBoolean(router.enabled) ||
       !(router.port === undefined || isValidPort(router.port)) ||
-      !isOptionalString(router.pool_config_path)
+      !isOptionalString(router.pool_config_path) ||
+      (router.enabled === true && typeof router.pool_config_path !== "string")
     ) {
       return false;
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const router = components.router;
if (router !== undefined) {
if (!isObjectLike(router)) {
return false;
}
if (
!isOptionalBoolean(router.enabled) ||
!(router.port === undefined || isValidPort(router.port)) ||
!isOptionalString(router.pool_config_path)
) {
return false;
}
}
const router = components.router;
if (router !== undefined) {
if (!isObjectLike(router)) {
return false;
}
if (
!isOptionalBoolean(router.enabled) ||
!(router.port === undefined || isValidPort(router.port)) ||
!isOptionalString(router.pool_config_path) ||
(router.enabled === true && typeof router.pool_config_path !== "string")
) {
return false;
}
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@nemoclaw/src/blueprint/runner.ts` around lines 149 - 161, The router
validation currently allows components.router = { enabled: true } without a
pool_config_path; update the validation in runner.ts (the block handling
components.router in isBlueprint / validation) so that when router.enabled ===
true you require a present, non-empty pool_config_path (use the existing string
validation helper rather than isOptionalString) and keep the existing port
checks; ensure the condition rejects router objects with enabled true and
pool_config_path missing/invalid so broken routed blueprints fail validation
instead of at startup.

Comment thread README.md Outdated

NemoClaw includes an optional model router that automatically picks the most efficient model for each query. Instead of sending every request to a single large model, the router uses a lightweight encoder to predict which model in a pool can handle each query correctly, then routes to the cheapest one that meets an accuracy threshold.

The router uses the [NVIDIA LLM Router v3](https://github.com/NVIDIA-AI-Blueprints/llm-router/tree/v3) prefill routing engine and runs on the host as a LiteLLM proxy. The sandbox reaches it through the OpenShell gateway.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Replace the GitHub repo link with docs or plain text.

This sentence links directly to the llm-router repository. The Markdown rules here only allow official documentation links, so please swap it for the corresponding docs page or keep the reference as plain text.

As per coding guidelines, "Do not add links to third-party code repositories or community collections; only link to official tool documentation (Node.js, Python, uv)".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` at line 145, Replace the GitHub repo link in the sentence that
mentions "NVIDIA LLM Router v3" with either the official documentation URL for
the NVIDIA LLM Router (if available) or remove the hyperlink and keep the plain
text name "NVIDIA LLM Router v3"; ensure the README.md line referencing the
router does not link to a third-party code repo per guidelines and that the rest
of the sentence ("prefill routing engine" / "runs on the host as a LiteLLM
proxy" / "sandbox reaches it through the OpenShell gateway") remains unchanged.

Comment thread scripts/install.sh Outdated
Comment thread src/lib/onboard.ts
Comment thread src/lib/onboard.ts
Comment thread src/lib/onboard.ts Outdated
@ericksoa

ericksoa commented May 6, 2026

Copy link
Copy Markdown
Contributor

/ok to test e2e36d9

@ericksoa

ericksoa commented May 6, 2026

Copy link
Copy Markdown
Contributor

/ok to test 8316fc779463db1804f333b0400f1ff5b6ebc15b

@copy-pr-bot

copy-pr-bot Bot commented May 6, 2026

Copy link
Copy Markdown

/ok to test 8316fc779463db1804f333b0400f1ff5b6ebc15b

@ericksoa, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

@ericksoa

ericksoa commented May 6, 2026

Copy link
Copy Markdown
Contributor

/ok to test 8316fc7

Keep both our model router additions (loadBlueprintProfile, startModelRouter)
and upstream's new pullAndResolveBaseImageDigest function in onboard.ts.
Take upstream Dockerfile as-is (router installs on host, not in sandbox).

Made-with: Cursor
Co-authored-by: Cursor <cursoragent@cursor.com>
@vinaybagade vinaybagade force-pushed the feat/model-router-integration branch from 8316fc7 to f32b45d Compare May 6, 2026 17:35
ericksoa added 3 commits May 6, 2026 11:01
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

# Conflicts:
#	Dockerfile
#	src/lib/onboard.ts
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

# Conflicts:
#	test/onboard.test.ts
@ericksoa

ericksoa commented May 6, 2026

Copy link
Copy Markdown
Contributor

/ok to test 58bb18b

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (4)
src/lib/onboard.ts (3)

6631-6647: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Resolve routed credential before the interactive/non-interactive gate.

Lines 6637-6647 only check process.env, so saved/staged credentials are ignored here; bp.credential_default also bypasses prompting without being written into env for downstream router/provider setup.

Suggested fix
         const routerCredentialEnv = bp.credential_env || "OPENAI_API_KEY";
         credentialEnv = routerCredentialEnv;
+        const routedCredential =
+          hydrateCredentialEnv(routerCredentialEnv) ||
+          normalizeCredentialValue(bp.credential_default || "");
+        if (routedCredential) {
+          process.env[routerCredentialEnv] = routedCredential;
+        }
         const _providerKeyHint = (process.env.NEMOCLAW_PROVIDER_KEY || "").trim();
         if (_providerKeyHint && !process.env[routerCredentialEnv]) {
           process.env[routerCredentialEnv] = _providerKeyHint;
         }
         if (isNonInteractive()) {
-          if (!process.env[routerCredentialEnv] && !bp.credential_default) {
+          if (!process.env[routerCredentialEnv]) {
             console.error(
               `  ${routerCredentialEnv} (or NEMOCLAW_PROVIDER_KEY) is required for Model Router in non-interactive mode.`,
             );
             process.exit(1);
           }
         } else {
-          if (!process.env[routerCredentialEnv] && !bp.credential_default) {
+          if (!process.env[routerCredentialEnv]) {
             await ensureNamedCredential(routerCredentialEnv, "Model Router API key", null);
           }
         }

Based on learnings, hydrateCredentialEnv(envName) must resolve the specific key before fallback staging and should prevent stale credentials from overriding explicit env credentials.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard.ts` around lines 6631 - 6647, The code validates router
credentials by only checking process.env and bp.credential_default, which skips
staged or saved credentials; call hydrateCredentialEnv(routerCredentialEnv) (or
the existing hydrateCredentialEnv function) before the
interactive/non-interactive checks so any stored/staged key is resolved into the
env and prevents stale fallbacks from overriding explicit env values;
specifically, before using routerCredentialEnv/_providerKeyHint and before the
isNonInteractive() branch, invoke hydrateCredentialEnv(routerCredentialEnv),
then proceed to set credentialEnv and run the existing logic (including
ensureNamedCredential) so downstream router/provider setup sees the resolved
credential.

6666-6667: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add reverse mapping for persisted routed provider.

This branch persists provider = "nvidia-router", but recovery mapping still lacks "nvidia-router" -> "routed", so non-interactive reruns can silently fall back to "build".

Suggested fix (in providerNameToOptionKey)
 function providerNameToOptionKey(
   name: string | null | undefined,
   opts: { hasNimContainer?: boolean } = {},
 ): string | null {
   if (!name) return null;
+  if (name === "nvidia-router") return "routed";
   if (name === "ollama-local") return "ollama";
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard.ts` around lines 6666 - 6667, The persisted provider value
"nvidia-router" is not mapped back to the routed option, causing recovery to
fall back to "build"; update the reverse mapping in the providerNameToOptionKey
function to include an entry mapping "nvidia-router" (and any equivalent
persisted keys) to "routed" so that when provider is read (e.g., provider
variable set from bp.provider_name) it correctly resolves to the routed option;
locate providerNameToOptionKey and add the "nvidia-router" -> "routed" case
along with any related string variants to ensure non-interactive reruns recover
the routed selection.

857-891: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Prevent false-positive router startup when port 4000 is already serving /health.

At Line 881, success can come from a pre-existing listener; plus the spawned child can exit before the loop still returns success.

Suggested fix
+  const preexistingHealthy = await new Promise((resolve) => {
+    http
+      .get(`http://127.0.0.1:${port}/health`, (res: import("node:http").IncomingMessage) =>
+        resolve((res.statusCode || 0) >= 200 && (res.statusCode || 0) < 300),
+      )
+      .on("error", () => resolve(false));
+  });
+  if (preexistingHealthy) {
+    throw new Error(
+      `Port ${port} already has a healthy router endpoint; refusing to start a second router.`,
+    );
+  }
+
   const child = spawn(
@@
   child.unref();
+  let exited = false;
+  child.once("exit", () => {
+    exited = true;
+  });
@@
-    if (healthy) return pid;
+    if (healthy && !exited) return pid;
+    if (exited) break;
   }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard.ts` around lines 857 - 891, The health-check loop can return
a false positive if another process already serves /health or if the spawned
child exited; update the logic around spawn/child and the loop (symbols: spawn
call that creates child, child, pid, ROUTER_HEALTH_RETRIES,
ROUTER_HEALTH_INTERVAL_MS) so that after receiving a healthy response you also
verify the spawned process is still running (e.g., ensure child.exitCode ===
null and !child.killed or use process.kill(pid, 0) for existence) before
returning pid; if the child has exited, treat it as not healthy and continue
retries (and surface/log the child.exitCode/error), and consider failing after
retries by killing/stopping the child if necessary.
scripts/install.sh (1)

1334-1338: ⚠️ Potential issue | 🟠 Major

Fail routed installs when router prerequisites are unavailable.

install_model_router_if_present() still returns success when pip3 is missing or the llm-router checkout is absent, and both submodule-init call sites still downgrade failures to warnings. With NEMOCLAW_PROVIDER=routed, that lets step 2 succeed without a usable host router and pushes the failure into onboard.

Suggested guard
 install_model_router_if_present() {
   local root="$1"
   local router_dir="$root/nemoclaw-blueprint/router/llm-router"
+  local routed=0
+  [[ "${NEMOCLAW_PROVIDER:-}" == "routed" ]] && routed=1

-  command_exists pip3 || return 0
+  if ! command_exists pip3; then
+    [[ "$routed" -eq 1 ]] && error "pip3 is required for routed inference."
+    return 0
+  fi
   [[ -d "$router_dir" ]] || return 0
   if [[ ! -f "$router_dir/pyproject.toml" && ! -f "$router_dir/setup.py" ]]; then
-    warn "Skipping model router install — llm-router submodule is not initialized."
+    [[ "$routed" -eq 1 ]] && error "llm-router is required for routed inference but is not initialized."
+    warn "Skipping model router install — llm-router submodule is not initialized."
     return 0
   fi
   spin "Installing model router" pip3 install --quiet --user "${router_dir}[prefill,proxy]"
 }
...
-    spin "Initializing ${_CLI_DISPLAY} submodules" init_nemoclaw_submodules "$NEMOCLAW_SOURCE_ROOT" \
-      || warn "Submodule initialization failed — model router support may be unavailable"
+    if ! spin "Initializing ${_CLI_DISPLAY} submodules" init_nemoclaw_submodules "$NEMOCLAW_SOURCE_ROOT"; then
+      [[ "${NEMOCLAW_PROVIDER:-}" == "routed" ]] && error "Failed to initialize the llm-router submodule required for routed inference."
+      warn "Submodule initialization failed — model router support may be unavailable"
+    fi

Also applies to: 1358-1359, 1382-1383

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/install.sh` around lines 1334 - 1338,
install_model_router_if_present() currently returns success when prerequisites
are missing (checks like command_exists pip3, [[ -d "$router_dir" ]], and
existence of "$router_dir/pyproject.toml" or "$router_dir/setup.py"), allowing
routed provider flows to continue without a router; change those early-return
paths to propagate failure (non-zero exit) when NEMOCLAW_PROVIDER is "routed" so
installs fail fast: replace the noop/return 0 behavior in the checks for pip3,
router_dir, and missing pyproject/setup with a non-zero return (or exit) when
the environment expects a routed provider, and update the places that call
install_model_router_if_present() (the call sites that currently downgrade
errors to warnings) to treat the function’s non-zero return as a fatal error
instead of a warning.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@scripts/install.sh`:
- Around line 1334-1338: install_model_router_if_present() currently returns
success when prerequisites are missing (checks like command_exists pip3, [[ -d
"$router_dir" ]], and existence of "$router_dir/pyproject.toml" or
"$router_dir/setup.py"), allowing routed provider flows to continue without a
router; change those early-return paths to propagate failure (non-zero exit)
when NEMOCLAW_PROVIDER is "routed" so installs fail fast: replace the
noop/return 0 behavior in the checks for pip3, router_dir, and missing
pyproject/setup with a non-zero return (or exit) when the environment expects a
routed provider, and update the places that call
install_model_router_if_present() (the call sites that currently downgrade
errors to warnings) to treat the function’s non-zero return as a fatal error
instead of a warning.

In `@src/lib/onboard.ts`:
- Around line 6631-6647: The code validates router credentials by only checking
process.env and bp.credential_default, which skips staged or saved credentials;
call hydrateCredentialEnv(routerCredentialEnv) (or the existing
hydrateCredentialEnv function) before the interactive/non-interactive checks so
any stored/staged key is resolved into the env and prevents stale fallbacks from
overriding explicit env values; specifically, before using
routerCredentialEnv/_providerKeyHint and before the isNonInteractive() branch,
invoke hydrateCredentialEnv(routerCredentialEnv), then proceed to set
credentialEnv and run the existing logic (including ensureNamedCredential) so
downstream router/provider setup sees the resolved credential.
- Around line 6666-6667: The persisted provider value "nvidia-router" is not
mapped back to the routed option, causing recovery to fall back to "build";
update the reverse mapping in the providerNameToOptionKey function to include an
entry mapping "nvidia-router" (and any equivalent persisted keys) to "routed" so
that when provider is read (e.g., provider variable set from bp.provider_name)
it correctly resolves to the routed option; locate providerNameToOptionKey and
add the "nvidia-router" -> "routed" case along with any related string variants
to ensure non-interactive reruns recover the routed selection.
- Around line 857-891: The health-check loop can return a false positive if
another process already serves /health or if the spawned child exited; update
the logic around spawn/child and the loop (symbols: spawn call that creates
child, child, pid, ROUTER_HEALTH_RETRIES, ROUTER_HEALTH_INTERVAL_MS) so that
after receiving a healthy response you also verify the spawned process is still
running (e.g., ensure child.exitCode === null and !child.killed or use
process.kill(pid, 0) for existence) before returning pid; if the child has
exited, treat it as not healthy and continue retries (and surface/log the
child.exitCode/error), and consider failing after retries by killing/stopping
the child if necessary.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4b5a3a35-469e-4830-ac89-6b306e72e939

📥 Commits

Reviewing files that changed from the base of the PR and between e2e36d9 and 58bb18b.

📒 Files selected for processing (5)
  • README.md
  • scripts/install.sh
  • src/lib/onboard-providers.ts
  • src/lib/onboard.ts
  • test/onboard.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/lib/onboard-providers.ts

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@ericksoa

ericksoa commented May 6, 2026

Copy link
Copy Markdown
Contributor

/ok to test 02a4ac4

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/install.sh`:
- Around line 1336-1354: install_model_router_if_present currently calls pip3
install directly which can cause the whole script to exit on failure even when a
routed provider isn't requested; change the final step so the pip3 install (the
spin "Installing model router" pip3 install --quiet --user
"${router_dir}[prefill,proxy]") is executed and its exit status is checked, and
if it fails call error "..." only when is_routed_provider_requested returns
true, otherwise call warn "Skipping model router install — pip install failed"
(and do not hard-exit); implement this by replacing the direct spin+pip3 line
with an if ! spin ...; then if is_routed_provider_requested; error "pip3 install
of llm-router failed"; else warn "Skipping model router install — pip3 install
failed"; fi; fi so the script only hard-fails for routed installs.
- Around line 1330-1334: The is_routed_provider_requested function uses
Bash-4-only lowercase expansion (${provider,,}) which fails on macOS Bash 3.2;
change the normalization to a POSIX-compatible approach (e.g., pipe provider
through tr '[:upper:]' '[:lower:]' or use printf + tr) before comparing, so
replace the ${provider,,} usage in is_routed_provider_requested with a
POSIX-safe lowercase conversion and then test [[ "$provider" == "routed" ]] as
before.

In `@src/lib/onboard.ts`:
- Around line 6653-6692: The router is being started prematurely inside the
selected.key === "routed" provider-selection branch (see
loadBlueprintProfile("routed"), ensureNamedCredential, and startModelRouter
usage); move the actual startModelRouter(...) call and the onboarding state
mutation (onboardSession.updateSession(...).routerPid) out of this branch and
instead invoke a reconcile/start step at a committed lifecycle point for routed
sandboxes (e.g., after sandbox creation completes or in the resume/reuse path),
and add a verification/restart routine that runs on resume/reuse (and before
marking routed inference ready) to ensure the router process is present and
restarted if needed; keep credential collection
(hydrateCredentialEnv/ensureNamedCredential) in the selection branch but defer
starting or persisting the router PID until the centralized reconcile/start
logic (and call startModelRouter from there).
- Around line 835-843: The isRouterHealthy probe lacks a request timeout so
http.get can hang indefinitely; update the isRouterHealthy function to abort or
timeout the request after a short period (e.g., a few seconds) and
resolve(false) on timeout or error. Locate isRouterHealthy and the http.get call
(using the port variable) and attach a timeout handler—either call
request.setTimeout(...) to destroy/abort the socket or use an AbortController
with a timer—to ensure the promise resolves false on timeout and cleans up
listeners so await isRouterHealthy() cannot block forever.

In `@test/onboard.test.ts`:
- Line 109: Add a runtime guard check inside isOnboardTestInternals to validate
the newly exposed member providerNameToOptionKey: ensure the internals object
has own property "providerNameToOptionKey" and that typeof
internals.providerNameToOptionKey === "function" (or allow null/undefined if
your type permits) so the upfront type guard fails clearly if a stale/dist drift
removes or changes this helper; update the guard logic used by any callers of
providerNameToOptionKey (e.g., tests referenced in onboard.test.ts) to rely on
this enhanced check.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e0559819-9be4-4f08-87bc-57f6ec981541

📥 Commits

Reviewing files that changed from the base of the PR and between 58bb18b and 02a4ac4.

📒 Files selected for processing (3)
  • scripts/install.sh
  • src/lib/onboard.ts
  • test/onboard.test.ts

Comment thread scripts/install.sh
Comment thread scripts/install.sh
Comment thread src/lib/onboard.ts Outdated
Comment thread src/lib/onboard.ts Outdated
Comment thread test/onboard.test.ts
@ericksoa

ericksoa commented May 6, 2026

Copy link
Copy Markdown
Contributor

/ok to test 56e68b6

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/lib/onboard-session.ts (1)

118-134: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

routerPid in SessionUpdates is never persisted via update flows.

filterSafeUpdates does not handle routerPid, so passing it to markStepComplete, completeSession, etc. will silently drop the value. If routerPid updates are expected to flow through SessionUpdates, add a case in filterSafeUpdates:

   if (typeof updates.nimContainer === "string") safe.nimContainer = updates.nimContainer;
+  if (typeof updates.routerPid === "number" && Number.isInteger(updates.routerPid) && updates.routerPid > 0) {
+    safe.routerPid = updates.routerPid;
+  }

If routerPid is intentionally only set at session creation, consider removing it from SessionUpdates to avoid confusion.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard-session.ts` around lines 118 - 134, SessionUpdates includes
routerPid but filterSafeUpdates currently omits routerPid so any routerPid
passed into update flows (e.g., markStepComplete, completeSession) gets dropped;
either add routerPid to the allowed fields in filterSafeUpdates so it is
propagated through update flows, or if routerPid should only be set at creation
remove it from the SessionUpdates interface to avoid misuse—update the
filterSafeUpdates implementation and adjust call sites (markStepComplete,
completeSession) accordingly to ensure routerPid is preserved when intended.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 805-808: When constructing the router object (variable router
built from parsed?.components?.router and profile), avoid unconditionally
overwriting an existing parsed.components.router.credential_env with
profile.credential_env; instead only apply profile.credential_env when it is
explicitly set (not undefined/null), and otherwise preserve
parsed?.components?.router.credential_env. Update the router creation logic (the
spread of parsed?.components?.router and the assignment of credential_env) so
that credential_env is conditionally included from profile (or falls back to the
existing parsed value) to prevent silently clobbering a valid router-level
credential_env.
- Around line 961-971: reconcileModelRouter currently calls
hydrateCredentialEnv(routerCredentialEnv) but proceeds to startModelRouter even
if the resolved credential is absent; update reconcileModelRouter to read the
resolved credential name (routerCredentialEnv or bp.router.credential_env) after
hydrateCredentialEnv and assert the corresponding value is present (e.g.,
process.env[routerCredentialEnv] or the module's credential getter) and if
missing log an explicit error and fail fast (return/throw) before calling
startModelRouter(bp.router); this check should reference hydrateCredentialEnv,
routerCredentialEnv, and startModelRouter to locate the code to change.

---

Outside diff comments:
In `@src/lib/onboard-session.ts`:
- Around line 118-134: SessionUpdates includes routerPid but filterSafeUpdates
currently omits routerPid so any routerPid passed into update flows (e.g.,
markStepComplete, completeSession) gets dropped; either add routerPid to the
allowed fields in filterSafeUpdates so it is propagated through update flows, or
if routerPid should only be set at creation remove it from the SessionUpdates
interface to avoid misuse—update the filterSafeUpdates implementation and adjust
call sites (markStepComplete, completeSession) accordingly to ensure routerPid
is preserved when intended.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2156c618-74c5-4b75-b79e-29b1a6c52fcf

📥 Commits

Reviewing files that changed from the base of the PR and between 02a4ac4 and 56e68b6.

📒 Files selected for processing (4)
  • scripts/install.sh
  • src/lib/onboard-session.ts
  • src/lib/onboard.ts
  • test/onboard.test.ts

Comment thread src/lib/onboard.ts Outdated
Comment thread src/lib/onboard.ts
@ericksoa

ericksoa commented May 6, 2026

Copy link
Copy Markdown
Contributor

/ok to test 0c26b52

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 869-878: The code builds credEnvVars (using buildSubprocessEnv,
routerCfg.credential_env, and environment keys like
NVIDIA_API_KEY/OPENAI_API_KEY/_providerKey) and injects it when starting a
router in startModelRouter(), but reconcileModelRouter() may reuse an existing
healthy listener without verifying those credentials, allowing stale API keys to
remain after rotations; update reconcileModelRouter() to detect changes to the
effective routed credential (compare the current process.env-derived credential
value for routerCfg.credential_env or the resolved provider key against the
listener's last-started credential) and force-restart or replace the child
listener if the credential differs, or alternatively ensure the child is always
started with the new credEnvVars (propagate the new buildSubprocessEnv output)
so listeners never continue with revoked keys; apply the same check/fix to the
analogous logic around the other block noted (lines ~958-973).
- Around line 6714-6736: The code selects routerCredentialEnv from
bp.credential_env causing mismatch when a blueprint sets
components.router.credential_env; change the selection to prefer
bp.router?.credential_env first (e.g., compute routerCredentialEnv =
bp.router?.credential_env || bp.credential_env || "OPENAI_API_KEY") so all
subsequent uses (hydrateCredentialEnv, process.env assignment, the
_providerKeyHint check, isNonInteractive handling and ensureNamedCredential
calls) consistently use the router-specific env key; also update the other
occurrence (the similar block referenced around reconcileModelRouter) to use the
same router-first lookup so loadBlueprintProfile/reconcileModelRouter paths are
aligned.
- Around line 886-913: The spawn() child process may emit an 'error' before pid
is available; add a child.once('error', (err) => { ... }) listener immediately
after creating child (and before calling child.unref() and before checking const
pid) to capture spawn failures and convert them to a thrown or logged onboarding
error (include err.message), then remove or ignore the listener if the process
starts successfully; update the block around the spawn call, child.unref(),
const pid check, and the existing child.once('exit', ...) so the 'error' event
is handled deterministically for the model-router child.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 75ce3224-787c-41d6-a7c2-9aba6cbd816d

📥 Commits

Reviewing files that changed from the base of the PR and between 56e68b6 and 0c26b52.

📒 Files selected for processing (2)
  • src/lib/onboard-session.ts
  • src/lib/onboard.ts

Comment thread src/lib/onboard.ts
Comment thread src/lib/onboard.ts
Comment thread src/lib/onboard.ts Outdated
@ericksoa

ericksoa commented May 6, 2026

Copy link
Copy Markdown
Contributor

/ok to test 89aebda

@ericksoa ericksoa left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved after resolving the remaining router pool model-identity issue. Current head is mergeable, CodeRabbit is passing, and focused local validation passed: npm run build:cli; npx vitest run nemoclaw/src/blueprint/runner.test.ts test/onboard.test.ts.

@ericksoa ericksoa merged commit ca1d6b8 into NVIDIA:main May 6, 2026
13 checks passed
miyoungc added a commit that referenced this pull request May 7, 2026
## Summary
- Bump docs metadata to 0.0.36 and refresh generated NemoClaw user
skills.
- Document Model Router onboarding, validation retries, Ollama tool
checks, Hermes policy behavior, and deployment verification updates.
- Remove suppressed experimental command references from public docs per
`docs/.docs-skip`.

## Source summary
- #2202 -> `docs/get-started/quickstart.md`,
`docs/inference/inference-options.md`, `docs/reference/architecture.md`:
Document Model Router setup and routed inference architecture.
- #3128 -> `docs/get-started/quickstart.md`,
`docs/reference/commands.md`: Document deployment verification and HTTP
401 health handling.
- #3104 -> `docs/inference/inference-options.md`: Document retry
behavior for transient provider validation failures.
- #3121 -> `docs/reference/architecture.md`: Document agent-scoped
model/provider compatibility manifests.
- #3046 -> `docs/reference/architecture.md`: Tie model-specific
compatibility setup to known model/provider behavior.
- #3097 -> `docs/inference/use-local-inference.md`: Document Ollama
tool-calling capability validation.
- #3082 -> `docs/reference/commands.md`: Document
`NEMOCLAW_SANDBOX_NAME` as the interactive sandbox-name default.
- f586cc5, 3442adf -> `docs/get-started/quickstart-hermes.md`,
`docs/reference/network-policies.md`: Document Hermes agent-specific
baseline policy endpoints.

## Test plan
- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix
nemoclaw-user`
- `make docs`
- `npm run build:cli`
- `rg` skip-term scan for `docs/` and generated user skills

Made with [Cursor](https://cursor.com)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
  * Model Router provider for complexity-based routed inference.
* Ollama/local inference onboarding now validates tool-calling
capability.
  * Added `local-inference` network policy preset.

* **Documentation**
* New integration policy examples (Outlook, Telegram, Slack, Discord,
GitHub, Jira, etc.).
  * Clarified config immutability workflow and sandbox writable paths.
  * Hermes baseline network policy documented.

* **Improvements**
* Health checks treat device-auth responses as live; transient
validation retries.
* Installer performs pre-install reachability checks; CLI onboarding
gained a --fresh option.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
@wscurran wscurran added area: integrations Third-party service integration behavior feature PR adds or expands user-visible functionality and removed enhancement: feature labels Jun 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: integrations Third-party service integration behavior feature PR adds or expands user-visible functionality

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants