fix(runtime): probe live gateway to reconcile agent identity (#3175) by jason-ma-nv · Pull Request #3679 · NVIDIA/NemoClaw

jason-ma-nv · 2026-05-18T07:07:13Z

Summary

The reconcile_agent_model_with_provider safety net added in #3319 reads both sides of its equality check from the same file (primary vs providers.inference.models[0].name). When a user runs openshell inference set — which only writes the gateway and leaves /sandbox/.openclaw/openclaw.json untouched — both fields stay equal to each other and the reconciler short-circuits as a no-op. This PR makes the reconciler probe the live gateway via openshell inference get --json and use that as the source of truth so the file is actually realigned to the routed model on the next sandbox start.

Related Issue

Partial fix for #3175. See the issue comment for why a complete fix also requires an OpenShell-side change.

Changes

scripts/nemoclaw-start.sh: reconcile_agent_model_with_provider now probes the live gateway and, when a model is returned, treats it as authoritative — aligning both agents.defaults.model.primary and models.providers.inference.models[0].{id,name} so the file no longer carries a stale entry the gateway can push back on its next reconcile.
Legacy in-file reconcile is preserved as a fallback for environments where the openshell binary is missing or the probe fails (no behavioral change for existing test scaffolds without a stubbed openshell).
test/nemoclaw-start-reconcile.test.ts: harness now optionally installs a stubbed openshell on PATH and scrubs any inherited host openshell. Adds four cases covering the gateway-mode happy path (the user's repro from [Inference] openshell inference set changes gateway model but sandbox agent still reports old model #3175), inference-prefix idempotency, gateway-matches-file no-op, and empty-probe fallback.

Scope

This PR does not stop the silent runtime revert reported in the #3175 repro (gateway version 3 → 4 without a sandbox restart). That push-back is owned by openshell-gateway and needs an OpenShell-side change; see the linked issue comment.

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

`npx prek run --all-files` passes
`npm test` passes for the touched test file and the adjacent `test/nemoclaw-start.test.ts` (81/81)
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
`make docs` builds without warnings (doc changes only)

Notes for reviewers:

Local full CLI vitest run had 4 pre-existing failures on this machine unrelated to the diff (ssrf-parity build artifact missing, cli.test.ts gateway-cleanup timeouts, sandbox-build-context umask 775 vs 755). Will rely on PR CI to validate.
The gateway-probe Python block uses subprocess.run(timeout=3) to bound the probe and silently falls back on any error.

Signed-off-by: jason-ma-nv jama@nvidia.com

Summary by CodeRabbit

New Features
- Reconciliation now prefers a live gateway probe as the source of truth for model alignment; when available it updates the agent's primary model and the provider's primary model entry, with a legacy in-file fallback if the probe is unavailable.
Tests
- Expanded tests cover probe-driven updates, no-op when already aligned, fallback behavior, malformed probe output, and deterministic probe stubbing.

The reconciler in scripts/nemoclaw-start.sh was meant to be the safety net for users who run `openshell inference set` instead of `nemoclaw inference set`. As written it read both sides of its equality check from the same file (primary vs providers.inference.models[0].name), so when `openshell inference set` modified only the gateway and left the sandbox file untouched, the reconciler saw the two file fields still equal and short-circuited as a no-op. Probe the live gateway via `openshell inference get --json` and use it as the source of truth. When the gateway model differs from the file, align primary AND models[0].{id,name} so the agent identity and the gateway route stay consistent across the next reconcile cycle. Fall back to the legacy in-file reconcile when openshell is unavailable so environments without the probe still resolve drift. This is a partial fix for #3175: it closes the agent self-report drift at the next sandbox restart, but the silent runtime revert in the user's repro also needs an OpenShell-side fix to stop the gateway from pushing the sandbox's stale config back onto a runtime `inference set` change. Signed-off-by: jason-ma-nv <jama@nvidia.com>

coderabbitai · 2026-05-18T07:07:25Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5c63d087-4a55-4868-a6b7-c3a3e741f971

📥 Commits

Reviewing files that changed from the base of the PR and between f32465f and d1d4982.

📒 Files selected for processing (1)

scripts/nemoclaw-start.sh

🚧 Files skipped from review as they are similar to previous changes (1)

scripts/nemoclaw-start.sh

📝 Walkthrough

Walkthrough

The PR makes reconcile_agent_model_with_provider probe the live gateway (openshell) for the current model, normalizes and aligns agent and provider model entries when successful, and falls back to legacy in-file reconciliation when the probe is unavailable or invalid. Tests and harness support for gateway stubbing are added.

Changes

Gateway Model Reconciliation

Layer / File(s)	Summary
Gateway probe logic and reconciliation algorithm `scripts/nemoclaw-start.sh`	Updated `reconcile_agent_model_with_provider` to call `openshell inference get --json`, extract/qualify `data.model`, decide gateway vs legacy write, and in gateway mode update `agents.defaults.model.primary` and `models.providers.inference.models[0]` (repairing the models list/first element when missing).
Test harness setup for gateway mocking `test/nemoclaw-start-reconcile.test.ts`	Added `RunReconcileOptions` (`gatewayModel`, `gatewayRawOutput`, `env`), changed `runReconcile` signature, conditionally install `bin/openshell` stub that emits controlled stdout, and construct deterministic `PATH` removing inherited `openshell` and prepending the stub; spawned process receives merged env.
Test coverage for gateway-driven reconciliation `test/nemoclaw-start-reconcile.test.ts`	Added gateway-focused tests: updating both agent and provider when stale, accepting already-qualified gateway models without double-prefixing, no-op when file fields already match gateway model, and legacy fallback when gateway returns empty or malformed output.

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

fix, NemoClaw CLI, OpenShell

Suggested reviewers

ericksoa
cv

"I hopped through logs and paths aglow,
A stub that answers when probes must go,
I nudged the models so files all show,
When gateways hush, old ways still flow,
A rabbit cheers — the tests now know."

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically describes the main change: probing the live gateway to reconcile agent identity, directly reflecting the core fix described in the PR objectives.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch worktree-fix-3175-reconciler-uses-live-gateway

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-05-18T07:09:35Z

E2E Advisor Recommendation

Required E2E: openclaw-inference-switch-e2e, model-router-provider-routed-inference-e2e
Optional E2E: runtime-overrides-e2e, inference-routing-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

openclaw-inference-switch-e2e (high): Best existing coverage for the affected path: installs OpenClaw, switches inference with nemoclaw inference set, verifies OpenShell route, openclaw.json primary/model entries, config hash, and live OpenClaw requests.
model-router-provider-routed-inference-e2e (high): The PR changes provider-routed model reconciliation against the live gateway; this regression job validates Model Router/provider-routed inference through inference.local after onboard.

Optional E2E

runtime-overrides-e2e (medium): Useful adjacent confidence for the same startup script config-write/ownership/hash machinery, though it does not specifically exercise the live OpenShell gateway probe.
inference-routing-e2e (medium): Optional broader inference routing smoke coverage if maintainers want extra confidence that route selection still behaves with real provider calls.

New E2E recommendations

sandbox restart reconciliation after route drift (high): No existing E2E appears to directly reproduce the new [Inference] openshell inference set changes gateway model but sandbox agent still reports old model #3175 startup scenario: live OpenShell gateway route is changed while /sandbox/.openclaw/openclaw.json is stale, then the sandbox restarts and must reconcile primary, models[0].id/name, .config-hash, and live agent responses to the gateway model.
- Suggested test: Add an OpenClaw inference-switch restart E2E that changes the live gateway model, intentionally verifies/creates stale openclaw.json state, restarts the sandbox, then asserts openshell inference get --json, agents.defaults.model.primary, models.providers.inference.models[0], .config-hash, and an OpenClaw agent turn all agree.

wscurran · 2026-05-18T14:43:19Z

✨
Related open PRs:

#3319 fix(runtime): reconcile agent identity with provider on sandbox start (#3175)

Related open issues:

#3175 [macOS][Inference] openshell inference set changes gateway model but sandbox agent still reports old model

wscurran · 2026-05-19T03:20:25Z

Sprint 5 planning update: we’re organizing this PR as a partial fix for #3175.

Relationship:

[Inference] openshell inference set changes gateway model but sandbox agent still reports old model #3175 = direct Sprint 5 issue for gateway model and sandbox agent model divergence
fix(runtime): probe live gateway to reconcile agent identity (#3175) #3679 = NemoClaw-side partial fix that probes the live gateway during sandbox startup and realigns the sandbox OpenClaw config
remaining work = runtime revert behavior still needs an OpenShell-side change, as noted in this PR’s Scope section

This PR should be tracked for Sprint 5 review, but it should not auto-close #3175 when merged.

… JSON Adds the missing coverage case from a deep-review pass: today the gateway probe absorbs malformed stdout via Python's `json.loads` → `SystemExit(0)` → empty `gateway_model` → legacy in-file reconcile. That behavior is correct but only protected by the implementation detail of the parser. A future refactor that swaps the parser or changes the exit handling could silently turn the malformed-output case into "do nothing" without any unit test catching it. Extends the test harness with a `gatewayRawOutput` option that lets the stub emit arbitrary stdout (bypassing the JSON-formatted shape), then adds one test that drives an HTML-like response and asserts the legacy-fallback shape (primary aligned to file's first model, models[0] untouched). 10/10 reconciler tests pass. Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>

cjagwani

Approving. Pushed one follow-up test commit (7f6ea8c) to pin the legacy-fallback shape when the gateway probe emits malformed JSON.

The case is already handled correctly today (json.loads → SystemExit(0) → empty gateway_model → legacy in-file reconcile), but only via implementation detail of the parser. A future refactor that swaps how stdout is parsed or how the exit is handled could silently degrade malformed-output to "do nothing" with no test catching it. The new test drives an HTML-like response through the existing harness (extended with a gatewayRawOutput option) and asserts the legacy-fallback config shape.

10/10 reconciler tests pass.

Heads-up not blocking this PR:

command -v openshell at the call site is host-namespaced; if a future packaging change ever ships a shim openshell without inference get --json, the current absorb-via-SystemExit(0) path covers it but worth a code comment.
Sync'd-but-non-canonical id: "inference/foo" values pass the equality check without rewriting (write path always writes bare). Cosmetic — config stays non-canonical until next true drift.

Scope check vs #3175: PR is intentionally a partial fix per its body and @wscurran's note — closes the "agent self-report stays old after restart" symptom but not the runtime v3→v4 silent revert that lives in the openshell-gateway layer. Issue should stay open.

github-actions · 2026-06-03T21:02:37Z

E2E Scenario Advisor Recommendation

Required scenario E2E: ubuntu-repo-cloud-openclaw-double-same-provider, ubuntu-repo-cloud-openclaw-double-provider-switch
Optional scenario E2E: ubuntu-repo-cloud-openclaw

Dispatch required scenario E2E:

gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-double-same-provider
gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-double-provider-switch

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required scenario E2E

ubuntu-repo-cloud-openclaw-double-same-provider: scripts/nemoclaw-start.sh changes OpenClaw sandbox startup reconciliation after inference route/model changes. This scenario exercises the same-provider double switch lifecycle on the standard Ubuntu repo-current OpenClaw path.
- Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-double-same-provider
ubuntu-repo-cloud-openclaw-double-provider-switch: The changed reconciliation logic explicitly handles gateway/provider routing drift after inference provider changes. This scenario is the smallest routed scenario that exercises the OpenClaw double-provider-switch lifecycle.
- Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-double-provider-switch

Optional scenario E2E

ubuntu-repo-cloud-openclaw: Adjacent baseline Ubuntu OpenClaw scenario can provide broad startup/inference regression coverage for the modified sandbox start script, but it does not specifically exercise the route-switch lifecycle targeted by this PR.
- Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw

Relevant changed files

scripts/nemoclaw-start.sh

github-actions · 2026-06-03T21:05:31Z

PR Review Advisor

Findings: 0 needs attention, 1 worth checking, 0 nice ideas
Top item: PR review advisor unavailable

Review findings

🛠️ Needs attention

None.

🔎 Worth checking

PR review advisor unavailable: The automated advisor could not complete: Could not parse JSON from PR review advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/pr-review-advisor/pr-review-advisor-raw-output.txt
- Recommendation: Re-run the PR Review Advisor or perform a manual review.
- Evidence: Could not parse JSON from PR review advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/pr-review-advisor/pr-review-advisor-raw-output.txt

🌱 Nice ideas

None.

Consider writing more tests for

**Runtime validation** — Add or identify targeted runtime/integration validation for the changed behavior; do not report external E2E job pass/fail here.. Runtime/sandbox/infrastructure paths need behavioral runtime validation: scripts/nemoclaw-start.sh.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

## Summary - Add the v0.0.59 release notes from the GitHub announcement discussion. - Refresh local inference and credential-storage guidance for the current release behavior. - Regenerate the user skills from the updated Fern docs. - Tighten release-prep and docs review guidance for generated skills, PR labels, and shared `$$nemoclaw` command placeholders. ## Verification - `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --doc-platform fern-mdx` - `rg "permissive mode|shields down|shields up|shields status|config rotate-token|rotate-token" --glob '*.{md,mdx}'` - `git diff --check` - `npm run docs` (rerun outside sandbox after sandbox-only `tsx` IPC permission failure) - `npm run typecheck:cli` - Pre-commit hooks during commit passed, including markdownlint, docs-to-skills verification, gitleaks, commitlint, and skills YAML tests. ## Source Summary - #3679, #4437, #4681, #4766, #4772, #4775, #4786 -> `docs/about/release-notes.mdx`, `docs/reference/commands.mdx`, `docs/reference/troubleshooting.mdx`: Summarize OpenClaw 2026.5.27 compatibility, runtime path pinning, plugin registry recovery, live gateway reconciliation, and clearer host-alias/startup diagnostics. - #4332, #4402, #4769, #4776, #4779 -> `docs/about/release-notes.mdx`, `docs/inference/inference-options.mdx`, `docs/inference/use-local-inference.mdx`, `docs/inference/switch-inference-providers.mdx`: Document the release inference changes covering Local NIM waits, Hermes Anthropic routing, Nemotron 3 Ultra, the current Ollama starter fallback, and Spark managed-vLLM context length. - #4628, #4652, #4733, #4745 -> `docs/about/release-notes.mdx`, `docs/security/credential-storage.mdx`, `docs/manage-sandboxes/messaging-channels.mdx`, `docs/reference/troubleshooting.mdx`: Capture permission healing, gateway-stored credential reuse, cross-sandbox messaging credential conflict checks, and CDI preflight diagnostics. - #4728, #4737, #4743, #4744, #4782 -> `.agents/skills/nemoclaw-user-*`: Regenerate the user skill references from the updated source docs. - Follow-up maintenance -> `.agents/skills/nemoclaw-contributor-update-docs/SKILL.md`, `.coderabbit.yaml`: Add release-prep area labels for docs and skills PRs, and teach docs review guidance that `$$nemoclaw` is the correct shared command placeholder for examples that work across agent aliases. Note: the `documentation` label was not present in the repository, so this PR is labeled with `v0.0.59` only.  ## Summary by CodeRabbit * **Documentation** * Updated default model for local Ollama inference setup to qwen3.5:9b * Added Nemotron 3 Ultra 550B as an NVIDIA Endpoints model option * Clarified credential storage and reuse behavior for post-deployment (day-two) operations * Added v0.0.59 release notes covering OpenClaw compatibility, inference options, Hermes messaging sync, and troubleshooting * Clarified CLI selection guidance and updated OpenClaw version example in status output * Revised release-prep instructions and docs review guidance for CLI alias usage

wscurran added fix labels May 18, 2026

Merge branch 'main' into worktree-fix-3175-reconciler-uses-live-gateway

0f5b25b

wscurran linked an issue May 19, 2026 that may be closed by this pull request

[Inference] openshell inference set changes gateway model but sandbox agent still reports old model #3175

Closed

wscurran mentioned this pull request May 19, 2026

openshell inference set does not override model in proxied inference requests #733

Closed

2 tasks

wscurran added the enhancement: inference label May 19, 2026

This was referenced May 21, 2026

fix: remove proxy hooks from sandbox rc files #3842

Merged

fix(openclaw): route discord through loopback proxy #3961

Merged

fix(openclaw): remove Discord loopback helper #4005

Merged

wscurran added the v0.0.58 Release target label Jun 2, 2026

cv assigned cjagwani Jun 3, 2026

wscurran added area: inference Inference routing, serving, model selection, or outputs area: sandbox OpenShell sandbox lifecycle, runtime, config, or recovery bug-fix PR fixes a bug or regression feature PR adds or expands user-visible functionality and removed OpenShell labels Jun 3, 2026

cjagwani approved these changes Jun 3, 2026

View reviewed changes

Merge branch 'main' into worktree-fix-3175-reconciler-uses-live-gateway

eb7c119

cv added v0.0.59 Release target and removed v0.0.58 Release target labels Jun 4, 2026

cv added 2 commits June 4, 2026 12:48

Merge branch 'main' into worktree-fix-3175-reconciler-uses-live-gateway

f32465f

fix(runtime): format gateway reconciliation probe

d1d4982

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

cv merged commit 30ae331 into main Jun 4, 2026
30 checks passed

cv deleted the worktree-fix-3175-reconciler-uses-live-gateway branch June 4, 2026 20:21

miyoungc mentioned this pull request Jun 5, 2026

docs: refresh 0.0.59 release notes #4790

Merged

wscurran removed the feature PR adds or expands user-visible functionality label Jun 9, 2026

Conversation

jason-ma-nv commented May 18, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Related Issue

Changes

Scope

Type of Change

Verification

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Suggested labels

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

github-actions Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

E2E Advisor Recommendation

E2E Recommendation Advisor

Required E2E

Optional E2E

New E2E recommendations

Uh oh!

wscurran commented May 18, 2026

Uh oh!

wscurran commented May 19, 2026

Uh oh!

cjagwani left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Jun 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

E2E Scenario Advisor Recommendation

E2E Scenario Advisor

Required scenario E2E

Optional scenario E2E

Relevant changed files

Uh oh!

github-actions Bot commented Jun 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review Advisor

🛠️ Needs attention

🔎 Worth checking

🌱 Nice ideas

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

jason-ma-nv commented May 18, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 18, 2026 •

edited

Loading

github-actions Bot commented May 18, 2026 •

edited

Loading

github-actions Bot commented Jun 3, 2026 •

edited

Loading

github-actions Bot commented Jun 3, 2026 •

edited

Loading