Skip to content

fix(connect): stop silent gateway inference revert#4611

Merged
cv merged 9 commits into
mainfrom
fix/connect-inference-route-revert
Jun 2, 2026
Merged

fix(connect): stop silent gateway inference revert#4611
cv merged 9 commits into
mainfrom
fix/connect-inference-route-revert

Conversation

@hunglp6d

@hunglp6d hunglp6d commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

Summary

When the live gateway inference route and a sandbox's recorded route diverge, connect used to silently reapply the recorded route to the gateway — quietly reverting a user's model change with only a status-style line (#3726). This PR makes that override loud (a warning naming the mismatch and the supported way to change the route) and removes the precondition that made the revert destructive, by persisting the route to the registry before the failure-prone in-sandbox sync in inference set. It intentionally only detects and surfaces the mismatch — it does not try to silently reconcile every layer from connect, since some layers cannot be set safely there.

Related Issue

Fixes #3726
Related: #3725 — the in-sandbox sync failure that this PR makes non-destructive (no longer leaves the recorded route stale).

Changes

  • inference set: write the host registry before the in-sandbox config sync, and make that sync best-effort — on failure it warns and points at rebuild instead of aborting and leaving the gateway/registry split.
  • connect: classify the gateway-vs-recorded route as aligned / repair / diverged via a new pure helper planInferenceRouteReconcile (src/lib/inference/config.ts).
  • diverged: print a loud warning (the mismatch + which route is being applied + the supported command), even in --probe-only quiet mode, instead of a silent status line. The shared gateway is still re-pointed to the recorded route.
  • Tests: unit tests for the reconcile helper (aligned/repair/diverged) and for inference set registry-before-sync resilience; updated the connect integration test for the loud output.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Hung Le hple@nvidia.com

Summary by CodeRabbit

  • New Features

    • Route-reconciliation planning for gateway inference connections with explicit outcomes (aligned/repair/diverged) and automated alignment guidance; route values are sanitized for display.
  • Bug Fixes

    • Registry is updated immediately after applying an inference set.
    • In-sandbox config sync is now best-effort: failures emit warnings (including rebuild guidance) and operation continues; results indicate sync success/failure.
    • Gateway warning verbosity adjusted to surface loud warnings on diverged routes.
  • Tests

    • Added/updated tests for reconciliation outcomes and sync failure messaging.

@copy-pr-bot

copy-pr-bot Bot commented Jun 1, 2026

Copy link
Copy Markdown

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5f8959b3-4a2d-4035-96f0-b9e19491abcb

📥 Commits

Reviewing files that changed from the base of the PR and between bde3dc0 and 784d6a7.

📒 Files selected for processing (2)
  • src/lib/inference/config.test.ts
  • src/lib/inference/config.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/lib/inference/config.test.ts
  • src/lib/inference/config.ts

📝 Walkthrough

Walkthrough

Adds route reconciliation types and planner, makes runInferenceSet update the registry atomically while treating in-sandbox config sync as best-effort (exposing inSandboxConfigSynced), and refactors connect to use the planner and emit louder divergence warnings.

Changes

Gateway inference state reconciliation and robustness

Layer / File(s) Summary
Inference route reconciliation contract and tests
src/lib/inference/config.ts, src/lib/inference/config.test.ts
Adds RecordedInferenceRoute, InferenceRoutePlan, planInferenceRouteReconcile, and sanitizeRouteValueForDisplay; tests cover aligned, repair, and diverged outcomes.
Registry-atomic update + best-effort sandbox sync
src/lib/actions/inference-set.ts, src/lib/actions/inference-set.test.ts
runInferenceSet updates the registry immediately after OpenShell succeeds (throws InferenceSetError on registry failure), then performs writeSandboxConfig and recomputeSandboxConfigHash as nested best-effort steps, tracks inSandboxConfigSynced in the returned InferenceSetResult, and includes tests for write/hash failure modes and result shape.
Connect route alignment and messaging
src/lib/actions/sandbox/connect.ts, test/sandbox-connect-inference.test.ts
ensureSandboxInferenceRoute uses planInferenceRouteReconcile(live, recorded) to decide messaging and alignment; diverged emits loud warnings and shows operator align command even in quiet mode; tests updated to assert the new message fragments.

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant InferenceSet as inference set
  participant OpenShell
  participant Registry as Registry/Sandbox
  participant SandboxSide as Sandbox Config
  User->>InferenceSet: set provider/model
  InferenceSet->>OpenShell: update gateway inference route
  OpenShell-->>InferenceSet: success
  InferenceSet->>Registry: updateSandbox (provider/model)
  Registry-->>InferenceSet: updated or error
  alt registry update fails
    InferenceSet-->>User: InferenceSetError
  else registry update succeeds
    InferenceSet->>SandboxSide: writeSandboxConfig (try)
    alt config write succeeds
      SandboxSide-->>InferenceSet: ok
      InferenceSet->>SandboxSide: recomputeSandboxConfigHash (try)
      alt hash ok
        InferenceSet-->>User: log "inference route synced"
      else hash fails
        InferenceSet-->>User: log rebuild guidance
      end
    else config write fails
      InferenceSet-->>User: log rebuild guidance
    end
    InferenceSet-->>User: return InferenceSetResult (provider/model, inSandboxConfigSynced)
  end
Loading
sequenceDiagram
  participant Connect
  participant Gateway as OpenShell Gateway
  participant Planner as planInferenceRouteReconcile
  User->>Connect: connect to sandbox
  Connect->>Gateway: fetch live inference state
  alt live state exists
    Gateway-->>Connect: GatewayInference
    Connect->>Planner: plan(live, sandbox-recorded)
  else live state is null
    Gateway-->>Connect: null
    Connect->>Planner: plan(null, sandbox-recorded)
  end
  alt plan.kind = aligned
    Planner-->>Connect: { kind: aligned }
    Connect->>User: skip (silent)
  else plan.kind = repair
    Planner-->>Connect: { kind: repair }
    Connect->>Gateway: quiet route setup
  else plan.kind = diverged
    Planner-->>Connect: { kind: diverged, live, recorded }
    Connect-->>User: loud "differs from the recorded route"
    Connect-->>User: loud "Aligning the gateway to <provider>/<model>"
    Connect->>Gateway: update route
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

fix, NemoClaw CLI, Sandbox

Suggested reviewers

  • cv
  • prekshivyas

Poem

🐰 I hop through routes both live and stored,

I plan, I warn, I try without discord,
Registry writes stand firm, the sandbox tries to mend,
If sync should fail, I log and gently send,
Tests hop in circles so the flows won't bend.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix(connect): stop silent gateway inference revert' directly and concisely describes the main objective: preventing silent reversion of gateway inference settings during connect operations.
Linked Issues check ✅ Passed The PR implements all coding requirements from #3726: adds planInferenceRouteReconcile to classify route divergence, makes in-sandbox sync best-effort with warning logs, emits loud warnings on divergence even in quiet mode, and includes comprehensive unit and integration tests.
Out of Scope Changes check ✅ Passed All changes directly support the stated objectives: inference-set registry persistence, route reconciliation logic, divergence warnings, sync failure handling, and comprehensive test coverage. No extraneous modifications detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/connect-inference-route-revert

Comment @coderabbitai help to get the list of available commands and usage tips.

@hunglp6d hunglp6d marked this pull request as ready for review June 1, 2026 10:42
@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: openclaw-inference-switch-e2e, hermes-inference-switch-e2e, inference-routing-e2e
Optional E2E: double-onboard-e2e, rebuild-openclaw-e2e, rebuild-hermes-e2e

Dispatch hint: openclaw-inference-switch-e2e,hermes-inference-switch-e2e,inference-routing-e2e

Auto-dispatched E2E: openclaw-inference-switch-e2e, hermes-inference-switch-e2e, inference-routing-e2e via nightly-e2e.yaml at bb732625ecb5a3251c4b5a3a72c39127b9003a3enightly run

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • openclaw-inference-switch-e2e (medium-high; live cloud inference, sandbox install/onboard, roughly up to 45-60 minutes): Directly covers the changed nemoclaw inference set path for OpenClaw: switches a running sandbox, verifies OpenShell route, registry/session state, OpenClaw config/hash, and live requests through inference.local after the switch.
  • hermes-inference-switch-e2e (medium-high; live cloud inference, Hermes sandbox install/onboard, timeout 60 minutes): The same runInferenceSet changes apply to Hermes config.yaml sync and alias behavior; this validates the Hermes route/config/hash/live-request path after nemohermes inference set.
  • inference-routing-e2e (medium; live cloud inference with some PR-safe cases, timeout 30 minutes): Changes touch gateway route parsing and managed inference route semantics. This job validates inference routing through the OpenShell gateway proxy, credential isolation, and route health/error classification.

Optional E2E

  • double-onboard-e2e (high; multiple onboard phases and probe-only recovery, timeout 90 minutes): Useful adjacent confidence for shared-gateway multi-sandbox behavior and connect --probe-only recovery paths, which are near the changed route-reconciliation logic. Not fully targeted to route divergence, so optional.
  • rebuild-openclaw-e2e (medium-high; rebuild flow with live inference, timeout 60 minutes): The degraded in-sandbox config/hash path now instructs users to rebuild to resync from registry. This validates OpenClaw rebuild preserves state and post-rebuild inference, but it does not directly simulate a failed inference-set config write.
  • rebuild-hermes-e2e (medium-high; Hermes rebuild flow with live inference, timeout 60 minutes): Optional Hermes counterpart for rebuild-based recovery after degraded in-sandbox config sync guidance, since runInferenceSet also handles Hermes config.yaml.

New E2E recommendations

  • sandbox connect route divergence (high): Existing E2E coverage does not appear to create a real running sandbox whose recorded registry route intentionally differs from the live shared gateway route, then run nemoclaw <sandbox> connect --probe-only and assert that the divergence is warned loudly and aligned rather than silently reverted.
  • inference set degraded in-sandbox sync (high): Unit tests simulate writeSandboxConfig and hash recompute failures, but existing E2E jobs do not exercise a real or fake sandbox failure where nemoclaw inference set updates gateway/registry but cannot update in-sandbox config/hash, then verifies connect/rebuild behavior does not revert to stale registry state.
    • Suggested test: Add a focused E2E or regression-script job that forces in-sandbox config write or .config-hash recompute failure during nemoclaw inference set, verifies the command exits successfully with degraded warning/no false synced message, verifies registry and gateway remain consistent, and verifies nemoclaw <sandbox> rebuild restores in-sandbox config/hash.

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: openclaw-inference-switch-e2e,hermes-inference-switch-e2e,inference-routing-e2e

@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

E2E Scenario Advisor Recommendation

Required scenario E2E: ubuntu-repo-cloud-openclaw-double-provider-switch, ubuntu-repo-cloud-hermes
Optional scenario E2E: ubuntu-repo-cloud-openclaw, ubuntu-repo-cloud-openclaw-double-same-provider

Dispatch required scenario E2E:

  • gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-double-provider-switch
  • gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: medium

Required scenario E2E

  • ubuntu-repo-cloud-openclaw-double-provider-switch: Primary targeted scenario for the changed OpenClaw inference route reconciliation and provider-switch behavior. The PR changes nemoclaw inference set, gateway-vs-registry route planning, and connect handling for diverged routes.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-double-provider-switch
  • ubuntu-repo-cloud-hermes: Covers the Hermes path in runInferenceSet, including Hermes config sync and routed inference behavior affected by the shared inference-set result/audit/degraded-sync changes.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes

Optional scenario E2E

  • ubuntu-repo-cloud-openclaw: Adjacent baseline OpenClaw cloud route coverage for normal onboarding, inference.local health, and route-state assertions after changes to inference config parsing and route reconciliation.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw
  • ubuntu-repo-cloud-openclaw-double-same-provider: Adjacent provider-switch lifecycle coverage for the same-provider variant; useful to catch regressions in repeated inference set flows but not the primary diverged-route case fixed here.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-double-same-provider

Relevant changed files

  • src/lib/actions/inference-set.ts
  • src/lib/actions/sandbox/connect.ts
  • src/lib/inference/config.ts
  • test/sandbox-connect-inference.test.ts

@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

PR Review Advisor

Findings: 3 needs attention, 10 worth checking, 0 nice ideas
Since last review: 3 prior items resolved, 8 still apply, 0 new items found

Review findings

🛠️ Needs attention

  • Linked issue still asks for a --no-sync-inference escape hatch (src/lib/actions/sandbox/connect.ts:94): Issue [macOS][CLI&UX] nemoclaw <name> silently runs connect and reverts gateway inference #3726 gives two acceptable directions. This PR keeps implicit connect and makes the route override loud, but the same acceptance clause also says there should be a --no-sync-inference flag. Connect and --probe-only still re-point the shared gateway route to the recorded sandbox route with no user escape hatch.
    • Recommendation: Add --no-sync-inference to SandboxConnectOptions, SANDBOX_CONNECT_FLAGS, help text, argument parsing, and the ensureSandboxInferenceRoute call path so users can connect or probe without re-pointing the shared gateway. If maintainers intentionally narrow the issue scope, capture that explicitly before treating [macOS][CLI&UX] nemoclaw <name> silently runs connect and reverts gateway inference #3726 as fixed.
    • Evidence: Linked issue clause: "If running `connect` is intentional, the connect step must NOT silently revert gateway-level inference settings. At minimum the override must be loud (banner / warning), and there should be a `--no-sync-inference` flag." Current flags are --dangerously-skip-permissions, --probe-only, --help, and -h; SandboxConnectOptions only has probeOnly; ensureSandboxInferenceRoute still calls runOpenshell(buildInferenceSetArgs(sb.provider, sb.model)) for divergence, including the new probe-only test.
  • Route update still has split-brain failure paths before and after the registry write (src/lib/actions/inference-set.ts:371): Updating the registry before writeSandboxConfig fixes the stale-registry path for in-sandbox write/hash failures, but OpenShell is still changed before updateSandbox. If updateSandbox fails, the gateway route is already new while the registry remains old. If readSandboxConfig or config patching throws after the registry update, gateway and registry are new while the in-sandbox config remains old, without the degraded guidance or audit path added for write/hash failures.
    • Recommendation: Add targeted negative tests for updateSandbox failure after OpenShell succeeds and for readSandboxConfig/patch failures after the registry update. Either compensate or roll back where feasible, or explicitly surface the degraded state and recovery action before throwing.
    • Evidence: runInferenceSet calls deps.runOpenshell(...), then deps.updateSandbox(...), then deps.readSandboxConfig(...). The new degraded try/catch starts only around writeSandboxConfig and recomputeSandboxConfigHash.
  • Large-file hotspots grew further: Several already-large changed files grew beyond the repository's monolith threshold, increasing maintenance risk in hot inference/connect surfaces.
    • Recommendation: Move the new route-reconciliation and registry-before-sync tests into smaller focused modules, or offset the growth by extracting shared setup/helpers. Consider extracting the new connect reconciliation display logic out of connect.ts.
    • Evidence: Deterministic monolith deltas: src/lib/actions/inference-set.test.ts grew +77 lines, src/lib/inference/config.test.ts grew +68 lines, src/lib/actions/inference-set.ts grew +40 lines, and src/lib/actions/sandbox/connect.ts grew +23 lines.

🔎 Worth checking

  • Source-of-truth review needed: runInferenceSet best-effort in-sandbox config sync: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: src/lib/actions/inference-set.ts catches writeError/hashError, warns, appends an incomplete-sync audit reason, and returns inSandboxConfigSynced.
  • Source-of-truth review needed: connect route divergence reconciliation: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: ensureSandboxInferenceRoute warns on plan.kind === diverged and then calls runOpenshell(buildInferenceSetArgs(sb.provider, sb.model)).
  • Source-of-truth review needed: planInferenceRouteReconcile absent/partial live route repair: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: src/lib/inference/config.ts returns repair for !live, !live.provider, or !live.model.
  • Source-of-truth review needed: sanitizeRouteValueForDisplay for route diagnostics: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: sanitizeRouteValueForDisplay removes [\u0000-\u001f\u007f-\u009f], and connect.ts uses the sanitized values in a printed command.
  • Validate or quote untrusted route values in the divergence warning command (src/lib/actions/sandbox/connect.ts:604): The warning strips control characters, which reduces terminal escape injection, and actual OpenShell execution uses argv arrays. However, the suggested copy-paste command still interpolates live gateway values and the sandbox name without allowlist validation or shell quoting. Malformed gateway output or legacy registry data with spaces, quotes, shell metacharacters, or command substitutions can produce misleading or unsafe command guidance.
    • Recommendation: Validate live and recorded route values before treating them as usable divergence, and render the suggested command with shell-quoted arguments or omit the command when values fail the provider/model/sandbox allowlists. Add negative tests with malformed gateway output and malformed legacy registry values.
    • Evidence: sanitizeRouteValueForDisplay only removes control characters. The warning prints `${CLI_NAME} inference set --provider ${liveProvider} --model ${liveModel} --sandbox ${sandboxName}`. runInferenceSet validates new model IDs, but connect reads historical SandboxEntry values and parsed gateway output here.
  • Avoid logging raw lower-layer sync errors if they can include sensitive config details (src/lib/actions/inference-set.ts:397): The degraded write/hash paths print lower-layer error messages verbatim. The changed code does not directly add secrets, but this is a credentials/inference path and lower-layer sandbox config errors should not be allowed to leak secret-bearing config snippets, command stderr, or sensitive paths into CLI logs.
    • Recommendation: Confirm writeSandboxConfig and recomputeSandboxConfigHash errors never include secret values, or redact/summarize these details before logging. Add a negative test if the lower layer can include config content in error messages.
    • Evidence: The hash and write catch blocks derive `detail` from `hashError.message` / `writeError.message` and log it directly in the warning.
  • Route reconciliation still needs a complete source-of-truth contract (src/lib/actions/sandbox/connect.ts:575): The PR makes gateway/registry divergence loud, but connect still mutates the shared gateway route based on host registry state. The code comments do not fully define which layer is authoritative, why connect is the right repair boundary instead of preventing divergence at write time, or when this workaround can be removed.
    • Recommendation: Document the gateway-vs-registry source-of-truth boundary near ensureSandboxInferenceRoute/planInferenceRouteReconcile, including the invalid state, source-fix constraint, recovery behavior, removal condition, and --no-sync-inference semantics.
    • Evidence: planInferenceRouteReconcile returns diverged; ensureSandboxInferenceRoute warns and then calls runOpenshell(buildInferenceSetArgs(sb.provider, sb.model)). Comments say the shared gateway re-point must be loud, but do not define the longer-term source boundary or removal condition.
  • Best-effort in-sandbox sync still lacks source-fix and removal criteria (src/lib/actions/inference-set.ts:388): The PR now separates config-write and hash-recompute failures and avoids false synced reporting. The localized best-effort behavior still needs a fuller source-of-truth explanation: why the crash-prone in-sandbox layer cannot be fixed in this PR and when this degraded fallback should be removed.
    • Recommendation: Expand the comment or nearby docs to state the source boundary, why the sandbox write/hash source cannot be fixed here, what runtime condition or upstream fix removes the workaround, and how rebuild restores each degraded state.
    • Evidence: The comment identifies two degraded states and says rebuild fixes them. Tests cover write failure and hash failure, but no removal condition or source-fix constraint is documented.
  • Partial gateway-route repair needs clearer source-boundary documentation (src/lib/inference/config.ts:298): planInferenceRouteReconcile treats absent or partial live gateway routes as repair instead of divergence. That may be the right defensive behavior, but the code does not document why partial OpenShell output can occur, why connect should repair it, or whether this is permanent defensive handling versus a workaround.
    • Recommendation: Document the source boundary for null/provider-only/model-only gateway output and whether the repair behavior is intended as permanent defensive classification. If it is workaround behavior, include the source-fix constraint and removal condition.
    • Evidence: planInferenceRouteReconcile returns { kind: "repair" } for !live, !live.provider, or !live.model. Tests cover null, provider-only, and model-only repair cases.
  • Runtime validation is still recommended for gateway and sandbox route behavior: The added unit and fixture tests cover important branches, but these changes are in runtime/sandbox/infrastructure paths involving OpenShell, gateway route state, sandbox config writes, and integrity hashes. Mocked tests cannot fully validate real driver behavior or timing around route mutation and degraded states.
    • Recommendation: Add or identify targeted runtime/integration validation for route divergence, probe-only behavior, write/hash failures, registry/read failure paths, and malformed route diagnostics. Do not rely only on external E2E status surfaces.
    • Evidence: Deterministic test-depth context recommends runtime validation for src/lib/actions/inference-set.ts, src/lib/actions/sandbox/connect.ts, and src/lib/inference/config.ts.

🌱 Nice ideas

  • None.
Since last review details

Current findings:

  • Source-of-truth review needed: runInferenceSet best-effort in-sandbox config sync: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: src/lib/actions/inference-set.ts catches writeError/hashError, warns, appends an incomplete-sync audit reason, and returns inSandboxConfigSynced.
  • Source-of-truth review needed: connect route divergence reconciliation: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: ensureSandboxInferenceRoute warns on plan.kind === diverged and then calls runOpenshell(buildInferenceSetArgs(sb.provider, sb.model)).
  • Source-of-truth review needed: planInferenceRouteReconcile absent/partial live route repair: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: src/lib/inference/config.ts returns repair for !live, !live.provider, or !live.model.
  • Source-of-truth review needed: sanitizeRouteValueForDisplay for route diagnostics: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: sanitizeRouteValueForDisplay removes [\u0000-\u001f\u007f-\u009f], and connect.ts uses the sanitized values in a printed command.
  • Linked issue still asks for a --no-sync-inference escape hatch (src/lib/actions/sandbox/connect.ts:94): Issue [macOS][CLI&UX] nemoclaw <name> silently runs connect and reverts gateway inference #3726 gives two acceptable directions. This PR keeps implicit connect and makes the route override loud, but the same acceptance clause also says there should be a --no-sync-inference flag. Connect and --probe-only still re-point the shared gateway route to the recorded sandbox route with no user escape hatch.
    • Recommendation: Add --no-sync-inference to SandboxConnectOptions, SANDBOX_CONNECT_FLAGS, help text, argument parsing, and the ensureSandboxInferenceRoute call path so users can connect or probe without re-pointing the shared gateway. If maintainers intentionally narrow the issue scope, capture that explicitly before treating [macOS][CLI&UX] nemoclaw <name> silently runs connect and reverts gateway inference #3726 as fixed.
    • Evidence: Linked issue clause: "If running `connect` is intentional, the connect step must NOT silently revert gateway-level inference settings. At minimum the override must be loud (banner / warning), and there should be a `--no-sync-inference` flag." Current flags are --dangerously-skip-permissions, --probe-only, --help, and -h; SandboxConnectOptions only has probeOnly; ensureSandboxInferenceRoute still calls runOpenshell(buildInferenceSetArgs(sb.provider, sb.model)) for divergence, including the new probe-only test.
  • Route update still has split-brain failure paths before and after the registry write (src/lib/actions/inference-set.ts:371): Updating the registry before writeSandboxConfig fixes the stale-registry path for in-sandbox write/hash failures, but OpenShell is still changed before updateSandbox. If updateSandbox fails, the gateway route is already new while the registry remains old. If readSandboxConfig or config patching throws after the registry update, gateway and registry are new while the in-sandbox config remains old, without the degraded guidance or audit path added for write/hash failures.
    • Recommendation: Add targeted negative tests for updateSandbox failure after OpenShell succeeds and for readSandboxConfig/patch failures after the registry update. Either compensate or roll back where feasible, or explicitly surface the degraded state and recovery action before throwing.
    • Evidence: runInferenceSet calls deps.runOpenshell(...), then deps.updateSandbox(...), then deps.readSandboxConfig(...). The new degraded try/catch starts only around writeSandboxConfig and recomputeSandboxConfigHash.
  • Large-file hotspots grew further: Several already-large changed files grew beyond the repository's monolith threshold, increasing maintenance risk in hot inference/connect surfaces.
    • Recommendation: Move the new route-reconciliation and registry-before-sync tests into smaller focused modules, or offset the growth by extracting shared setup/helpers. Consider extracting the new connect reconciliation display logic out of connect.ts.
    • Evidence: Deterministic monolith deltas: src/lib/actions/inference-set.test.ts grew +77 lines, src/lib/inference/config.test.ts grew +68 lines, src/lib/actions/inference-set.ts grew +40 lines, and src/lib/actions/sandbox/connect.ts grew +23 lines.
  • Validate or quote untrusted route values in the divergence warning command (src/lib/actions/sandbox/connect.ts:604): The warning strips control characters, which reduces terminal escape injection, and actual OpenShell execution uses argv arrays. However, the suggested copy-paste command still interpolates live gateway values and the sandbox name without allowlist validation or shell quoting. Malformed gateway output or legacy registry data with spaces, quotes, shell metacharacters, or command substitutions can produce misleading or unsafe command guidance.
    • Recommendation: Validate live and recorded route values before treating them as usable divergence, and render the suggested command with shell-quoted arguments or omit the command when values fail the provider/model/sandbox allowlists. Add negative tests with malformed gateway output and malformed legacy registry values.
    • Evidence: sanitizeRouteValueForDisplay only removes control characters. The warning prints `${CLI_NAME} inference set --provider ${liveProvider} --model ${liveModel} --sandbox ${sandboxName}`. runInferenceSet validates new model IDs, but connect reads historical SandboxEntry values and parsed gateway output here.
  • Avoid logging raw lower-layer sync errors if they can include sensitive config details (src/lib/actions/inference-set.ts:397): The degraded write/hash paths print lower-layer error messages verbatim. The changed code does not directly add secrets, but this is a credentials/inference path and lower-layer sandbox config errors should not be allowed to leak secret-bearing config snippets, command stderr, or sensitive paths into CLI logs.
    • Recommendation: Confirm writeSandboxConfig and recomputeSandboxConfigHash errors never include secret values, or redact/summarize these details before logging. Add a negative test if the lower layer can include config content in error messages.
    • Evidence: The hash and write catch blocks derive `detail` from `hashError.message` / `writeError.message` and log it directly in the warning.
  • Route reconciliation still needs a complete source-of-truth contract (src/lib/actions/sandbox/connect.ts:575): The PR makes gateway/registry divergence loud, but connect still mutates the shared gateway route based on host registry state. The code comments do not fully define which layer is authoritative, why connect is the right repair boundary instead of preventing divergence at write time, or when this workaround can be removed.
    • Recommendation: Document the gateway-vs-registry source-of-truth boundary near ensureSandboxInferenceRoute/planInferenceRouteReconcile, including the invalid state, source-fix constraint, recovery behavior, removal condition, and --no-sync-inference semantics.
    • Evidence: planInferenceRouteReconcile returns diverged; ensureSandboxInferenceRoute warns and then calls runOpenshell(buildInferenceSetArgs(sb.provider, sb.model)). Comments say the shared gateway re-point must be loud, but do not define the longer-term source boundary or removal condition.
  • Best-effort in-sandbox sync still lacks source-fix and removal criteria (src/lib/actions/inference-set.ts:388): The PR now separates config-write and hash-recompute failures and avoids false synced reporting. The localized best-effort behavior still needs a fuller source-of-truth explanation: why the crash-prone in-sandbox layer cannot be fixed in this PR and when this degraded fallback should be removed.
    • Recommendation: Expand the comment or nearby docs to state the source boundary, why the sandbox write/hash source cannot be fixed here, what runtime condition or upstream fix removes the workaround, and how rebuild restores each degraded state.
    • Evidence: The comment identifies two degraded states and says rebuild fixes them. Tests cover write failure and hash failure, but no removal condition or source-fix constraint is documented.
  • Partial gateway-route repair needs clearer source-boundary documentation (src/lib/inference/config.ts:298): planInferenceRouteReconcile treats absent or partial live gateway routes as repair instead of divergence. That may be the right defensive behavior, but the code does not document why partial OpenShell output can occur, why connect should repair it, or whether this is permanent defensive handling versus a workaround.
    • Recommendation: Document the source boundary for null/provider-only/model-only gateway output and whether the repair behavior is intended as permanent defensive classification. If it is workaround behavior, include the source-fix constraint and removal condition.
    • Evidence: planInferenceRouteReconcile returns { kind: "repair" } for !live, !live.provider, or !live.model. Tests cover null, provider-only, and model-only repair cases.
  • Runtime validation is still recommended for gateway and sandbox route behavior: The added unit and fixture tests cover important branches, but these changes are in runtime/sandbox/infrastructure paths involving OpenShell, gateway route state, sandbox config writes, and integrity hashes. Mocked tests cannot fully validate real driver behavior or timing around route mutation and degraded states.
    • Recommendation: Add or identify targeted runtime/integration validation for route divergence, probe-only behavior, write/hash failures, registry/read failure paths, and malformed route diagnostics. Do not rely only on external E2E status surfaces.
    • Evidence: Deterministic test-depth context recommends runtime validation for src/lib/actions/inference-set.ts, src/lib/actions/sandbox/connect.ts, and src/lib/inference/config.ts.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26750157399
Target ref: edc02140f107b61d1b81a3b2c0d1f5dafd96e132
Workflow ref: main
Requested jobs: openclaw-inference-switch-e2e,hermes-inference-switch-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job Result
hermes-inference-switch-e2e ✅ success
openclaw-inference-switch-e2e ✅ success

@hunglp6d hunglp6d self-assigned this Jun 1, 2026
@hunglp6d hunglp6d added E2E VRDC Issues and PRs submitted by NVIDIA VRDC test team. v0.0.56 Release target labels Jun 1, 2026
@cv

cv commented Jun 1, 2026

Copy link
Copy Markdown
Collaborator

@hunglp6d can you address the findings in #4611 (comment) please?

@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26766842602
Target ref: a84d7df819232d6f6ad3c192d5d71067367e6ca7
Workflow ref: main
Requested jobs: openclaw-inference-switch-e2e,hermes-inference-switch-e2e,inference-routing-e2e,sandbox-operations-e2e
Summary: 4 passed, 0 failed, 0 skipped

Job Result
hermes-inference-switch-e2e ✅ success
inference-routing-e2e ✅ success
openclaw-inference-switch-e2e ✅ success
sandbox-operations-e2e ✅ success

@hunglp6d hunglp6d removed the v0.0.56 Release target label Jun 1, 2026
@hunglp6d

hunglp6d commented Jun 1, 2026

Copy link
Copy Markdown
Contributor Author

@cv — one of the advisor's "--no-sync-inference" findings:
This PR removes the #3726 silent revert (divergent routes now warn loudly). The advisor's suggestion --no-sync-inference flag (that is suggested in the issue) would let nemoclaw <name> connect skip re-pointing the shared gateway — should we add it ?

@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26806392358
Target ref: c52e86c50e0e9419159cba39d8e56e5da05ec9a6
Workflow ref: main
Requested jobs: openclaw-inference-switch-e2e,hermes-inference-switch-e2e,inference-routing-e2e,sandbox-operations-e2e
Summary: 4 passed, 0 failed, 0 skipped

Job Result
hermes-inference-switch-e2e ✅ success
inference-routing-e2e ✅ success
openclaw-inference-switch-e2e ✅ success
sandbox-operations-e2e ✅ success

@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26816819955
Target ref: bde3dc026bf383e94e02e8d8722ec21bd19a5708
Workflow ref: main
Requested jobs: openclaw-inference-switch-e2e,hermes-inference-switch-e2e,inference-routing-e2e,sandbox-operations-e2e
Summary: 4 passed, 0 failed, 0 skipped

Job Result
hermes-inference-switch-e2e ✅ success
inference-routing-e2e ✅ success
openclaw-inference-switch-e2e ✅ success
sandbox-operations-e2e ✅ success

@hunglp6d

hunglp6d commented Jun 2, 2026

Copy link
Copy Markdown
Contributor Author

@cv addressed the advisor findings:

  • partial gateway route now classifies as repair, not diverged (no more provider/null).
  • --probe-only divergence path now has a test.
  • inference set writes the registry before the in-sandbox sync and reports a degraded state (no false "synced") on sync failure.
  • divergence warning sanitizes route values before printing.
  • --no-sync-inference: leaving it out — the loud warning plus the supported inference set path is sufficient, and re-pointing the shared gateway is the correct behavior.

PR now is ready

@hunglp6d hunglp6d added the v0.0.57 Release target label Jun 2, 2026
@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26852944551
Target ref: 784d6a7e88bac7b75f4e0d7d0ad9897a8cd3d24d
Workflow ref: main
Requested jobs: openclaw-inference-switch-e2e,hermes-inference-switch-e2e,inference-routing-e2e,sandbox-operations-e2e
Summary: 4 passed, 0 failed, 0 skipped

Job Result
hermes-inference-switch-e2e ✅ success
inference-routing-e2e ✅ success
openclaw-inference-switch-e2e ✅ success
sandbox-operations-e2e ✅ success

@cv cv enabled auto-merge (squash) June 2, 2026 23:28
@cv cv merged commit f17a19a into main Jun 2, 2026
28 checks passed
@cv cv deleted the fix/connect-inference-route-revert branch June 2, 2026 23:33
@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26854232296
Target ref: bb732625ecb5a3251c4b5a3a72c39127b9003a3e
Workflow ref: main
Requested jobs: openclaw-inference-switch-e2e,hermes-inference-switch-e2e,inference-routing-e2e
Summary: 3 passed, 0 failed, 0 skipped

Job Result
hermes-inference-switch-e2e ✅ success
inference-routing-e2e ✅ success
openclaw-inference-switch-e2e ✅ success

@wscurran wscurran added the area: e2e End-to-end tests, nightly failures, or validation infrastructure label Jun 3, 2026
@wscurran wscurran removed the E2E label Jun 3, 2026
cv pushed a commit that referenced this pull request Jun 3, 2026
## Summary
- Add the missing `v0.0.57` release-notes section with links to the
detailed docs pages for command, inference, onboarding, messaging,
status, installer, and policy changes.
- Remove public references to docs-skip terms from source docs and
regenerate the NemoClaw user skills from the current Fern MDX docs.
- Carry forward generated references for the per-agent documentation
split, including Hermes-specific reference files.

## Source summary
- #4615 and #4653 -> `docs/about/release-notes.mdx`,
`docs/reference/commands.mdx`: Release notes now cover host-side
`sessions` and `agents` commands plus `NEMOCLAW_EXTRA_AGENTS_JSON`
secondary-agent baking.
- #4163, #4204, #4611, #4619, and #4676 ->
`docs/about/release-notes.mdx`,
`docs/inference/use-local-inference.mdx`: Release notes now cover
managed vLLM progress/readiness, DGX Spark model default changes, local
Ollama streaming usage, and inference route divergence warnings.
- #4267, #4601, #4609, #4642, #4645, and #4661 ->
`docs/about/release-notes.mdx`, `docs/reference/commands.mdx`: Release
notes now cover UFW auto-remediation, local-inference reachability
gates, gateway reuse/binding, cancel rollback, and policy selection
persistence.
- #4577, #4582, #4607, and #4660 -> `docs/about/release-notes.mdx`,
`docs/manage-sandboxes/messaging-channels.mdx`: Release notes now cover
Slack validation, atomic `channels add`, WhatsApp QR diagnostics, and
Slack placeholder normalization.
- #4388, #4600, #4646, and #4647 -> `docs/about/release-notes.mdx`,
`docs/reference/commands.mdx`: Release notes now cover status failure
layers, paused-container hints, Docker-driver doctor behavior, and
non-destructive stale-registry recovery.
- #4569, #4579, and #4678 -> `docs/about/release-notes.mdx`,
`docs/manage-sandboxes/lifecycle.mdx`,
`docs/network-policy/integration-policy-examples.mdx`: Release notes now
cover installer tag pinning, PyPI `uv` policy access, and observable
Jira validation.
- #4632 -> `.agents/skills/`: Regenerated user skills from the current
per-agent docs source, including newly generated Hermes reference files.

## Verification
- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix
nemoclaw-user --doc-platform fern-mdx`
- `rg "permissive mode|shields down|shields up|shields status|config
rotate-token|rotate-token" docs --glob "*.mdx"`
- `rg "permissive mode|shields down|shields up|shields status|config
rotate-token|rotate-token" .agents/skills --glob "*.md"`
- `npm run docs`
- `npm run build:cli`
- Commit hooks: markdownlint, docs-to-skills verification, gitleaks,
skills YAML, commitlint

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Documentation**
* Restructured documentation to clearly distinguish OpenClaw and Hermes
agent variants throughout user guides.
* Enhanced security, credential storage, and deployment guidance with
clearer setup flows.
  * Added Hermes plugin installation and ecosystem documentation.
* Improved workspace, messaging, and policy management references with
variant-specific command examples.
  * Refined troubleshooting and CLI reference sections for clarity.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
@wscurran wscurran added the bug-fix PR fixes a bug or regression label Jun 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: e2e End-to-end tests, nightly failures, or validation infrastructure bug-fix PR fixes a bug or regression v0.0.57 Release target VRDC Issues and PRs submitted by NVIDIA VRDC test team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[macOS][CLI&UX] nemoclaw <name> silently runs connect and reverts gateway inference

3 participants