fix(onboard): defer sandboxName to post-create to prevent phantom list by laitingsheng · Pull Request #2931 · NVIDIA/NemoClaw

laitingsheng · 2026-05-03T11:53:26Z

Summary

nemoclaw onboard wrote sandboxName into onboard-session.json at Step 3 (provider_selection), well before sandbox creation in Step 6. A SIGINT in between left a session entry pointing at a sandbox that was never created — and nemoclaw list then surfaced the phantom name both as a "recovered" registry entry and as the empty-state "last onboarded sandbox" hint, persisting until the user manually destroyed it.

This PR addresses the bug at the root: the session only records sandboxName after openshell sandbox create has actually returned. Three layered guards make the surface symptom-free without taking any destructive action against existing registry state:

Root-cause fix — strip sandboxName from every startRecordedStep / markStepComplete call before the sandbox step's markStepComplete (which runs after createSandbox() returns). Pre-create writes can no longer create phantoms.
Seed-time guard — seedRecoveryMetadata and getSandboxInventory skip session-derived surfaces when session.steps.sandbox.status !== "complete". Pre-fix on-disk session files no longer get re-seeded into the registry, and the empty-state hint suppresses them too.
Resume guard — non-interactive --resume exits early with a clear error when no name can be recovered from --name / NEMOCLAW_SANDBOX_NAME / a complete prior session. Prevents silently defaulting to the agent's my-assistant.

Related Issue

Fixes #2753

Changes

src/lib/onboard.ts:
- Strip sandboxName from startRecordedStep("provider_selection"), markStepComplete("provider_selection", …), the resume-branch markStepComplete, startRecordedStep("inference", …), both markStepComplete("inference", …) calls, and startRecordedStep("sandbox", …). The first persistence of sandboxName is now markStepComplete("sandbox", …), which runs after createSandbox() succeeds.
- Add a non-interactive resume guard right after the resume init block (before preflight side effects). Errors with Cannot resume non-interactive onboard: the previous run was interrupted before sandbox creation completed, so no sandbox name was recorded. Re-run with --name <sandbox> (or set NEMOCLAW_SANDBOX_NAME). when the session shows the sandbox step did not complete and no name was provided through --name or NEMOCLAW_SANDBOX_NAME.
src/lib/registry-recovery-action.ts: add isSessionSandboxConfirmed(session) predicate that requires session.steps.sandbox.status === "complete". seedRecoveryMetadata early-returns when not confirmed, so nemoclaw list cannot resurrect a phantom by reading a stale session file.
src/lib/inventory-commands.ts: extend the loadLastSession dep type to include steps, then filter lastOnboardedSandbox on steps.sandbox.status === "complete". The empty-state hint stops resurrecting phantom names.
test/onboard.test.ts: update the existing source-level regex to match the new { provider, model } options shape on the sandbox step's startRecordedStep. New does not persist sandboxName … (#2753) test pins the contract: no startRecordedStep call before sandbox passes sandboxName, no markStepComplete records sandboxName until after createSandbox().
test/cli.test.ts: new #2753: refuses non-interactive --resume when sandbox step never completed and no name is provided test exercising the resume guard end-to-end with a stale session fixture.
src/lib/inventory-commands.test.ts: existing fixtures updated to include steps: { sandbox: { status: "complete" } } (preserves the original "alpha was successfully onboarded previously" intent). New #2753: suppresses last-onboarded hint when sandbox step never completed test asserts the phantom name does not surface in the empty-state hint.
src/lib/registry-recovery-action.test.ts (new): four tests for the seed-time guard — phantom skipped, completed session seeded, empty state, and a defensive "registered alpha + stale session ≠ eviction" case to lock in that this PR does not actively mutate the registry.

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

(No user-facing docs needed for this fix — the user-visible behaviour is "phantom no longer appears", which is the absence of the bug.)

Migration note for existing users

This PR does not actively evict already-persisted phantoms from sandboxes.json — the false-positive risk against real registry entries that share a name with a stale session is too high to justify destructive cleanup. Anyone who already hit the bug on a prior NemoClaw release and has the phantom registered locally can clean it up with one command:

nemoclaw destroy <phantom-name>

Verification

npx prek run --all-files passes (modulo the pre-existing test/secret-redaction.test.ts > debug.sh sed fallback includes essential prefixes > redacts essential token prefixes when node is unavailable failure that exists on main unchanged)
npm test passes — 2576 CLI tests pass with the same single pre-existing failure
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
make docs builds without warnings (doc changes only)
Doc pages follow the style guide (doc changes only)
New doc pages include SPDX header and frontmatter (new pages only)

AI Disclosure

AI-assisted — tools: Claude Code, OpenAI Codex

Signed-off-by: Tinson Lai tinsonl@nvidia.com

Summary by CodeRabbit

Bug Fixes
- Recovery and resume logic now only trusts sandboxes from fully completed onboarding sessions; incomplete sessions are ignored.
- Non-interactive onboarding resume now fails fast when the prior session didn’t finish sandbox onboarding instead of silently using a saved name.
- Session persistence no longer records sandbox identifiers until sandbox creation completes.
Tests
- Added regression and unit tests covering resume guards, registry recovery, and session persistence behaviors.

#2753) The wizard wrote `sandboxName` to onboard-session.json at Step 3 (provider_selection), well before sandbox creation in Step 6. A SIGINT in between left a session entry pointing at a sandbox that was never created; subsequent `nemoclaw list` calls then surfaced the phantom name both as a recovered registry entry and as the empty-state hint. Defer the write until after `createSandbox()` returns, so the session only ever records names backed by a real gateway sandbox. Add a seed-time guard so pre-fix on-disk sessions can't be revived, a view-time filter so the empty-state hint won't surface them either, and a non-interactive `--resume` guard so a missing recorded name fails loudly instead of silently defaulting to `my-assistant`. Pre-existing phantom entries already in `sandboxes.json` are not actively evicted (avoids false positives against real registry entries sharing a name with a stale session); the workaround is a one-time `nemoclaw destroy <name>`. Fixes #2753 Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

coderabbitai · 2026-05-03T11:53:38Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: fdb94675-5704-412b-a002-919f229749b6

📥 Commits

Reviewing files that changed from the base of the PR and between bc4c510 and 0c1ec81.

📒 Files selected for processing (2)

src/lib/onboard.ts
test/onboard.test.ts

📝 Walkthrough

Walkthrough

This change makes a recorded session's sandbox name count as authoritative only when the session's sandbox step status is "complete". It updates session shapes, onboarding persistence, inventory and registry recovery logic, CLI resume guards, and adds tests covering incomplete-session behavior and persistence ordering.

Changes

Sandbox confirmation & resume-safety

Layer / File(s)	Summary
Data Shape `src/lib/inventory-commands.ts`	`ListSandboxesCommandDeps.loadLastSession` return type expanded to optionally include `steps?.sandbox?.status` alongside `sandboxName`.
Session Validation Helper `src/lib/registry-recovery-action.ts`	Adds `isSessionSandboxConfirmed(session)` which requires `session.sandboxName` and `session.steps?.sandbox?.status === "complete"`.
Registry Seeding Guard `src/lib/registry-recovery-action.ts`	`seedRecoveryMetadata()` now skips seeding session-derived sandbox metadata unless `isSessionSandboxConfirmed()` is true.
Inventory Listing Logic `src/lib/inventory-commands.ts`	`getSandboxInventory()` computes `lastOnboardedSandbox` only when the session sandbox step status is `"complete"`, otherwise returns `null`.
Onboard Step Bookkeeping `src/lib/onboard.ts`	`startRecordedStep(...)` calls for `provider_selection`, `inference`, and `sandbox` no longer persist `sandboxName`; `toSessionUpdates` for inference drops `sandboxName`. Resume handling refuses non-interactive resume if no confirmed sandbox name is available; recovered `recordedSandboxName` is used only when the sandbox step is `"complete"`.
Tests & Regression Coverage `src/lib/inventory-commands.test.ts`, `src/lib/registry-recovery-action.test.ts`, `test/cli.test.ts`, `test/onboard.test.ts`	Updated mocks to include `steps.sandbox.status: "complete"` where appropriate. Added regression tests (`#2753`) asserting: inventory listing suppresses last-sandbox hint for incomplete steps; registry recovery seed-time guard behavior (multiple scenarios); CLI rejects `--resume --non-interactive` when sandbox step incomplete and no valid name provided (including whitespace-only); and onboard persistence ordering ensures `sandboxName` is not written before `createSandbox()` completes.

sequenceDiagram
  participant CLI as CLI
  participant Onboard as Onboard
  participant SessionStore as SessionStore
  participant Registry as Registry
  CLI->>Onboard: run --resume --non-interactive
  Onboard->>SessionStore: loadLastSession()
  Note over Onboard,SessionStore: examine session.steps.sandbox.status
  alt sandbox step === "complete"
    Onboard->>Registry: accept session.sandboxName (seed/recover)
    Onboard->>CLI: continue resume flow
  else sandbox step !== "complete"
    Onboard->>CLI: refuse non-interactive resume (error)
    Onboard->>Registry: do not seed session sandbox
  end

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Poem

🐰 I hopped through lines of session lore,
No name before the sandbox door.
Only after creation's chime,
The name will dwell in honest time. 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'fix(onboard): defer sandboxName to post-create to prevent phantom list' directly addresses the main change: deferring sandboxName persistence to after sandbox creation to prevent phantom entries from appearing in the sandbox list.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/2753-onboard-session-phantom

_{Review rate limit: 9/10 reviews remaining, refill in 6 minutes.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

test/onboard.test.ts (1)
2969-3015: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Exercise the persisted session file instead of regex-matching onboard.ts.

These checks only lock in source layout. They never simulate an interrupted onboard or inspect onboard-session.json, so the phantom-entry regression can reappear through a helper/session-layer change while this test still passes. Please drive the flow with a temp HOME, stop before createSandbox() completes, and assert the saved session omits sandboxName.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/onboard.test.ts` around lines 2969 - 3015, Replace the brittle regex
assertions in the test with an integration-style check that runs the onboard
flow using a temporary HOME and inspects the persisted onboard-session.json: set
up a tmp HOME dir, stub or intercept createSandbox (the sandboxName = await
createSandbox(...) call in onboard.ts) so the flow reaches the point just before
createSandbox completes (e.g. by returning a deferred Promise or mocking
createSandbox to pause), run the onboarding steps (so startRecordedStep /
onboardSession updates execute), write out the session file, then read the saved
onboard-session.json and assert it does NOT contain sandboxName; ensure you
still verify earlier steps ran (startRecordedStep("provider_selection"),
startRecordedStep("inference"), startRecordedStep("sandbox")) by observing the
persisted session state rather than matching source text and clean up the temp
HOME after the test.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 8489-8510: Update the resume guard to use the resolved
non-promptable sandbox name instead of raw inputs: compute a single resolvedName
(prefer session?.sandboxName, then requestedSandboxName, then
process.env.NEMOCLAW_SANDBOX_NAME trimmed and only if non-empty, and any
existing non-promptable name state) and then require resolvedName to exist when
blocking resumes for non-interactive runs; replace the current checks that
reference process.env.NEMOCLAW_SANDBOX_NAME and requestedSandboxName directly,
keep the sandboxStepCompleted and isNonInteractive() conditions but test against
resolvedName (trim and ignore whitespace-only env values) so --resume on non-TTY
without explicit --non-interactive is handled correctly.

---

Outside diff comments:
In `@test/onboard.test.ts`:
- Around line 2969-3015: Replace the brittle regex assertions in the test with
an integration-style check that runs the onboard flow using a temporary HOME and
inspects the persisted onboard-session.json: set up a tmp HOME dir, stub or
intercept createSandbox (the sandboxName = await createSandbox(...) call in
onboard.ts) so the flow reaches the point just before createSandbox completes
(e.g. by returning a deferred Promise or mocking createSandbox to pause), run
the onboarding steps (so startRecordedStep / onboardSession updates execute),
write out the session file, then read the saved onboard-session.json and assert
it does NOT contain sandboxName; ensure you still verify earlier steps ran
(startRecordedStep("provider_selection"), startRecordedStep("inference"),
startRecordedStep("sandbox")) by observing the persisted session state rather
than matching source text and clean up the temp HOME after the test.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a681a11b-f543-40a5-b7df-686e480d77a2

📥 Commits

Reviewing files that changed from the base of the PR and between 325d29e and 36b05e2.

📒 Files selected for processing (7)

src/lib/inventory-commands.test.ts
src/lib/inventory-commands.ts
src/lib/onboard.ts
src/lib/registry-recovery-action.test.ts
src/lib/registry-recovery-action.ts
test/cli.test.ts
test/onboard.test.ts

… values CodeRabbit follow-up: replace `isNonInteractive()` with `cannotPrompt` so non-TTY runs without explicit `--non-interactive` are caught by the intent expressed locally rather than by an upstream usage-notice guard, and replace the raw `NEMOCLAW_SANDBOX_NAME` check with the already-trimmed `requestedSandboxName`. Whitespace-only env values no longer slip through. Add a regression test for the whitespace case. Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 8489-8508: The code still lets a stale incomplete
session.sandboxName leak into resume logic; update all resume-path uses to
prefer the validated recoveredSandboxName instead of raw session.sandboxName:
change the getResumeConfigConflicts(...) call to pass recoveredSandboxName (or
compute its equivalent) so conflict checks use the ignored-incomplete value, and
ensure the later assignment that sets sandboxName uses recoveredSandboxName
first (e.g., let sandboxName = recoveredSandboxName || requestedSandboxName ||
null) so an incomplete session.sandboxName can no longer override an explicit
--name / NEMOCLAW_SANDBOX_NAME.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 19f13664-5094-4cf8-ac0d-2d098a5f51d8

📥 Commits

Reviewing files that changed from the base of the PR and between 36b05e2 and bc4c510.

📒 Files selected for processing (2)

src/lib/onboard.ts
test/cli.test.ts

🚧 Files skipped from review as they are similar to previous changes (1)

test/cli.test.ts

…e guard CodeRabbit follow-up: a stale `session.sandboxName` from a pre-fix interrupted onboard could still leak into resume flow in two places: - `getResumeSandboxConflict` would falsely block `--resume --name <new>` with "session belongs to <stale>". - The post-init `let sandboxName = session?.sandboxName || requestedSandboxName` would silently override an explicit `--name` / NEMOCLAW_SANDBOX_NAME with the stale recorded value. Both now treat `session.sandboxName` as a recovery source only when the sandbox step completed. Users hit by the original bug can now reliably recover with an explicit name. Update existing fixtures to include `steps.sandbox.status: "complete"` where the test's intent was a real completed onboard, and add a #2753 regression covering the incomplete-session case. Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

ericksoa

Approved. I reviewed the current head 0c1ec81 against the PR base and the fix looks correct for #2753.

The important behavior is covered at all three layers: onboard no longer persists sandboxName before createSandbox() succeeds, registry recovery only seeds session-derived inventory when the sandbox step completed, and list output suppresses incomplete-session names so an interrupted onboard cannot surface a phantom sandbox. The follow-up commits also handle the main adversarial resume cases: non-promptable resume without a recoverable name fails loudly, whitespace-only env values do not bypass that guard, and stale incomplete-session sandboxName no longer blocks or overrides an explicit recovery --name / NEMOCLAW_SANDBOX_NAME.

I also checked the CodeRabbit threads and they are resolved. Local validation passed with npm ci --ignore-scripts, npm run build:cli, npm run typecheck:cli, git diff --check, and the focused regression suite: npx vitest run src/lib/registry-recovery-action.test.ts src/lib/inventory-commands.test.ts test/onboard.test.ts test/cli.test.ts (319 tests). GitHub checks are green, including checks, macos-e2e, wsl-e2e, sandbox image builds, and the self-hosted e2e jobs.

I do not think a full nightly is necessary for this PR; the remaining rough edge I noticed is only that incomplete-session sandboxName can still cause an unnecessary gateway recovery probe, but it does not re-register or display the phantom and is not a merge blocker.

laitingsheng changed the title ~~fix(onboard): defer sandboxName to post-create to prevent phantom list (#2753)~~ fix(onboard): defer sandboxName to post-create to prevent phantom list May 3, 2026

coderabbitai Bot reviewed May 3, 2026

View reviewed changes

Comment thread src/lib/onboard.ts Outdated

coderabbitai Bot reviewed May 3, 2026

View reviewed changes

Comment thread src/lib/onboard.ts

ericksoa approved these changes May 3, 2026

View reviewed changes

ericksoa merged commit aed5064 into main May 3, 2026
18 checks passed

ericksoa deleted the fix/2753-onboard-session-phantom branch May 3, 2026 13:42

laitingsheng mentioned this pull request May 4, 2026

[Station][Recovery] failed onboard (exit 1) registers ghost sandbox via session recovery — list/status return data, connect says does not exist #2756

Closed

wscurran added the bug-fix PR fixes a bug or regression label Jun 8, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(onboard): defer sandboxName to post-create to prevent phantom list#2931

fix(onboard): defer sandboxName to post-create to prevent phantom list#2931
ericksoa merged 3 commits into
mainfrom
fix/2753-onboard-session-phantom

laitingsheng commented May 3, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 3, 2026 •

edited

Loading

Walkthrough

Changes

Estimated Code Review Effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

ericksoa left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

laitingsheng commented May 3, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Related Issue

Changes

Type of Change

Migration note for existing users

Verification

AI Disclosure

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated Code Review Effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ericksoa left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

laitingsheng commented May 3, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 3, 2026 •

edited

Loading