Skip to content

fix(onboard): defer sandboxName to post-create to prevent phantom list#2931

Merged
ericksoa merged 3 commits into
mainfrom
fix/2753-onboard-session-phantom
May 3, 2026
Merged

fix(onboard): defer sandboxName to post-create to prevent phantom list#2931
ericksoa merged 3 commits into
mainfrom
fix/2753-onboard-session-phantom

Conversation

@laitingsheng

@laitingsheng laitingsheng commented May 3, 2026

Copy link
Copy Markdown
Contributor

Summary

nemoclaw onboard wrote sandboxName into onboard-session.json at Step 3 (provider_selection), well before sandbox creation in Step 6. A SIGINT in between left a session entry pointing at a sandbox that was never created — and nemoclaw list then surfaced the phantom name both as a "recovered" registry entry and as the empty-state "last onboarded sandbox" hint, persisting until the user manually destroyed it.

This PR addresses the bug at the root: the session only records sandboxName after openshell sandbox create has actually returned. Three layered guards make the surface symptom-free without taking any destructive action against existing registry state:

  1. Root-cause fix — strip sandboxName from every startRecordedStep / markStepComplete call before the sandbox step's markStepComplete (which runs after createSandbox() returns). Pre-create writes can no longer create phantoms.
  2. Seed-time guardseedRecoveryMetadata and getSandboxInventory skip session-derived surfaces when session.steps.sandbox.status !== "complete". Pre-fix on-disk session files no longer get re-seeded into the registry, and the empty-state hint suppresses them too.
  3. Resume guard — non-interactive --resume exits early with a clear error when no name can be recovered from --name / NEMOCLAW_SANDBOX_NAME / a complete prior session. Prevents silently defaulting to the agent's my-assistant.

Related Issue

Fixes #2753

Changes

  • src/lib/onboard.ts:

    • Strip sandboxName from startRecordedStep("provider_selection"), markStepComplete("provider_selection", …), the resume-branch markStepComplete, startRecordedStep("inference", …), both markStepComplete("inference", …) calls, and startRecordedStep("sandbox", …). The first persistence of sandboxName is now markStepComplete("sandbox", …), which runs after createSandbox() succeeds.
    • Add a non-interactive resume guard right after the resume init block (before preflight side effects). Errors with Cannot resume non-interactive onboard: the previous run was interrupted before sandbox creation completed, so no sandbox name was recorded. Re-run with --name <sandbox> (or set NEMOCLAW_SANDBOX_NAME). when the session shows the sandbox step did not complete and no name was provided through --name or NEMOCLAW_SANDBOX_NAME.
  • src/lib/registry-recovery-action.ts: add isSessionSandboxConfirmed(session) predicate that requires session.steps.sandbox.status === "complete". seedRecoveryMetadata early-returns when not confirmed, so nemoclaw list cannot resurrect a phantom by reading a stale session file.

  • src/lib/inventory-commands.ts: extend the loadLastSession dep type to include steps, then filter lastOnboardedSandbox on steps.sandbox.status === "complete". The empty-state hint stops resurrecting phantom names.

  • test/onboard.test.ts: update the existing source-level regex to match the new { provider, model } options shape on the sandbox step's startRecordedStep. New does not persist sandboxName … (#2753) test pins the contract: no startRecordedStep call before sandbox passes sandboxName, no markStepComplete records sandboxName until after createSandbox().

  • test/cli.test.ts: new #2753: refuses non-interactive --resume when sandbox step never completed and no name is provided test exercising the resume guard end-to-end with a stale session fixture.

  • src/lib/inventory-commands.test.ts: existing fixtures updated to include steps: { sandbox: { status: "complete" } } (preserves the original "alpha was successfully onboarded previously" intent). New #2753: suppresses last-onboarded hint when sandbox step never completed test asserts the phantom name does not surface in the empty-state hint.

  • src/lib/registry-recovery-action.test.ts (new): four tests for the seed-time guard — phantom skipped, completed session seeded, empty state, and a defensive "registered alpha + stale session ≠ eviction" case to lock in that this PR does not actively mutate the registry.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

(No user-facing docs needed for this fix — the user-visible behaviour is "phantom no longer appears", which is the absence of the bug.)

Migration note for existing users

This PR does not actively evict already-persisted phantoms from sandboxes.json — the false-positive risk against real registry entries that share a name with a stale session is too high to justify destructive cleanup. Anyone who already hit the bug on a prior NemoClaw release and has the phantom registered locally can clean it up with one command:

nemoclaw destroy <phantom-name>

Verification

  • npx prek run --all-files passes (modulo the pre-existing test/secret-redaction.test.ts > debug.sh sed fallback includes essential prefixes > redacts essential token prefixes when node is unavailable failure that exists on main unchanged)
  • npm test passes — 2576 CLI tests pass with the same single pre-existing failure
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • make docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

AI Disclosure

  • AI-assisted — tools: Claude Code, OpenAI Codex

Signed-off-by: Tinson Lai tinsonl@nvidia.com

Summary by CodeRabbit

  • Bug Fixes

    • Recovery and resume logic now only trusts sandboxes from fully completed onboarding sessions; incomplete sessions are ignored.
    • Non-interactive onboarding resume now fails fast when the prior session didn’t finish sandbox onboarding instead of silently using a saved name.
    • Session persistence no longer records sandbox identifiers until sandbox creation completes.
  • Tests

    • Added regression and unit tests covering resume guards, registry recovery, and session persistence behaviors.

#2753)

The wizard wrote `sandboxName` to onboard-session.json at Step 3
(provider_selection), well before sandbox creation in Step 6. A SIGINT
in between left a session entry pointing at a sandbox that was never
created; subsequent `nemoclaw list` calls then surfaced the phantom name
both as a recovered registry entry and as the empty-state hint.

Defer the write until after `createSandbox()` returns, so the session
only ever records names backed by a real gateway sandbox. Add a
seed-time guard so pre-fix on-disk sessions can't be revived, a
view-time filter so the empty-state hint won't surface them either, and
a non-interactive `--resume` guard so a missing recorded name fails
loudly instead of silently defaulting to `my-assistant`.

Pre-existing phantom entries already in `sandboxes.json` are not
actively evicted (avoids false positives against real registry entries
sharing a name with a stale session); the workaround is a one-time
`nemoclaw destroy <name>`.

Fixes #2753

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
@coderabbitai

coderabbitai Bot commented May 3, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: fdb94675-5704-412b-a002-919f229749b6

📥 Commits

Reviewing files that changed from the base of the PR and between bc4c510 and 0c1ec81.

📒 Files selected for processing (2)
  • src/lib/onboard.ts
  • test/onboard.test.ts

📝 Walkthrough

Walkthrough

This change makes a recorded session's sandbox name count as authoritative only when the session's sandbox step status is "complete". It updates session shapes, onboarding persistence, inventory and registry recovery logic, CLI resume guards, and adds tests covering incomplete-session behavior and persistence ordering.

Changes

Sandbox confirmation & resume-safety

Layer / File(s) Summary
Data Shape
src/lib/inventory-commands.ts
ListSandboxesCommandDeps.loadLastSession return type expanded to optionally include steps?.sandbox?.status alongside sandboxName.
Session Validation Helper
src/lib/registry-recovery-action.ts
Adds isSessionSandboxConfirmed(session) which requires session.sandboxName and session.steps?.sandbox?.status === "complete".
Registry Seeding Guard
src/lib/registry-recovery-action.ts
seedRecoveryMetadata() now skips seeding session-derived sandbox metadata unless isSessionSandboxConfirmed() is true.
Inventory Listing Logic
src/lib/inventory-commands.ts
getSandboxInventory() computes lastOnboardedSandbox only when the session sandbox step status is "complete", otherwise returns null.
Onboard Step Bookkeeping
src/lib/onboard.ts
startRecordedStep(...) calls for provider_selection, inference, and sandbox no longer persist sandboxName; toSessionUpdates for inference drops sandboxName. Resume handling refuses non-interactive resume if no confirmed sandbox name is available; recovered recordedSandboxName is used only when the sandbox step is "complete".
Tests & Regression Coverage
src/lib/inventory-commands.test.ts, src/lib/registry-recovery-action.test.ts, test/cli.test.ts, test/onboard.test.ts
Updated mocks to include steps.sandbox.status: "complete" where appropriate. Added regression tests (#2753) asserting: inventory listing suppresses last-sandbox hint for incomplete steps; registry recovery seed-time guard behavior (multiple scenarios); CLI rejects --resume --non-interactive when sandbox step incomplete and no valid name provided (including whitespace-only); and onboard persistence ordering ensures sandboxName is not written before createSandbox() completes.
sequenceDiagram
  participant CLI as CLI
  participant Onboard as Onboard
  participant SessionStore as SessionStore
  participant Registry as Registry
  CLI->>Onboard: run --resume --non-interactive
  Onboard->>SessionStore: loadLastSession()
  Note over Onboard,SessionStore: examine session.steps.sandbox.status
  alt sandbox step === "complete"
    Onboard->>Registry: accept session.sandboxName (seed/recover)
    Onboard->>CLI: continue resume flow
  else sandbox step !== "complete"
    Onboard->>CLI: refuse non-interactive resume (error)
    Onboard->>Registry: do not seed session sandbox
  end
Loading

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Poem

🐰 I hopped through lines of session lore,
No name before the sandbox door.
Only after creation's chime,
The name will dwell in honest time. 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix(onboard): defer sandboxName to post-create to prevent phantom list' directly addresses the main change: deferring sandboxName persistence to after sandbox creation to prevent phantom entries from appearing in the sandbox list.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/2753-onboard-session-phantom

Review rate limit: 9/10 reviews remaining, refill in 6 minutes.

Comment @coderabbitai help to get the list of available commands and usage tips.

@laitingsheng laitingsheng changed the title fix(onboard): defer sandboxName to post-create to prevent phantom list (#2753) fix(onboard): defer sandboxName to post-create to prevent phantom list May 3, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
test/onboard.test.ts (1)

2969-3015: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Exercise the persisted session file instead of regex-matching onboard.ts.

These checks only lock in source layout. They never simulate an interrupted onboard or inspect onboard-session.json, so the phantom-entry regression can reappear through a helper/session-layer change while this test still passes. Please drive the flow with a temp HOME, stop before createSandbox() completes, and assert the saved session omits sandboxName.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/onboard.test.ts` around lines 2969 - 3015, Replace the brittle regex
assertions in the test with an integration-style check that runs the onboard
flow using a temporary HOME and inspects the persisted onboard-session.json: set
up a tmp HOME dir, stub or intercept createSandbox (the sandboxName = await
createSandbox(...) call in onboard.ts) so the flow reaches the point just before
createSandbox completes (e.g. by returning a deferred Promise or mocking
createSandbox to pause), run the onboarding steps (so startRecordedStep /
onboardSession updates execute), write out the session file, then read the saved
onboard-session.json and assert it does NOT contain sandboxName; ensure you
still verify earlier steps ran (startRecordedStep("provider_selection"),
startRecordedStep("inference"), startRecordedStep("sandbox")) by observing the
persisted session state rather than matching source text and clean up the temp
HOME after the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 8489-8510: Update the resume guard to use the resolved
non-promptable sandbox name instead of raw inputs: compute a single resolvedName
(prefer session?.sandboxName, then requestedSandboxName, then
process.env.NEMOCLAW_SANDBOX_NAME trimmed and only if non-empty, and any
existing non-promptable name state) and then require resolvedName to exist when
blocking resumes for non-interactive runs; replace the current checks that
reference process.env.NEMOCLAW_SANDBOX_NAME and requestedSandboxName directly,
keep the sandboxStepCompleted and isNonInteractive() conditions but test against
resolvedName (trim and ignore whitespace-only env values) so --resume on non-TTY
without explicit --non-interactive is handled correctly.

---

Outside diff comments:
In `@test/onboard.test.ts`:
- Around line 2969-3015: Replace the brittle regex assertions in the test with
an integration-style check that runs the onboard flow using a temporary HOME and
inspects the persisted onboard-session.json: set up a tmp HOME dir, stub or
intercept createSandbox (the sandboxName = await createSandbox(...) call in
onboard.ts) so the flow reaches the point just before createSandbox completes
(e.g. by returning a deferred Promise or mocking createSandbox to pause), run
the onboarding steps (so startRecordedStep / onboardSession updates execute),
write out the session file, then read the saved onboard-session.json and assert
it does NOT contain sandboxName; ensure you still verify earlier steps ran
(startRecordedStep("provider_selection"), startRecordedStep("inference"),
startRecordedStep("sandbox")) by observing the persisted session state rather
than matching source text and clean up the temp HOME after the test.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a681a11b-f543-40a5-b7df-686e480d77a2

📥 Commits

Reviewing files that changed from the base of the PR and between 325d29e and 36b05e2.

📒 Files selected for processing (7)
  • src/lib/inventory-commands.test.ts
  • src/lib/inventory-commands.ts
  • src/lib/onboard.ts
  • src/lib/registry-recovery-action.test.ts
  • src/lib/registry-recovery-action.ts
  • test/cli.test.ts
  • test/onboard.test.ts

Comment thread src/lib/onboard.ts Outdated
… values

CodeRabbit follow-up: replace `isNonInteractive()` with `cannotPrompt`
so non-TTY runs without explicit `--non-interactive` are caught by the
intent expressed locally rather than by an upstream usage-notice
guard, and replace the raw `NEMOCLAW_SANDBOX_NAME` check with the
already-trimmed `requestedSandboxName`. Whitespace-only env values
no longer slip through.

Add a regression test for the whitespace case.

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 8489-8508: The code still lets a stale incomplete
session.sandboxName leak into resume logic; update all resume-path uses to
prefer the validated recoveredSandboxName instead of raw session.sandboxName:
change the getResumeConfigConflicts(...) call to pass recoveredSandboxName (or
compute its equivalent) so conflict checks use the ignored-incomplete value, and
ensure the later assignment that sets sandboxName uses recoveredSandboxName
first (e.g., let sandboxName = recoveredSandboxName || requestedSandboxName ||
null) so an incomplete session.sandboxName can no longer override an explicit
--name / NEMOCLAW_SANDBOX_NAME.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 19f13664-5094-4cf8-ac0d-2d098a5f51d8

📥 Commits

Reviewing files that changed from the base of the PR and between 36b05e2 and bc4c510.

📒 Files selected for processing (2)
  • src/lib/onboard.ts
  • test/cli.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/cli.test.ts

Comment thread src/lib/onboard.ts
…e guard

CodeRabbit follow-up: a stale `session.sandboxName` from a pre-fix
interrupted onboard could still leak into resume flow in two places:
- `getResumeSandboxConflict` would falsely block `--resume --name <new>`
  with "session belongs to <stale>".
- The post-init `let sandboxName = session?.sandboxName || requestedSandboxName`
  would silently override an explicit `--name` / NEMOCLAW_SANDBOX_NAME
  with the stale recorded value.

Both now treat `session.sandboxName` as a recovery source only when the
sandbox step completed. Users hit by the original bug can now reliably
recover with an explicit name.

Update existing fixtures to include `steps.sandbox.status: "complete"`
where the test's intent was a real completed onboard, and add a #2753
regression covering the incomplete-session case.

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

@ericksoa ericksoa left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved. I reviewed the current head 0c1ec81 against the PR base and the fix looks correct for #2753.

The important behavior is covered at all three layers: onboard no longer persists sandboxName before createSandbox() succeeds, registry recovery only seeds session-derived inventory when the sandbox step completed, and list output suppresses incomplete-session names so an interrupted onboard cannot surface a phantom sandbox. The follow-up commits also handle the main adversarial resume cases: non-promptable resume without a recoverable name fails loudly, whitespace-only env values do not bypass that guard, and stale incomplete-session sandboxName no longer blocks or overrides an explicit recovery --name / NEMOCLAW_SANDBOX_NAME.

I also checked the CodeRabbit threads and they are resolved. Local validation passed with npm ci --ignore-scripts, npm run build:cli, npm run typecheck:cli, git diff --check, and the focused regression suite: npx vitest run src/lib/registry-recovery-action.test.ts src/lib/inventory-commands.test.ts test/onboard.test.ts test/cli.test.ts (319 tests). GitHub checks are green, including checks, macos-e2e, wsl-e2e, sandbox image builds, and the self-hosted e2e jobs.

I do not think a full nightly is necessary for this PR; the remaining rough edge I noticed is only that incomplete-session sandboxName can still cause an unnecessary gateway recovery probe, but it does not re-register or display the phantom and is not a merge blocker.

@ericksoa ericksoa merged commit aed5064 into main May 3, 2026
18 checks passed
@ericksoa ericksoa deleted the fix/2753-onboard-session-phantom branch May 3, 2026 13:42
@wscurran wscurran added the bug-fix PR fixes a bug or regression label Jun 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug-fix PR fixes a bug or regression

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Ubuntu 24.10][Onboard] sandbox name written to onboard-session.json before creation — nemoclaw list shows phantom entry after SIGINT

3 participants