fix(shields): clean up stale shields state on sandbox destroy#3162
Conversation
📝 WalkthroughWalkthroughThis PR adds cleanup of host-side shields state JSON files ( ChangesShields State Cleanup
Sequence Diagram(s)sequenceDiagram
participant Caller
participant destroySandbox
participant cleanupSandboxServices
participant removeShieldsState
participant FS
Caller->>destroySandbox: initiate destroySandbox(name)
destroySandbox->>cleanupSandboxServices: cleanupSandboxServices(name)
destroySandbox->>removeShieldsState: removeShieldsState(name)
removeShieldsState->>FS: validate paths & rm shields-<name>.json files
removeShieldsState-->>destroySandbox: complete
destroySandbox-->>Caller: continue remaining cleanup
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
test/image-cleanup.test.ts (1)
147-156: ⚡ Quick winAlso assert non-target
shields-timer-*file is preserved.The isolation test currently verifies only
shields-<other>.json. Addingshields-timer-<other>.jsonin the same case would fully cover both prefixes.Suggested test extension
it("does not remove state files for other sandboxes", () => { const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), "nemoclaw-shields-cleanup-")); try { const otherFile = path.join(tmpDir, "shields-bravo.json"); + const otherTimerFile = path.join(tmpDir, "shields-timer-bravo.json"); fs.writeFileSync(otherFile, JSON.stringify({ shieldsDown: false })); + fs.writeFileSync(otherTimerFile, JSON.stringify({ pid: 67890 })); removeShieldsState("alpha", tmpDir); expect(fs.existsSync(otherFile)).toBe(true); + expect(fs.existsSync(otherTimerFile)).toBe(true); } finally { fs.rmSync(tmpDir, { recursive: true, force: true }); } });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/image-cleanup.test.ts` around lines 147 - 156, Extend the "does not remove state files for other sandboxes" test to also create a non-target timer file (e.g., create shields-timer-bravo.json alongside shields-bravo.json), write some JSON into it, call removeShieldsState("alpha", tmpDir) as before, and add an assertion that the shields-timer-bravo.json file still exists (expect(fs.existsSync(timerFile)).toBe(true)); reference the existing test description and the removeShieldsState function when adding the additional file creation and expectation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/lib/sandbox-destroy-action.ts`:
- Around line 147-151: The current blanket try/catch around fs.rmSync(filePath,
{ force: true }) in sandbox-destroy-action.ts hides real failures (permissions,
IO) while intending only to ignore missing files; update the handling so only
"file not found" is ignored and all other errors bubble up (or are logged and
rethrown). Concretely, keep the call to fs.rmSync(filePath, { force: true }) but
replace the empty catch with logic that checks the thrown error's code (e.g.,
error.code === 'ENOENT') and only swallow that case; for any other error
preserve/propagate it so shields cleanup failures are not silently swallowed.
---
Nitpick comments:
In `@test/image-cleanup.test.ts`:
- Around line 147-156: Extend the "does not remove state files for other
sandboxes" test to also create a non-target timer file (e.g., create
shields-timer-bravo.json alongside shields-bravo.json), write some JSON into it,
call removeShieldsState("alpha", tmpDir) as before, and add an assertion that
the shields-timer-bravo.json file still exists
(expect(fs.existsSync(timerFile)).toBe(true)); reference the existing test
description and the removeShieldsState function when adding the additional file
creation and expectation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: aa30c9b2-6159-4226-bc99-93b7f9e1cd57
📒 Files selected for processing (2)
src/lib/sandbox-destroy-action.tstest/image-cleanup.test.ts
872cc39 to
8f755fa
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test/image-cleanup.test.ts (1)
147-156: ⚡ Quick winAlso assert
shields-timerisolation for other sandboxes.The “other sandbox” case currently checks only
shields-bravo.json; addingshields-timer-bravo.jsonwould fully cover both prefixes handled byremoveShieldsState.Suggested patch
it("does not remove state files for other sandboxes", () => { const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), "nemoclaw-shields-cleanup-")); try { const otherFile = path.join(tmpDir, "shields-bravo.json"); + const otherTimerFile = path.join(tmpDir, "shields-timer-bravo.json"); fs.writeFileSync(otherFile, JSON.stringify({ shieldsDown: false })); + fs.writeFileSync(otherTimerFile, JSON.stringify({ pid: 67890 })); removeShieldsState("alpha", tmpDir); expect(fs.existsSync(otherFile)).toBe(true); + expect(fs.existsSync(otherTimerFile)).toBe(true); } finally { fs.rmSync(tmpDir, { recursive: true, force: true }); } });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/image-cleanup.test.ts` around lines 147 - 156, The test that verifies removal isolation only asserts the presence of "shields-bravo.json"; update the "does not remove state files for other sandboxes" test to also create and assert the presence of the corresponding timer file (e.g., "shields-timer-bravo.json") so both prefixes handled by removeShieldsState are covered: create shields-timer-bravo.json in the tmpDir before calling removeShieldsState("alpha", tmpDir) and add an expectation that fs.existsSync for that timer file remains true after the call. Ensure you reference the same tmpDir and use the same cleanup/finally pattern as the existing shields-bravo.json assertion.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@test/image-cleanup.test.ts`:
- Around line 147-156: The test that verifies removal isolation only asserts the
presence of "shields-bravo.json"; update the "does not remove state files for
other sandboxes" test to also create and assert the presence of the
corresponding timer file (e.g., "shields-timer-bravo.json") so both prefixes
handled by removeShieldsState are covered: create shields-timer-bravo.json in
the tmpDir before calling removeShieldsState("alpha", tmpDir) and add an
expectation that fs.existsSync for that timer file remains true after the call.
Ensure you reference the same tmpDir and use the same cleanup/finally pattern as
the existing shields-bravo.json assertion.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 13d7935e-4854-43ea-a612-fcc6fa832dbe
📒 Files selected for processing (2)
src/lib/sandbox-destroy-action.tstest/image-cleanup.test.ts
…#3114) sandbox-destroy-action.ts did not remove ~/.nemoclaw/state/shields-<name>.json on destroy, so a previous shields-up state persisted across re-onboard and deriveShieldsMode incorrectly reported 'locked' on fresh sandboxes.
8f755fa to
27213fb
Compare
…ale-state # Conflicts: # test/image-cleanup.test.ts
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test/image-cleanup.test.ts (1)
119-160: 💤 Low valueTests cover the three critical cases cleanly.
Happy path, ENOENT no-op, and per-sandbox isolation are all verified with proper
finallycleanup. The existing three tests are sufficient for the core contract.One optional gap: the
console.warnbranch inremoveShieldsState(triggered whenrmSyncthrows a non-ENOENT error such as EPERM) is not exercised. If you want to lock that path in, a fourth test could spy onconsole.warnand mockfs.rmSyncto throw{ code: "EPERM" }.🧪 Optional fourth test to cover the console.warn branch
+ it("warns via console.warn when rmSync throws a non-ENOENT error", () => { + const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), "nemoclaw-shields-cleanup-")); + try { + const warnSpy = vi.spyOn(console, "warn").mockImplementation(() => {}); + const epermError = Object.assign(new Error("operation not permitted"), { code: "EPERM" }); + vi.spyOn(fs, "rmSync").mockImplementation(() => { throw epermError; }); + try { + removeShieldsState("alpha", tmpDir); + expect(warnSpy).toHaveBeenCalledTimes(2); // once per prefix + expect(warnSpy.mock.calls[0][0]).toMatch(/Failed to remove shields state/); + } finally { + vi.restoreAllMocks(); + } + } finally { + fs.rmSync(tmpDir, { recursive: true, force: true }); + } + });You'll also need to add
vito thevitestimport at line 7:-import { describe, it, expect } from "vitest"; +import { describe, it, expect, vi } from "vitest";🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/image-cleanup.test.ts` around lines 119 - 160, Add a fourth test in test/image-cleanup.test.ts that exercises the non-ENOENT error branch of removeShieldsState by spying on console.warn and mocking fs.rmSync to throw an error object with code "EPERM"; import vi from vitest (add vi to the existing vitest import) and use vi.spyOn(console, "warn") to assert it was called after calling removeShieldsState("alpha", tmpDir) while ensuring you restore the spy and fs.rmSync mock in finally so the test cleans up.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@test/image-cleanup.test.ts`:
- Around line 119-160: Add a fourth test in test/image-cleanup.test.ts that
exercises the non-ENOENT error branch of removeShieldsState by spying on
console.warn and mocking fs.rmSync to throw an error object with code "EPERM";
import vi from vitest (add vi to the existing vitest import) and use
vi.spyOn(console, "warn") to assert it was called after calling
removeShieldsState("alpha", tmpDir) while ensuring you restore the spy and
fs.rmSync mock in finally so the test cleans up.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: ee9133a0-5a40-43eb-9e14-d192eab5249c
📒 Files selected for processing (2)
src/lib/sandbox-destroy-action.tstest/image-cleanup.test.ts
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/lib/actions/sandbox/destroy.ts`:
- Around line 126-133: removeShieldsState currently constructs file paths from
the untrusted sandboxName and may delete files outside stateDir; fix it by
resolving the candidate path and verifying it lives under the intended stateDir
before calling fs.rmSync. Specifically, in removeShieldsState use
path.resolve(stateDir, `${prefix}${sandboxName}.json`) (for each prefix in
["shields-", "shields-timer-"]) and ensure the resolved path is contained in the
resolved stateDir (e.g., via path.relative or string prefix check) and only then
call fs.rmSync; if the check fails, skip deletion and optionally log the invalid
sandboxName.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 120a6e03-b1c6-4bdc-b649-5f9b7238d10f
📒 Files selected for processing (1)
src/lib/actions/sandbox/destroy.ts
|
@coderabbitai resume |
✅ Actions performedReviews resumed. |
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/lib/actions/sandbox/destroy.ts (1)
126-141:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPath traversal in
removeShieldsStateis still unresolved.
path.join(stateDir,${prefix}${sandboxName}.json)normalises..segments: a craftedsandboxNamesuch as../../../../etc/cron.d/jobresolves to paths well outside the intended state directory. SinceremoveShieldsStateis exported, any caller can trigger deletion of arbitrary files on the host filesystem.The proposed guard from the previous review has not been applied.
🔒 Proposed fix
export function removeShieldsState( sandboxName: string, stateDir = path.join(process.env.HOME ?? "/tmp", ".nemoclaw", "state"), ): void { + const resolvedStateDir = path.resolve(stateDir); for (const prefix of ["shields-", "shields-timer-"]) { - const filePath = path.join(stateDir, `${prefix}${sandboxName}.json`); + const filePath = path.resolve(resolvedStateDir, `${prefix}${sandboxName}.json`); + if (!filePath.startsWith(`${resolvedStateDir}${path.sep}`)) { + throw new Error(`Invalid sandbox name for shields state cleanup: '${sandboxName}'`); + } try { fs.rmSync(filePath, { force: true }); } catch (error) {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/lib/actions/sandbox/destroy.ts` around lines 126 - 141, The removeShieldsState function currently constructs filePath with path.join(stateDir, `${prefix}${sandboxName}.json`) which allows path traversal; fix by validating/sanitising sandboxName and verifying the resolved filePath is inside stateDir before calling fs.rmSync: (1) reject or sanitize sandboxName to disallow path separators or `..` (e.g. allow only a safe pattern like alphanumerics, dash, underscore and dot) or use path.basename(sandboxName) to strip directories; (2) compute const resolved = path.resolve(stateDir, `${prefix}${sandboxName}.json`) and ensure path.relative(path.resolve(stateDir), resolved) does not start with '..' (or that resolved.startsWith(path.resolve(stateDir) + path.sep)); if the check fails, log/warn and skip deletion; apply these checks around the existing filePath/resolved variables in removeShieldsState and only call fs.rmSync when safe.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@src/lib/actions/sandbox/destroy.ts`:
- Around line 126-141: The removeShieldsState function currently constructs
filePath with path.join(stateDir, `${prefix}${sandboxName}.json`) which allows
path traversal; fix by validating/sanitising sandboxName and verifying the
resolved filePath is inside stateDir before calling fs.rmSync: (1) reject or
sanitize sandboxName to disallow path separators or `..` (e.g. allow only a safe
pattern like alphanumerics, dash, underscore and dot) or use
path.basename(sandboxName) to strip directories; (2) compute const resolved =
path.resolve(stateDir, `${prefix}${sandboxName}.json`) and ensure
path.relative(path.resolve(stateDir), resolved) does not start with '..' (or
that resolved.startsWith(path.resolve(stateDir) + path.sep)); if the check
fails, log/warn and skip deletion; apply these checks around the existing
filePath/resolved variables in removeShieldsState and only call fs.rmSync when
safe.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 5643ad49-9353-4ed9-9a86-c63bd2dfc3c1
📒 Files selected for processing (1)
src/lib/actions/sandbox/destroy.ts
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/image-cleanup.test.ts`:
- Around line 164-178: The test leaks escapedFile if the assertion fails because
fs.rmSync(escapedFile, { force: true }) is inside the try block; move the
cleanup of escapedFile into the finally block so both tmpDir and escapedFile are
removed regardless of test outcome. Locate the test case in
test/image-cleanup.test.ts (the it("rejects path traversal in sandbox name")
block) and ensure escapedFile (constructed with path.join(tmpDir, "..",
"shields-traversal.json")) is removed in the finally along with tmpDir after
calling removeShieldsState; keep the expect assertion in the try but perform all
filesystem teardown in finally.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 3dedc78b-e8a4-4374-bb9b-e8a917b9e17c
📒 Files selected for processing (2)
src/lib/actions/sandbox/destroy.tstest/image-cleanup.test.ts
34a7296 to
7e2c620
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test/image-cleanup.test.ts (1)
166-176: ⚡ Quick winUse a unique escaped filename to reduce cross-test flake risk.
path.join(tmpDir, "..", "shields-traversal.json")targets a shared path under the temp root; parallel workers can interfere with each other.♻️ Proposed fix
- const escapedFile = path.join(tmpDir, "..", "shields-traversal.json"); + const escapedFile = path.join( + tmpDir, + "..", + `shields-traversal-${process.pid}-${Date.now()}.json`, + );🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/image-cleanup.test.ts` around lines 166 - 176, The test creates an escapedFile at path.join(tmpDir, "..", "shields-traversal.json") which can collide between parallel tests; change the filename to include a unique suffix (e.g., process.pid, Date.now(), or a random hex) so the variable escapedFile in the test uses a unique path. Update the test that calls removeShieldsState("../../shields-traversal", tmpDir) to write/read that unique escapedFile instead, and ensure cleanup still removes that unique file in the finally block.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@test/image-cleanup.test.ts`:
- Around line 166-176: The test creates an escapedFile at path.join(tmpDir,
"..", "shields-traversal.json") which can collide between parallel tests; change
the filename to include a unique suffix (e.g., process.pid, Date.now(), or a
random hex) so the variable escapedFile in the test uses a unique path. Update
the test that calls removeShieldsState("../../shields-traversal", tmpDir) to
write/read that unique escapedFile instead, and ensure cleanup still removes
that unique file in the finally block.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 26588866-1fa8-4099-bf29-373a8bb6e6f7
📒 Files selected for processing (2)
src/lib/actions/sandbox/destroy.tstest/image-cleanup.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- src/lib/actions/sandbox/destroy.ts
Summary
Sandbox destroy did not remove host-side shields state files (
~/.nemoclaw/state/shields-<name>.jsonandshields-timer-<name>.json), so a previousshields upstate persisted across destroy → re-onboard andderiveShieldsModeincorrectly reported "locked" on fresh sandboxes.Related Issue
Fixes #3114
Changes
src/lib/sandbox-destroy-action.ts— AddedremoveShieldsState()that deletes bothshields-<name>.jsonandshields-timer-<name>.jsonfrom the state directory. Called indestroySandbox()aftercleanupSandboxServices()and beforeremoveSandboxRegistryEntry().test/image-cleanup.test.ts— Added 3 tests: removes both state files, no-op when files don't exist, does not remove other sandboxes' state files.Type of Change
Verification
npx prek run --all-filespassesnpm testpassesmake docsbuilds without warnings (doc changes only)Signed-off-by: Yimo Jiang yimoj@nvidia.com
Summary by CodeRabbit
Improvements
Tests