You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Follow-up from the review of merged PR #3162 (fix(shields): clean up stale shields state on sandbox destroy). #3162 correctly removes ~/.nemoclaw/state/shields-<sandbox>.json and shields-timer-<sandbox>.json during nemoclaw destroy, fixing the normal stale-state path from #3114.
There are a few remaining cleanup gaps worth addressing so destroy cannot leave or recreate stale shields artifacts.
Problem
1. Destroy can leave a detached auto-restore timer running
If a sandbox is currently in shields down --timeout ... mode when nemoclaw destroy runs, the detached auto-restore timer process is not explicitly killed.
The destroy path now deletes:
~/.nemoclaw/state/shields-<sandbox>.json
~/.nemoclaw/state/shields-timer-<sandbox>.json
…but if the detached timer process is still alive, it may wake later and recreate shields-<sandbox>.json via the timer restore path. If the user re-onboards the same sandbox name inside that window, deriveShieldsMode() can again observe stale state and report the fresh sandbox as locked/unlocked incorrectly.
This is narrow, but it is the same class of stale-state bug #3114 was intended to eliminate.
path.join(process.env.HOME ?? "/tmp", ".nemoclaw", "state") exists in multiple shields-related places. #3162 added another copy in the destroy module.
This should become a shared exported helper/constant so future state-dir changes cannot drift across:
src/lib/shields/index.ts
src/lib/shields/timer.ts
src/lib/actions/sandbox/destroy.ts
3. Policy snapshots accumulate
shields down writes timestamped files like policy-snapshot-<ts>.yaml under the state directory. These do not currently affect deriveShieldsMode() because they are not per-sandbox state files, but they can accumulate indefinitely.
Destroy could either remove snapshots that are known to belong to the sandbox, or we should add enough metadata/naming to make safe cleanup possible later.
4. Cleanup is silent in audit trail
Destroy-side shields cleanup currently removes files without writing to shields-audit.jsonl. Adding an audit entry such as shields_state_cleared_on_destroy would make the cleanup observable for troubleshooting and forensics.
The warning path for real removal failures (for example EPERM) is not currently covered. A small mocked fs.rmSync test should assert that the warning is emitted and the function continues.
Acceptance Criteria
nemoclaw destroy <sandbox> kills or otherwise neutralizes any live shields auto-restore timer for that sandbox before deleting the timer marker.
A detached timer cannot recreate shields-<sandbox>.json after the sandbox has been destroyed.
Shields state directory resolution is centralized in one shared helper/constant used by destroy, shields commands, and the timer.
Decide and document policy-snapshot cleanup behavior; implement if safe in this issue, otherwise file/link a follow-up.
Destroy-side shields cleanup writes an audit entry when it removes/attempts to remove shields state.
Tests cover the active-timer destroy case and the warning/error branch for failed file removal.
Suggested implementation notes
Prefer reusing/exporting the existing killTimer() behavior from src/lib/shields/index.ts rather than duplicating marker parsing in destroy.
Summary
Follow-up from the review of merged PR #3162 (
fix(shields): clean up stale shields state on sandbox destroy). #3162 correctly removes~/.nemoclaw/state/shields-<sandbox>.jsonandshields-timer-<sandbox>.jsonduringnemoclaw destroy, fixing the normal stale-state path from #3114.There are a few remaining cleanup gaps worth addressing so destroy cannot leave or recreate stale shields artifacts.
Problem
1. Destroy can leave a detached auto-restore timer running
If a sandbox is currently in
shields down --timeout ...mode whennemoclaw destroyruns, the detached auto-restore timer process is not explicitly killed.The destroy path now deletes:
~/.nemoclaw/state/shields-<sandbox>.json~/.nemoclaw/state/shields-timer-<sandbox>.json…but if the detached timer process is still alive, it may wake later and recreate
shields-<sandbox>.jsonvia the timer restore path. If the user re-onboards the same sandbox name inside that window,deriveShieldsMode()can again observe stale state and report the fresh sandbox as locked/unlocked incorrectly.This is narrow, but it is the same class of stale-state bug #3114 was intended to eliminate.
Related code:
src/lib/actions/sandbox/destroy.ts—removeShieldsState()added by fix(shields): clean up stale shields state on sandbox destroy #3162src/lib/shields/index.ts— existingreadTimerMarker()/killTimer()helper logicsrc/lib/shields/timer.ts— detached timer can update/recreate shields statefix(shields): self-heal expired auto-restore timers)2. Shields state directory path is duplicated
path.join(process.env.HOME ?? "/tmp", ".nemoclaw", "state")exists in multiple shields-related places. #3162 added another copy in the destroy module.This should become a shared exported helper/constant so future state-dir changes cannot drift across:
src/lib/shields/index.tssrc/lib/shields/timer.tssrc/lib/actions/sandbox/destroy.ts3. Policy snapshots accumulate
shields downwrites timestamped files likepolicy-snapshot-<ts>.yamlunder the state directory. These do not currently affectderiveShieldsMode()because they are not per-sandbox state files, but they can accumulate indefinitely.Destroy could either remove snapshots that are known to belong to the sandbox, or we should add enough metadata/naming to make safe cleanup possible later.
4. Cleanup is silent in audit trail
Destroy-side shields cleanup currently removes files without writing to
shields-audit.jsonl. Adding an audit entry such asshields_state_cleared_on_destroywould make the cleanup observable for troubleshooting and forensics.5. Error branch test coverage
#3162 added good behavioral tests for:
The warning path for real removal failures (for example
EPERM) is not currently covered. A small mockedfs.rmSynctest should assert that the warning is emitted and the function continues.Acceptance Criteria
nemoclaw destroy <sandbox>kills or otherwise neutralizes any live shields auto-restore timer for that sandbox before deleting the timer marker.shields-<sandbox>.jsonafter the sandbox has been destroyed.Suggested implementation notes
killTimer()behavior fromsrc/lib/shields/index.tsrather than duplicating marker parsing in destroy.