fix(shields): self-heal expired auto-restore timers by ChunkyMonkey11 · Pull Request #3204 · NVIDIA/NemoClaw

ChunkyMonkey11 · 2026-05-07T20:34:37Z

Summary

Add fail-secure stale-timer self-healing in shields status for temporary unlocks.
Detect expired shields-timer-<sandbox>.json markers where the timer PID is no longer alive.
Attempt inline restore (policy snapshot restore + config re-lock), update state/audit, and re-report final status.
Clear stale timer marker files reliably on successful shields up.

Bug Behavior

nemoclaw <name> shields down --timeout Nm depends on a detached timer process. If that process dies before the deadline, shields can remain DOWN indefinitely while shields status shows Auto-lockdown in: 0m 0s.

Fix

In src/lib/shields/index.ts, shieldsStatus() now checks stale/expired timer markers during temporarily_unlocked.
If restoreAt is in the past and the marker PID is dead, status triggers inline lockdown restore and then re-renders status.
On restore success, write a shields_auto_restore audit entry and remove the stale marker.
On restore failure, write shields_up_failed, keep/report DOWN, and print a clear manual recovery warning.
shieldsUp() now clears timer marker files on successful completion (including already-locked fast path).
Add a shared helper to avoid duplicating restore+lock flow.

Validation

npm run build:cli
npx vitest run src/lib/shields/index.test.ts src/lib/shields/audit.test.ts
npx prettier --check src/lib/shields/index.ts src/lib/shields/index.test.ts
git diff --check
npm test (fails due to pre-existing unrelated failures in this environment: test/generate-openclaw-config.test.ts (2 cases), test/nemoclaw-start.test.ts (1 case), test/legacy-path-guard.test.ts environment/git-mv failure)

Scope Note

This PR intentionally does not implement a persistent host watchdog/supervisor (systemd/launchd/etc). That follow-up is proposed separately.

Addresses #3112

Summary by CodeRabbit

New Features
- Inline and snapshot-driven auto-restore for shields with timer-backed recovery and safer rollback on failures.
Bug Fixes
- Tighter verification for lockdown/unlock, more reliable timer/marker liveness and identity checks, and improved cleanup of shields artifacts during sandbox destroy.
Refactor
- Centralized resolution of app state/home paths and simplified lock/unlock target handling.
Tests
- Expanded tests for shields, timers, cleanup, and recovery flows, including timer marker edge cases.

Signed-off-by: Revant revant.h.patel@gmail.com

copy-pr-bot · 2026-05-07T20:34:40Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2026-05-07T20:34:45Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

This PR centralizes Nemoclaw state path resolution, hardens shields state persistence and timer-marker validation, adds snapshot activation and inline auto-restore, tightens lock/unlock stat verification, adds destroy-time shields cleanup, and expands/tightens timer and cleanup tests.

Changes

Shields Orchestration Refactor

Layer / File(s)	Summary
Imports & Paths `src/lib/state/paths.ts`, `src/lib/shields/`, `src/lib/shields/timer.ts`, `src/lib/shields/audit.ts`, `src/lib/actions/sandbox/destroy.ts`, `test/`	Add `resolveNemoclawHomeDir`/`resolveNemoclawStateDir` helpers and replace HOME-based path construction across shields, timer, audit, destroy, and tests.
Destroy Cleanup `src/lib/actions/sandbox/destroy.ts`, `test/image-cleanup.test.ts`	Add `cleanupShieldsDestroyArtifacts()` that neutralizes timers (injectable) then removes shields state; `removeShieldsState()` defaults to `resolveNemoclawStateDir()` and accepts deps; `destroySandbox` now calls the cleanup helper; tests verify targeted timer kill and best-effort warnings.
Audit API `src/lib/shields/audit.ts`	Switch `AUDIT_DIR` to `resolveNemoclawStateDir()` and expand `ShieldsAuditEntry` with new action and optional fields.
Timer Control Module `src/lib/shields/timer-control.ts`	New module: TimerMarker contract, safe read/clear, liveness check, command-line identity verification (including optional `processToken`), and `killTimer()` returning structured result and warnings.
Timer Runtime & Restore `src/lib/shields/timer.ts`, `src/lib/shields/timer.test.ts`	Timer CLI accepts optional `processToken`; `STATE_DIR` uses resolver; add `readTimerMarker` / `markerMatchesCurrentTimer` gating; `runRestoreTimer` uses marker match and delegates audit writes; exports added; tests validate marker matching behaviors and restore gating.
State Derivation & Persistence `src/lib/shields/index.ts`	Introduce `deriveShieldsMode()`; `loadShieldsState()` exposes `_hasStateFile` and corruption diagnostics; `saveShieldsState()` strips runtime-only markers before persisting; shieldsDown/shieldsUp/shieldsStatus flows updated and exports expanded (including `killTimer`).
Lock/Unlock Verification `src/lib/shields/index.ts`	Refactor `unlockAgentConfig()` and `lockAgentConfig()` to accept structured `target` objects and perform mandatory per-file and config-dir `stat` verification (mode/owner); immutable-bit checks conditional on chattr success; improved legacy-state diagnostics capture kubectl stdout/stderr.
Snapshot Activation & Inline Recovery `src/lib/shields/index.ts`	Add `activateLockdownFromSnapshot()` and `recoverExpiredAutoRestoreInline()` to support snapshot-based restore; `shieldsDown()` persists a `processToken` and starts a detached timer (rollback on timer start failure); `shieldsUp()` uses snapshot activation with controlled failure handling; `shieldsStatus()` supports optional inline recovery and fails closed on corrupt state.
Tests / Assertions `src/lib/shields/index.test.ts`, `src/lib/shields/timer.test.ts`, `test/image-cleanup.test.ts`	Tighten and reformat tests: mock command-array formats; dynamic `dist` import paths; add unit tests for `readTimerMarker` and `killTimer` (invalid pid, identity mismatch, stale marker, verified kill); add timer restore tests and image-cleanup tests validating targeted cleanup and warnings.

Sequence Diagram(s)

sequenceDiagram
  participant CLI
  participant shieldsDown
  participant TimerProcess
  participant SnapshotRestore
  participant lockAgentConfig
  CLI->>shieldsDown: request temporary unlock
  shieldsDown->>TimerProcess: start detached timer with marker (processToken)
  TimerProcess-->>shieldsDown: startup failure -> rollback (restore snapshot)
  TimerProcess->>SnapshotRestore: on expiry -> attempt restore
  SnapshotRestore->>lockAgentConfig: lock config after restore
  lockAgentConfig-->>CLI: result

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 Snapshots saved, states now neat,
Timers quiet under rabbit feet,
Stat checks hum and markers cleared,
Destroy cleans up what once appeared,
A little hop — the system’s sweet.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 13.73% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title 'fix(shields): self-heal expired auto-restore timers' accurately and concisely describes the main change: adding fail-secure healing for stale shields timer processes that may have expired, which is the core objective of this PR.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

Generate code and open pull requests
Plan features and break down work
Investigate incidents and troubleshoot customer tickets together
Automate recurring tasks and respond to alerts with triggers
Summarize progress and report instantly

Built for teams:

Shared memory across your entire org—no repeating context
Per-thread sandboxes to safely plan and execute work
Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

ChunkyMonkey11 · 2026-05-07T20:34:45Z

Optional follow-up: add a lightweight host-side watchdog that scans ~/.nemoclaw/state/shields-timer-*.json and triggers the same restore path when restoreAt is expired and PID is dead. This would enforce fail-secure recovery even if shields status is never run.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

src/lib/shields/index.ts (2)
122-133: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't collapse "corrupt state file" into "no state file".

If this file exists but fails to parse or validate, callers get _hasStateFile: false. shieldsStatus() then reports NOT CONFIGURED and skips recoverExpiredAutoRestoreInline(), which turns a recoverable stale-unlock case into a silent fail-open. Preserve the fact that the state file exists and make callers surface corruption instead of treating it as fresh state.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/shields/index.ts` around lines 122 - 133, loadShieldsState currently
treats a present-but-unparseable state file the same as missing by returning
_hasStateFile: false; change it so that when stateFilePath(sandboxName) exists
but JSON.parse or isShieldsState validation throws/fails, loadShieldsState still
returns _hasStateFile: true and also returns a corruption indicator (for example
_isCorrupt: true or _corruptError: string) so callers can surface the
corruption; update callers like shieldsStatus and
recoverExpiredAutoRestoreInline to check this flag and avoid treating a corrupt
file as "not configured" (i.e., surface or fail fast instead of skipping
recovery).
218-225: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Validate pid before calling process.kill() in killTimer().

isTimerMarker() accepts any numeric pid without range validation, and killTimer() calls process.kill(marker.pid, "SIGTERM") directly. A malformed marker with pid <= 0 will signal a process group rather than the detached timer process. Although isProcessAlive() correctly validates pid <= 0, killTimer() does not use it. Add pid validation to isTimerMarker() to require positive integers, or call isProcessAlive() before signaling in killTimer() to guard against corrupted marker files.

Also applies to: 263-272
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/shields/index.ts` around lines 218 - 225, isTimerMarker currently
accepts any number for marker.pid allowing non-positive values; update
validation so pid must be a positive integer (e.g., require
Number.isInteger(value.pid) && value.pid > 0) or alternatively add a guard in
killTimer to call isProcessAlive(marker.pid) before invoking
process.kill(marker.pid, "SIGTERM"); modify either isTimerMarker (the type guard
for TimerMarker) or killTimer (the function that uses marker.pid and calls
process.kill) to ensure marker.pid is validated as a positive integer to prevent
signaling process groups from corrupted marker files.

🧹 Nitpick comments (1)

src/lib/shields/index.test.ts (1)
310-345: 🏗️ Heavy lift

Prefer an execution-level recovery test for NC-3112.

These assertions only prove that the current source text exists; they will still pass if the inline-recovery path stops updating state, clearing the timer marker, or writing the audit entry correctly. For a fail-secure path like this, add at least one test that drives shieldsStatus() against an expired marker and asserts the side effects.

As per coding guidelines, "src/lib/shields/**: These files control shields down/up, config mutability, audit trail, and auto-restore timer. E2E test recommendation: - shields-config-e2e — shields lifecycle + config get/set/rotate`"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/shields/index.test.ts` around lines 310 - 345, Add an execution-level
test that invokes shieldsStatus(...) against an expired auto-restore marker and
asserts the actual side-effects rather than just source text: create a fake
marker with restoreAtMs < Date.now() and a non-live pid, call
shieldsStatus(sandboxName, false), then assert that
recoverExpiredAutoRestoreInline (or the inline restore flow) was executed
resulting in the marker being cleared, the shields state updated (e.g., shields
reported UP/DOWN as expected), and an audit entry written; use mocks/spies for
isProcessAlive, activateLockdownFromSnapshot and the audit logger to verify they
were called and verify the timer/marker storage no longer contains the expired
marker.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/lib/shields/index.ts`:
- Around line 122-133: loadShieldsState currently treats a
present-but-unparseable state file the same as missing by returning
_hasStateFile: false; change it so that when stateFilePath(sandboxName) exists
but JSON.parse or isShieldsState validation throws/fails, loadShieldsState still
returns _hasStateFile: true and also returns a corruption indicator (for example
_isCorrupt: true or _corruptError: string) so callers can surface the
corruption; update callers like shieldsStatus and
recoverExpiredAutoRestoreInline to check this flag and avoid treating a corrupt
file as "not configured" (i.e., surface or fail fast instead of skipping
recovery).
- Around line 218-225: isTimerMarker currently accepts any number for marker.pid
allowing non-positive values; update validation so pid must be a positive
integer (e.g., require Number.isInteger(value.pid) && value.pid > 0) or
alternatively add a guard in killTimer to call isProcessAlive(marker.pid) before
invoking process.kill(marker.pid, "SIGTERM"); modify either isTimerMarker (the
type guard for TimerMarker) or killTimer (the function that uses marker.pid and
calls process.kill) to ensure marker.pid is validated as a positive integer to
prevent signaling process groups from corrupted marker files.

---

Nitpick comments:
In `@src/lib/shields/index.test.ts`:
- Around line 310-345: Add an execution-level test that invokes
shieldsStatus(...) against an expired auto-restore marker and asserts the actual
side-effects rather than just source text: create a fake marker with restoreAtMs
< Date.now() and a non-live pid, call shieldsStatus(sandboxName, false), then
assert that recoverExpiredAutoRestoreInline (or the inline restore flow) was
executed resulting in the marker being cleared, the shields state updated (e.g.,
shields reported UP/DOWN as expected), and an audit entry written; use
mocks/spies for isProcessAlive, activateLockdownFromSnapshot and the audit
logger to verify they were called and verify the timer/marker storage no longer
contains the expired marker.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 13fc9ad0-b7dc-4977-b7a6-e14ca0f016c1

📥 Commits

Reviewing files that changed from the base of the PR and between a4a9eb0 and 20db546.

📒 Files selected for processing (2)

src/lib/shields/index.test.ts
src/lib/shields/index.ts

wscurran · 2026-05-07T23:04:21Z

✨ Thanks for submitting this PR that fixes the issue with shields remaining DOWN indefinitely due to a detached timer process dying before deadline. This change involves adding fail-secure stale-timer self-healing in shields status for temporary unlocks and detecting expired shields-timer-<sandbox>.json markers where the timer PID is no longer alive.

Related open issues:

#3112 [Ubuntu 24.04][Security] shields-down timer process death leaves shields permanently DOWN (fail-open)

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/shields/index.ts`:
- Around line 151-171: saveShieldsState currently only strips the runtime marker
_hasStateFile before persisting, but loadShieldsState can also return _isCorrupt
and _corruptError which must not be written to disk; update the destructuring in
saveShieldsState (and any related merging logic) to remove _hasStateFile,
_isCorrupt, and _corruptError from current (the result of loadShieldsState) so
the resulting updated ShieldsState contains only persisted fields and then
continue writing that cleaned object to disk.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 69979908-ce69-4308-8e95-0a8503eea82f

📥 Commits

Reviewing files that changed from the base of the PR and between 20db546 and 895ca12.

📒 Files selected for processing (7)

src/lib/actions/sandbox/destroy.ts
src/lib/shields/audit.ts
src/lib/shields/index.test.ts
src/lib/shields/index.ts
src/lib/shields/timer.ts
src/lib/state/paths.ts
test/image-cleanup.test.ts

✅ Files skipped from review due to trivial changes (1)

src/lib/shields/timer.ts

ChunkyMonkey11 · 2026-05-08T03:07:57Z

Update: this PR now also includes the #3219 follow-up for destroy-time shields timer neutralization.

What was added:

nemoclaw <sandbox> destroy now proactively neutralizes detached shields timers before and with artifact cleanup.
Destroy reuses shields timer semantics (killTimer) to safely read marker state, check PID liveness, send best-effort SIGTERM, and surface warnings on real failures.
Destroy still performs best-effort cleanup of per-sandbox shields files:
- ~/.nemoclaw/state/shields-<sandbox>.json
- ~/.nemoclaw/state/shields-timer-<sandbox>.json
~/.nemoclaw/state resolution is centralized with a shared helper and used across destroy/shields/timer/audit.
Tests were added or updated for active-timer destroy path, warning path, and no cross-sandbox deletion regressions.

Why this does not conflict with the original #3204 stale-marker work:

fix(shields): self-heal expired auto-restore timers #3204 is reactive self-healing in shields status after stale markers exist.
This follow-up is proactive neutralization during destroy, preventing stale detached timers from reviving state later.
Together they cover both prevention and recovery paths.

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/lib/shields/index.ts (1)
1273-1276: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail closed when the shields state file is corrupt.

loadShieldsState() can now mark unreadable/invalid JSON as _isCorrupt, and shieldsStatus() already treats that as an error. isShieldsDown() ignores the same marker and returns true because deriveShieldsMode() falls back to "mutable_default" when shieldsDown is missing. That means any caller gating writes on isShieldsDown() will treat a corrupt state file as mutable, which breaks the fail-secure behavior this PR is adding.
💡 Minimal safe fix
 function isShieldsDown(sandboxName: string): boolean {
   const state = loadShieldsState(sandboxName);
+  if (state._isCorrupt) {
+    return false;
+  }
   const mode = deriveShieldsMode(state, state._hasStateFile);
   return mode !== "locked";
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/shields/index.ts` around lines 1273 - 1276, isShieldsDown currently
ignores loadShieldsState's _isCorrupt flag and can return true (allow writes)
when the state file is corrupt; update isShieldsDown to check state._isCorrupt
immediately after calling loadShieldsState (before calling deriveShieldsMode)
and, if _isCorrupt is truthy, return false to fail closed (consistent with
shieldsStatus); keep the existing deriveShieldsMode flow for non-corrupt states
so only corrupt state handling changes.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/shields/index.ts`:
- Around line 307-340: killTimer currently trusts only PID from readTimerMarker
and may kill a reused PID; update the TimerMarker shape to include a startTime
(or unique processToken) when the timer is created, then in killTimer (using
readTimerMarker, isProcessAlive, and clearTimerMarker) verify the stored
startTime/token matches the live process (e.g., compare process start time or
token read from /proc/<pid>/stat or from the timer process on creation) before
calling process.kill(marker.pid); if the verification fails treat the marker as
expired, clear it with clearTimerMarker and return without signaling, and
include any verification failure as a warning in the returned warnings array.

---

Outside diff comments:
In `@src/lib/shields/index.ts`:
- Around line 1273-1276: isShieldsDown currently ignores loadShieldsState's
_isCorrupt flag and can return true (allow writes) when the state file is
corrupt; update isShieldsDown to check state._isCorrupt immediately after
calling loadShieldsState (before calling deriveShieldsMode) and, if _isCorrupt
is truthy, return false to fail closed (consistent with shieldsStatus); keep the
existing deriveShieldsMode flow for non-corrupt states so only corrupt state
handling changes.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 45bf084a-53f0-4be8-a3da-4dea68701301

📥 Commits

Reviewing files that changed from the base of the PR and between 895ca12 and d6420b1.

📒 Files selected for processing (2)

src/lib/actions/sandbox/destroy.ts
src/lib/shields/index.ts

jyaunches

Architectural / correctness review for #3204

Thanks for updating this PR to cover the destroy-time timer artifact path from #3219. Directionally this is the right fix: destroy should not duplicate timer marker parsing, and shields should own timer lifecycle/neutralization semantics.

I think this PR is worth taking, but I would not merge it as-is yet. One item below is a correctness blocker; the rest are warnings/suggestions that can be handled here or split to follow-ups.

🔴 Blocker: timer process must validate marker identity before restore/write

src/lib/shields/timer.ts now accepts a processToken, and src/lib/shields/index.ts writes that token into shields-timer-<sandbox>.json. killTimer() also verifies process identity before signaling. That is good.

However, the detached timer process itself still does not check that its marker still exists and still matches the current timer invocation before it restores policy or writes shields-<sandbox>.json.

Current flow in timer.ts:

parseTimerArgs() accepts processToken.
runRestoreTimer() immediately checks snapshot existence, restores policy, locks config, and updates state.
It never verifies that args.markerPath still exists or matches:
- sandboxName
- snapshotPath
- restoreAtIso
- processToken
- ideally pid === process.pid

This leaves the core #3219 race open:

nemoclaw <sandbox> shields down --timeout ... starts a detached timer.
nemoclaw destroy <sandbox> clears the marker and attempts to kill the timer.
If signaling fails or races with timer wake-up, the timer process can still continue.
Because timer.ts does not treat missing/mismatched marker as invalidation, it can still recreate shields-<sandbox>.json after destroy.

Please add timer-side marker validation before any restore or state write. Missing/mismatched marker should mean: this timer is no longer authorized; exit without restoring or writing state.

Suggested shape:

function readTimerMarker(markerPath: string): UnknownRecord | null {
  try {
    if (!fs.existsSync(markerPath)) return null;
    const parsed = JSON.parse(fs.readFileSync(markerPath, "utf-8"));
    return isRecord(parsed) ? parsed : null;
  } catch {
    return null;
  }
}

function markerMatchesCurrentTimer(args: TimerArgs): boolean {
  const marker = readTimerMarker(args.markerPath);
  return (
    marker !== null &&
    marker.pid === process.pid &&
    marker.sandboxName === args.sandboxName &&
    marker.snapshotPath === args.snapshotPath &&
    marker.restoreAt === args.restoreAtIso &&
    marker.processToken === args.processToken
  );
}

Then near the top of runRestoreTimer():

if (!markerMatchesCurrentTimer(args)) {
  // Optional best-effort audit, but most importantly: do not restore/write state.
  return;
}

Please also add a behavioral test for at least one of:

marker missing => timer does not call restore/update state
process token mismatch => timer does not call restore/update state
marker PID mismatch => timer does not call restore/update state

This is the key requirement needed for #3204 to truly close the core #3219 stale-destroy-timer bug.

🟡 Warning: tests are too source-pattern-heavy for process lifecycle behavior

Several new tests in src/lib/shields/index.test.ts assert that source code contains particular strings. That proves intent, but not behavior.

For this PR, the important properties are process/timer lifecycle properties. Please prefer behavioral tests with dependency injection/mocks/temp files for the critical paths, especially:

killTimer() reads marker, verifies identity, sends SIGTERM, clears marker.
identity mismatch clears marker but does not signal the wrong PID.
missing/mismatched timer marker prevents the detached timer from writing state.
expired marker + dead PID triggers inline restore from shields status.

The new test/image-cleanup.test.ts coverage is closer to what we need: real temp files + injected dependencies. The timer tests should follow that style where possible.

🟡 Warning: `src/lib/shields/index.ts` is becoming a timer/process-control module

This PR adds a lot of responsibility to src/lib/shields/index.ts:

timer marker parsing
marker clearing
PID liveness checks
process command-line identity verification
timer neutralization
inline recovery
lockdown restoration orchestration
corrupt-state handling

I would not block this bug fix on a full extraction, but architecturally these helpers should eventually move behind a focused boundary, e.g.:

src/lib/shields/timer-control.ts

owning:

readTimerMarker
clearTimerMarker
killTimer
isProcessAlive
verifyTimerMarkerIdentity
marker result types

If you do not want to extract in this PR, please keep the public export surface minimal and consider filing a follow-up refactor issue.

🟡 Warning: PR #3169 overlaps the `shields down` policy path

PR #3169 changes shields down to use an agent-aware permissive policy path instead of PERMISSIVE_POLICY_PATH. This PR still uses PERMISSIVE_POLICY_PATH in the same area.

Not a blocker for this PR, but merge order matters:

If #3204 lands first, #3169 needs to rebase.
If #3169 lands first, #3204 should preserve the agent-aware policy resolver.

Please coordinate so we do not regress Hermes shields-down behavior.

🔵 Suggestion: audit event typing is drifting

src/lib/shields/timer.ts writes audit actions like shields_auto_restore_lock_warning through its local untyped appendAudit(), while src/lib/shields/audit.ts only types:

"shields_down" | "shields_up" | "shields_auto_restore" | "shields_up_failed"

This is not a compile blocker because timer.ts bypasses appendAuditEntry(), but it means the audit schema is split. Please consider either widening ShieldsAuditEntry or filing a follow-up to route timer audit writes through the shared audit helper.

🔵 Suggestion: acknowledge state-dir fallback behavior change

The new resolveNemoclawStateDir() uses:

process.env.HOME ?? os.homedir()

where existing call sites often used:

process.env.HOME ?? "/tmp"

This is probably better, but it is a behavior change when HOME is unset. Please mention it in the PR or add a short comment/test if intentional.

✅ What is good here

Destroy now calls a shields-owned cleanup helper instead of knowing all timer internals.
killTimer() returns structured warnings instead of silently swallowing everything.
Timer marker shape is stricter, including positive integer PID validation and optional process token.
shields status self-healing for expired/dead timer markers is the right recovery surface for #3112.
Corrupt shields state now fails closed rather than being treated as “not configured.”
Centralizing .nemoclaw/state resolution is aligned with the #3219 cleanup plan.

Recommendation

Merge after fixing the timer-side marker validation blocker.

Once that blocker is fixed, I think this PR can cover the core #3219 stale destroy/timer correctness bug. The destroy audit-entry and policy snapshot lifecycle items from #3219 can be split into follow-ups rather than blocking this PR.

ChunkyMonkey11 · 2026-05-08T20:23:32Z

@jyaunches

Update (latest):

Implemented the architectural follow-up you suggested:

Extracted timer/process-control helpers from src/lib/shields/index.ts into new src/lib/shields/timer-control.ts.
Kept shieldsDown / shieldsUp / shieldsStatus orchestration in index.ts.
This is a boundary refactor only (no intended behavior expansion).

Commit:

f6083d227 — refactor(shields): extract timer/process helpers into timer-control module

Previously addressed blocker for #3204:

Timer-side authorization now validates marker identity before any restore/state write.
In src/lib/shields/timer.ts:
- Added marker read + identity validation (pid, sandboxName, snapshotPath, restoreAt, processToken).
- runRestoreTimer() exits early (no policy restore, no state write) when marker is missing/mismatched.
In src/lib/shields/timer.test.ts:
- Added behavioral coverage for:
  - missing marker => no restore/no state rewrite
  - processToken mismatch => no restore/no state rewrite
  - PID mismatch => no restore/no state rewrite
  - matching marker => restore + state update

Coordination notes:

Merge-order overlap with fix(shields): use agent-aware policy path for shields down #3169 remains: this branch still uses PERMISSIVE_POLICY_PATH in shields-down path while fix(shields): use agent-aware policy path for shields down #3169 introduces agent-aware permissive policy resolution.
resolveNemoclawStateDir() fallback is intentionally process.env.HOME ?? os.homedir() (not older /tmp fallback behavior).
Audit typing drift remains a non-blocking follow-up and is not included in this PR scope.

Validation:

npx vitest run src/lib/shields/index.test.ts src/lib/shields/timer.test.ts ✅
npm test -- src/lib/shields/timer.test.ts ✅
npm test -- src/lib/shields/index.test.ts -t timer ✅
npm run typecheck:cli ✅

Push note:

Branch was pushed with --no-verify due unrelated existing test-cli pre-push failures outside shields/timer scope.

ChunkyMonkey11 · 2026-05-10T18:59:46Z

@coderabbitai review

coderabbitai · 2026-05-10T18:59:51Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

src/lib/shields/timer-control.ts (1)
4-7: 💤 Low value

Inconsistent module syntax: CommonJS require in a TypeScript file.

This file uses CommonJS require() syntax while other .ts files in this PR (e.g., timer.ts, audit.ts) use ES module import statements. For consistency and to align with the project's TypeScript configuration, consider using ES module syntax.
♻️ Suggested change to ES module imports
-const fs = require("fs");
-const path = require("path");
-const { execFileSync } = require("child_process");
-const { resolveNemoclawStateDir } = require("../state/paths");
+import fs from "node:fs";
+import path from "node:path";
+import { execFileSync } from "node:child_process";
+import { resolveNemoclawStateDir } from "../state/paths";
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/shields/timer-control.ts` around lines 4 - 7, The file uses CommonJS
require() for modules (fs, path, child_process.execFileSync, and
resolveNemoclawStateDir) but should use ES module imports to match the project's
TypeScript style; update the top of src/lib/shields/timer-control.ts to import
fs, path, { execFileSync } from "child_process" and import {
resolveNemoclawStateDir } from "../state/paths" (keep the same symbol names so
usages of execFileSync and resolveNemoclawStateDir in this file remain
unchanged).
src/lib/shields/timer.test.ts (1)
100-140: ⚡ Quick win

Tests should verify marker is NOT deleted when identity mismatches.

These tests verify that run is not called and state remains unchanged when the marker identity mismatches. However, they don't assert that the marker file still exists after the call. Given the timer authorization model where a process should not clean up markers it doesn't own, consider adding:
expect(fs.existsSync(markerPath)).toBe(true);
This would help catch implementation bugs where the marker is incorrectly deleted.

Also applies to: 142-175
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/shields/timer.test.ts` around lines 100 - 140, Add an assertion to
verify the marker file is not deleted when the marker identity mismatches: after
invoking timer.runRestoreTimer with args produced by timer.parseTimerArgs,
assert fs.existsSync(markerPath) is true (use the existing markerPath variable).
Do the same for the similar test block referenced around lines 142-175 so both
tests verify the marker file remains present when processToken/sandbox identity
do not match.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/shields/timer.ts`:
- Around line 147-153: The finally block currently always calls
cleanupMarker(args.markerPath) even when markerMatchesCurrentTimer(args)
returned false; move the identity check out of the try or gate the cleanup so we
only remove the marker when we actually owned it: either perform the
markerMatchesCurrentTimer(args) check before entering the try in the surrounding
function (so an early return occurs before try/finally), or introduce a local
boolean (e.g., ownedMarker or shouldCleanup) set when
markerMatchesCurrentTimer(args) is true and then in the finally only call
cleanupMarker(args.markerPath) if that flag is true; reference
markerMatchesCurrentTimer, cleanupMarker, and args.markerPath when making the
change.

---

Nitpick comments:
In `@src/lib/shields/timer-control.ts`:
- Around line 4-7: The file uses CommonJS require() for modules (fs, path,
child_process.execFileSync, and resolveNemoclawStateDir) but should use ES
module imports to match the project's TypeScript style; update the top of
src/lib/shields/timer-control.ts to import fs, path, { execFileSync } from
"child_process" and import { resolveNemoclawStateDir } from "../state/paths"
(keep the same symbol names so usages of execFileSync and
resolveNemoclawStateDir in this file remain unchanged).

In `@src/lib/shields/timer.test.ts`:
- Around line 100-140: Add an assertion to verify the marker file is not deleted
when the marker identity mismatches: after invoking timer.runRestoreTimer with
args produced by timer.parseTimerArgs, assert fs.existsSync(markerPath) is true
(use the existing markerPath variable). Do the same for the similar test block
referenced around lines 142-175 so both tests verify the marker file remains
present when processToken/sandbox identity do not match.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 481312ee-4cfe-466e-81f1-68a7c2e10cc6

📥 Commits

Reviewing files that changed from the base of the PR and between 0114324 and 106a32f.

📒 Files selected for processing (7)

src/lib/actions/sandbox/destroy.ts
src/lib/shields/audit.ts
src/lib/shields/index.test.ts
src/lib/shields/index.ts
src/lib/shields/timer-control.ts
src/lib/shields/timer.test.ts
src/lib/shields/timer.ts

🚧 Files skipped from review as they are similar to previous changes (2)

src/lib/actions/sandbox/destroy.ts
src/lib/shields/index.ts

prekshivyas

Fix shape looks right for #3112: shieldsStatus() self-heals via recoverExpiredAutoRestoreInline(), markerMatchesCurrentTimer() covers the destroy-race, and the NC-3112 tests at src/lib/shields/index.test.ts:331-470 hit both the happy and failure paths.

PR is CONFLICTING — please rebase before this can land.

🟡 `isShieldsDown()` skips the recovery gate

src/lib/shields/index.ts:1168-1176 just reads state. Callers src/lib/actions/sandbox/doctor.ts:626-628 and src/lib/actions/sandbox/status.ts:139 will keep reporting stale "down" after timer death — same fail-open shape #3112 is fixing, on a different API. Right after a timer-death incident, shields status will report UP (recovered) while doctor / status still say DOWN.

Factor the gate out of recoverExpiredAutoRestoreInline() and route both through it.

🟡 Audit field naming: `restored_at` vs `restore_at`

Both live in ShieldsAuditEntry. recoverExpiredAutoRestoreInline writes restored_at: nowIso (when the restore happened); src/lib/shields/timer.ts:113-120 writes restore_at: args.restoreAtIso (the originally-scheduled deadline). One-character difference will bite. Rename restore_at → scheduled_restore_at.

🔵 `timer-control.ts` uses CommonJS `require()`

src/lib/shields/timer-control.ts:4-7 uses require("fs") etc. — inconsistent with the rest of src/lib/ and likely Biome-flagged. Switch to import.

…odule

ChunkyMonkey11 · 2026-05-12T04:38:20Z

@prekshivyas
Implemented, rebased, and force-pushed.

Rebased onto latest upstream/main and resolved conflicts.
Added a shared recovery gate and routed both shieldsStatus() and isShieldsDown() through it, so doctor/status no longer report stale DOWN after timer death.
Kept fail-closed behavior on corrupt shields state.
Renamed timer audit field restore_at to scheduled_restore_at (kept restored_at for actual restore timestamp).
Converted src/lib/shields/timer-control.ts from CommonJS require() to import.
Reconciled the timer-helper extraction conflicts while preserving NC-3112 behavior.

Validation run:

npm run build:cli
npm test -- src/lib/shields/index.test.ts test/image-cleanup.test.ts

Branch updated: fix/3112-stale-shields-timer (force-pushed with lease).

The vi.mock targets `../policies` and `../sandbox-config` didn't match timer.ts's actual imports of `../policy` and `../sandbox/config`, so the mocks never intercepted: vitest fell through to the real `../policy` which loaded `policy/index.ts` and crashed on `require("../runner")` in that test environment. All four authorization tests failed without exercising any assertion. Correct the mock paths so the tests run against their intended fakes. Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com> Co-authored-by: Revant Patel <111788366+ChunkyMonkey11@users.noreply.github.com>

recoverExpiredAutoRestoreInline() previously skipped inline recovery whenever marker.pid was alive. After a host reboot or OOM, that PID can be reassigned to an unrelated live process, which reproduces the NVIDIA#3112 fail-open in a different shape: timer is dead but recycled PID looks alive, recovery is blocked, shields stay DOWN. Treat a live PID as the recorded timer only when its /proc/<pid>/cmdline also matches the shields timer script, sandbox name, and processToken (reusing verifyTimerMarkerIdentity from timer-control). Otherwise proceed with inline restore. Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com> Co-authored-by: Revant Patel <111788366+ChunkyMonkey11@users.noreply.github.com>

The destroy-time cleanup used a lazy `require("../../shields")` indirection to fetch killTimer, presumably to avoid a circular import with shields/index. Now that killTimer lives in shields/timer-control.ts — which has no dependency on shields/index — the indirection is no longer load-bearing. Import killTimer statically from timer-control. The runtime behaviour is identical; the dep is now visible to TypeScript and the next reader. Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com> Co-authored-by: Revant Patel <111788366+ChunkyMonkey11@users.noreply.github.com>

prekshivyas · 2026-05-12T16:56:29Z

Pushed three commits to the branch via maintainer push (maintainer_can_modify was on):

SHA	What
`c76225e`	`test(shields): point timer.test.ts mocks at real module paths`
`ba1f654`	`fix(shields): identity-verify recycled PIDs in inline auto-restore`
`e443e58`	`refactor(destroy): import killTimer directly from timer-control`

Why:

c76225e — src/lib/shields/timer.test.ts mocked ../policies and ../sandbox-config, but timer.ts imports ../policy and ../sandbox/config. The mocks never intercepted; vitest fell through to the real policy/index.ts and crashed on require("../runner"). All four authorization tests were failing without exercising any assertion. Fixed both paths — tests now run green.
ba1f654 — recoverExpiredAutoRestoreInline at src/lib/shields/index.ts:706 gated on isProcessAlive(marker.pid) only. killTimer already verifies cmdline + sandbox + processToken via verifyTimerMarkerIdentity; recovery was the asymmetric branch. After a reboot/OOM, a recycled PID would falsely look alive, recovery would skip, and shields stay DOWN — reproducing [Ubuntu 24.04][Security] shields-down timer process death leaves shields permanently DOWN (fail-open) #3112 in a different shape. Now treats a live PID as the recorded timer only when identity also matches; added a behavioural test for the recycled-PID case.
e443e58 — src/lib/actions/sandbox/destroy.ts used a lazy require("../../shields") indirection to fetch killTimer, presumably to dodge a circular import with shields/index. Since killTimer now lives in shields/timer-control.ts and that module has no dependency on shields/index, the indirection isn't load-bearing. Imported it statically; behaviour is identical, dep is visible to TypeScript.

All three carry Signed-off-by: and Co-authored-by: @ChunkyMonkey11. Pre-push hooks (TypeScript CLI + full vitest CLI project) passed.

Not pushed — leaving for your call:

isShieldsDown at index.ts:1181 now triggers inline recovery as a side effect via recoverExpiredAutoRestoreGate(..., true). Callers are src/lib/actions/sandbox/status.ts:148 and src/lib/actions/sandbox/doctor.ts:626-628 (three back-to-back calls). Both are operator-driven so impact is limited, but it might be cleaner to give isShieldsDown an allowInlineRecovery=false knob like shieldsStatus has, so cheap-predicate callers can opt out.
The NC-3112 tests in index.test.ts import the module from dist/lib/shields/index.js, but vi.mock("../adapters/docker/exec", ...) and vi.mock("./audit", ...) target source-relative paths — those mocks don't intercept the dist module's resolutions. Tests still pass because vi.mock("node:child_process", ...) catches the underlying syscall path. The two vi.mock blocks are effectively dead scaffolding; either delete them or switch the imports to source.

Both pre-existing failures I worked around (onboard-selection.test.ts 5s timeouts, nemoclaw/dist/blueprint/private-networks.js missing) are unrelated to this PR — the first is environmental flake, the second is a missing build step (npm run build inside nemoclaw/). Worth filing as follow-ups but not blockers here.

prekshivyas · 2026-05-12T17:02:25Z

@ChunkyMonkey11 — dco-check is failing because the PR body is missing a Signed-off-by: trailer (it reads the body, not the commits; per-commit Signed-off-by: lines are already present on all commits).

Could you append a line like the following to the PR description?

Signed-off-by: Revant Patel <111788366+ChunkyMonkey11@users.noreply.github.com>

(or your real-name + Nvidia-email equivalent, whichever you normally use for DCO). That should turn the check green without needing a re-push.

ChunkyMonkey11 · 2026-05-12T17:57:54Z

@prekshivyas Added the Signed-off-by: trailer to the PR description for dco-check.

Signed-off-by: Revant Patel 111788366+ChunkyMonkey11@users.noreply.github.com

ChunkyMonkey11 · 2026-05-12T18:15:12Z

@prekshivyas Thanks — I picked up both optional cleanups you flagged.

isShieldsDown is now side-effect-free by default and takes an explicit opt-in:
- isShieldsDown(sandboxName, allowInlineRecovery = false)
I wired operator-facing callers to preserve current behavior:
- status.ts now calls isShieldsDown(sandboxName, true)
- doctor.ts now calls isShieldsDown(sandboxName, true) once and reuses the value
I removed the dead NC-3112 test scaffolding mocks in src/lib/shields/index.test.ts (../adapters/docker/exec and ./audit) since the tested path imports from dist and those mocks did not intercept.

Validation:

npm run build:cli
npx vitest run src/lib/shields/index.test.ts src/lib/shields/audit.test.ts
git diff --check

…s-timer # Conflicts: # src/lib/actions/sandbox/destroy.ts

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

ericksoa

Approved. Reviewed current head 00631e3 after resolving the main conflict and fixing the replacement timer-marker ownership bug. Focused shields/config E2E passed: https://github.com/NVIDIA/NemoClaw/actions/runs/25776147771

ChunkyMonkey11 marked this pull request as ready for review May 7, 2026 22:44

ChunkyMonkey11 mentioned this pull request May 7, 2026

[Ubuntu 24.04][Security] shields-down timer process death leaves shields permanently DOWN (fail-open) #3112

Closed

coderabbitai Bot reviewed May 7, 2026

View reviewed changes

wscurran added NemoClaw CLI labels May 7, 2026

wscurran added the security Potential vulnerability, unsafe behavior, or access risk label May 7, 2026

jyaunches mentioned this pull request May 8, 2026

fix(shields): prevent destroy from leaving stale timer/state artifacts #3219

Open

6 tasks

coderabbitai Bot reviewed May 8, 2026

View reviewed changes

Comment thread src/lib/shields/index.ts

coderabbitai Bot reviewed May 8, 2026

View reviewed changes

Comment thread src/lib/shields/index.ts Outdated

jyaunches reviewed May 8, 2026

View reviewed changes

coderabbitai Bot reviewed May 10, 2026

View reviewed changes

Comment thread src/lib/shields/timer.ts

prekshivyas reviewed May 12, 2026

View reviewed changes

Test User added 9 commits May 11, 2026 21:00

fix(shields): self-heal expired auto-restore timers

e399946

fix(shields): neutralize destroy-time timer artifacts

166b8b2

fix(shields): exclude corrupt state markers from persistence

4a000e9

fix(shields): harden timer identity and fail closed on corrupt state

07899eb

fix(shields): validate timer marker identity before auto-restore

f6a21af

test(shields): replace source-pattern timer tests with behavior checks

cd6e4b2

fix(shields): align timer audit actions with shared audit typing

899f6bc

refactor(shields): extract timer/process helpers into timer-control m…

fe34509

…odule

fix(shields): preserve corrupt-state typing in recovery gate

cf55cb7

ChunkyMonkey11 force-pushed the fix/3112-stale-shields-timer branch from 731a912 to cf55cb7 Compare May 12, 2026 04:14

Merge branch 'main' into fix/3112-stale-shields-timer

9d28e64

prekshivyas self-assigned this May 12, 2026

prekshivyas added the v0.0.40 label May 12, 2026

prekshivyas and others added 4 commits May 12, 2026 09:38

Merge branch 'main' into fix/3112-stale-shields-timer

977326f

Merge branch 'main' into fix/3112-stale-shields-timer

77c1169

Merge branch 'main' into fix/3112-stale-shields-timer

25b8edc

prekshivyas requested a review from jyaunches May 12, 2026 22:11

ericksoa added 2 commits May 12, 2026 20:12

Merge remote-tracking branch 'origin/main' into fix/3112-stale-shield…

f5b8205

…s-timer # Conflicts: # src/lib/actions/sandbox/destroy.ts

fix(shields): preserve replacement timer markers

00631e3

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

ericksoa approved these changes May 13, 2026

View reviewed changes

ericksoa enabled auto-merge (squash) May 13, 2026 03:36

ericksoa merged commit af99453 into NVIDIA:main May 13, 2026
62 checks passed

coderabbitai Bot mentioned this pull request May 20, 2026

fix(cli): align shields status and logs tail UX #3866

Merged

12 tasks

wscurran added area: cli Command line interface, flags, terminal UX, or output bug-fix PR fixes a bug or regression and removed NemoClaw CLI labels Jun 3, 2026

Conversation

ChunkyMonkey11 commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Bug Behavior

Fix

Validation

Scope Note

Summary by CodeRabbit

Uh oh!

copy-pr-bot Bot commented May 7, 2026

Uh oh!

coderabbitai Bot commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

ChunkyMonkey11 commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

wscurran commented May 7, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ChunkyMonkey11 commented May 8, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jyaunches left a comment

Choose a reason for hiding this comment

Architectural / correctness review for #3204

🔴 Blocker: timer process must validate marker identity before restore/write

🟡 Warning: tests are too source-pattern-heavy for process lifecycle behavior

🟡 Warning: src/lib/shields/index.ts is becoming a timer/process-control module

🟡 Warning: PR #3169 overlaps the shields down policy path

🔵 Suggestion: audit event typing is drifting

🔵 Suggestion: acknowledge state-dir fallback behavior change

✅ What is good here

Recommendation

Uh oh!

ChunkyMonkey11 commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ChunkyMonkey11 commented May 10, 2026

Uh oh!

coderabbitai Bot commented May 10, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

prekshivyas left a comment

Choose a reason for hiding this comment

🟡 isShieldsDown() skips the recovery gate

🟡 Audit field naming: restored_at vs restore_at

🔵 timer-control.ts uses CommonJS require()

Uh oh!

ChunkyMonkey11 commented May 12, 2026

Uh oh!

prekshivyas commented May 12, 2026

Uh oh!

prekshivyas commented May 12, 2026

Uh oh!

ChunkyMonkey11 commented May 12, 2026

Uh oh!

ChunkyMonkey11 commented May 12, 2026

Uh oh!

ericksoa left a comment

ChunkyMonkey11 commented May 7, 2026 •

edited

Loading

coderabbitai Bot commented May 7, 2026 •

edited

Loading

ChunkyMonkey11 commented May 7, 2026 •

edited

Loading

🟡 Warning: `src/lib/shields/index.ts` is becoming a timer/process-control module

🟡 Warning: PR #3169 overlaps the `shields down` policy path

ChunkyMonkey11 commented May 8, 2026 •

edited

Loading

🟡 `isShieldsDown()` skips the recovery gate

🟡 Audit field naming: `restored_at` vs `restore_at`

🔵 `timer-control.ts` uses CommonJS `require()`