fix(recovery): keep gateway guard preloads active after respawn#2777
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughSandbox and ciao preload logic now detect gateway context via a computed Changes
Sequence DiagramsequenceDiagram
participant Launcher as Launcher (openclaw gateway)
participant Child as Child Process (openclaw-gateway)
participant SafetyNet as Safety-Net Preload
participant CiaoGuard as Ciao Guard Preload
participant Runtime as Runtime
Launcher->>Child: Re-exec (NODE_OPTIONS includes preloads)
Child->>SafetyNet: Node loads safety-net preload
SafetyNet->>SafetyNet: Compute _gatewayProcess (argv/title/basename)
alt Gateway detected
SafetyNet->>Child: Emit stderr "sandbox-safety-net loaded (...)"
SafetyNet->>Runtime: Enable gateway-only guard behavior
else Not a gateway
SafetyNet->>SafetyNet: Early-return (no gateway behavior)
end
Child->>CiaoGuard: Node loads ciao-network-guard preload
CiaoGuard->>CiaoGuard: Check _gatewayProcess truthiness
alt Gateway detected
CiaoGuard->>Child: Emit stderr "ciao-network-guard loaded (...)"
CiaoGuard->>Runtime: Register uncaughtException handler
else Not a gateway
CiaoGuard->>CiaoGuard: Skip handler registration
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Review rate limit: 8/10 reviews remaining, refill in 10 minutes and 58 seconds. Comment |
Selective E2E Results — ✅ All requested jobs passedRun: 25190082955
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/nemoclaw.ts (1)
317-324: Run the two recovery-focused E2E lanes before merge.Since this changes restart/recovery behavior, validate with
sandbox-survival-e2eandsandbox-operations-e2ein CI to catch post-respawn integration regressions.As per coding guidelines:
src/nemoclaw.tsE2E test recommendation includessandbox-survival-e2eandsandbox-operations-e2e.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/nemoclaw.ts` around lines 317 - 324, This change to restart/recovery behavior in src/nemoclaw.ts (the shell startup/recovery command array that includes checks for NODE_OPTIONS, curl healthcheck to DASHBOARD_PORT, and temp log/lock cleanup) requires running the two recovery-focused E2E lanes before merging: trigger sandbox-survival-e2e and sandbox-operations-e2e for this PR/branch in CI and attach their green runs to the review; if your CI config requires it, add those two jobs to the PR pipeline or re-run the workflow that includes them so their successful results are present prior to merge.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/nemoclaw.ts`:
- Around line 317-324: This change to restart/recovery behavior in
src/nemoclaw.ts (the shell startup/recovery command array that includes checks
for NODE_OPTIONS, curl healthcheck to DASHBOARD_PORT, and temp log/lock cleanup)
requires running the two recovery-focused E2E lanes before merging: trigger
sandbox-survival-e2e and sandbox-operations-e2e for this PR/branch in CI and
attach their green runs to the review; if your CI config requires it, add those
two jobs to the PR pipeline or re-run the workflow that includes them so their
successful results are present prior to merge.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 73818e87-2af2-4d85-9b52-a588458e3627
📒 Files selected for processing (6)
scripts/nemoclaw-start.shsrc/lib/agent-runtime.test.tssrc/lib/agent-runtime.tssrc/nemoclaw.tstest/e2e/test-issue-2478-crash-loop-recovery.shtest/nemoclaw-start.test.ts
Selective E2E Results — ✅ All requested jobs passedRun: 25190568453
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test/nemoclaw-start.test.ts (1)
38-42: ⚡ Quick winEscape the
markerparameter before using it in RegExp constructor.While the current call sites use safe string literals, the
markerparameter should be escaped to prevent issues if it contains regex metacharacters. This is a defensive coding improvement.🛡️ Proposed fix
function startScriptHeredoc(src: string, marker: string): string { - const match = src.match(new RegExp(`<<'${marker}'\\n([\\s\\S]*?)\\n${marker}`)); + const escapedMarker = marker.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); + const match = src.match(new RegExp(`<<'${escapedMarker}'\\n([\\s\\S]*?)\\n${escapedMarker}`)); expect(match).toBeTruthy(); return match![1]; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/nemoclaw-start.test.ts` around lines 38 - 42, The startScriptHeredoc function builds a RegExp from the marker parameter without escaping it; escape marker before constructing the RegExp to avoid treating regex metacharacters as special. Update startScriptHeredoc to create an escapedMarker (e.g. via a small escapeRegExp helper or marker.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')) and use that escapedMarker when building the RegExp passed to src.match so the heredoc boundary is matched literally.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/nemoclaw-start.test.ts`:
- Around line 38-42: The startScriptHeredoc function builds a RegExp from the
marker parameter without escaping it; escape marker before constructing the
RegExp to avoid treating regex metacharacters as special. Update
startScriptHeredoc to create an escapedMarker (e.g. via a small escapeRegExp
helper or marker.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')) and use that
escapedMarker when building the RegExp passed to src.match so the heredoc
boundary is matched literally.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 33554ae1-5d30-467b-9063-4e81737a76a4
📒 Files selected for processing (2)
scripts/nemoclaw-start.shtest/nemoclaw-start.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- scripts/nemoclaw-start.sh
Selective E2E Results — ✅ All requested jobs passedRun: 25190997787
|
Selective E2E Results — ❌ Some jobs failedRun: 25191894511
|
Summary
openclaw-gatewaychild as a gateway process for the sandbox safety-net and ciao guard preloads.os.networkInterfaces()failures.NODE_OPTIONSproduce an observable gateway-recovery warning.Root cause
The failing
issue-2478-crash-loop-recovery-e2erun onmaindid recover the gateway: cycle 1 respawnedopenclaw-gatewayas pid 708. The failure was the guard-chain proof after respawn. The guard scripts only consideredprocess.argv[2] === "gateway"to be a gateway invocation, but the live recovered process is the re-execedopenclaw-gatewaychild. The e2e also waited for the ciao guard'sos.networkInterfaces()failure log, which is an incidental library call and did not fire after the recovered gateway came up.Validation
npm ci --ignore-scriptsnpm run build:clinpm test -- test/nemoclaw-start.test.ts src/lib/agent-runtime.test.tsbash -n scripts/nemoclaw-start.shbash -n test/e2e/test-issue-2478-crash-loop-recovery.shNote: local commit/push hooks were skipped for the final commit/push because the full CLI coverage hook entered a recursive/long-running fixture path; the focused validation above passed.
Summary by CodeRabbit
Bug Fixes
Tests