test: fix crash-loop gateway PID detection #4045
No actionable comments were generated in the recent review.
Walkthrough

This PR hardens the E2E crash-loop recovery test by replacing a single `pgrep`-based gateway PID lookup with an in-sandbox `ps`/`awk` selection that prefers explicit gateway argv/comm matches and conditionally falls back to an older `openclaw` process. It also changes the command used for sandbox diagnostics.
Estimated code review effort: 3 (Moderate), ~25 minutes
Pre-merge checks: 5 passed
## Summary

- Revert the crash-loop E2E PID-detector change from #4045.
- #4045 adapted `test-issue-2478-crash-loop-recovery.sh` for the OpenClaw 2026.5.x process-title shape seen after #3820.
- #4051 reverted #3820 and the latest sandbox is back on OpenClaw 2026.4.24, where the pre-#3820 detector is the known-good path.

## Why

- The known-good pre-#3820 run at `80ee341686d695147c5cd118d1049c32f52d5af9` passed `issue-2478-crash-loop-recovery-e2e`.
- The current failing run on reverted main showed `openclaw-gateway` alive, `[gateway] ready`, sandbox `Ready`, and `Agent: OpenClaw v2026.4.24`, but the #4045 detector still returned empty.

## Validation

- `bash -n test/e2e/test-issue-2478-crash-loop-recovery.sh`
- `git diff --check`
- `git diff --stat 80ee341 -- test/e2e/test-issue-2478-crash-loop-recovery.sh` is empty; the test file now matches the known-good pre-#3820 version.

## Summary by CodeRabbit

* **Tests**
  * Enhanced crash-loop recovery testing with simplified process detection and improved diagnostic capabilities for system troubleshooting.
Summary
- Prefer explicit `openclaw gateway` / `openclaw-gateway` matches, then fall back to the oldest live `openclaw` process only after `gateway.log` proves the gateway reached ready/listening
- Replace `openshell sandbox info --name` diagnostics with supported `openshell sandbox get <name>`

Why
Run 26262708948 failed with "Gateway never came up after onboard", but diagnostics showed the gateway was actually healthy:
`pgrep` found `311 openclaw`, `gateway.log` contained `http server listening` and `gateway ready`, and `nemoclaw status` reported sandbox `Phase: Ready` with healthy inference. The E2E matcher only looked for `openclaw gateway` / `openclaw-gateway`, so the current OpenClaw process-title shape produced a false negative.

Validation
- `bash -n test/e2e/test-issue-2478-crash-loop-recovery.sh`
- `git diff --check`
- Selector cases checked: `openclaw gateway run`, retitled `openclaw-gateway`, plain `openclaw` with ready gateway log, and plain `openclaw` without ready log
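
For readers who want to see the selection order concretely, here is a minimal sketch of the idea, not the code from this PR. The function name `select_gateway_pid`, the `PID ELAPSED_SECONDS COMM ARGS...` input layout, and the `1`/`0` readiness flag are all hypothetical choices made for illustration. The sketch reads process rows from stdin so the four validation cases can be replayed without a live sandbox.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not the PR's implementation).
# stdin: one process per line as "PID ELAPSED_SECONDS COMM ARGS..."
# $1:    "1" if gateway.log already shows ready/listening, otherwise "0"
select_gateway_pid() {
  local log_ready="$1"
  awk -v log_ready="$log_ready" '
    {
      pid = $1; etimes = $2 + 0; comm = $3
      # Rebuild the full argv string from the remaining fields.
      args = ""
      for (i = 4; i <= NF; i++) args = args (i > 4 ? " " : "") $i

      # Explicit match: "openclaw gateway ..." in argv, or a retitled
      # "openclaw-gateway". Linux truncates comm to 15 characters, so the
      # comm check matches on the "openclaw-gatewa" prefix.
      if (comm ~ /^openclaw-gatewa/ || args ~ /openclaw[ -]gateway/) {
        if (explicit == "") explicit = pid
        next
      }

      # Fallback candidate: plain "openclaw"; remember the oldest one
      # (largest elapsed time).
      if (comm == "openclaw" && (fallback == "" || etimes > oldest)) {
        fallback = pid; oldest = etimes
      }
    }
    END {
      if (explicit != "") print explicit
      else if (log_ready == "1" && fallback != "") print fallback
      # Otherwise print nothing: the gateway is treated as not yet up.
    }'
}
```

One way to feed it live data, assuming a procps-style `ps` and that the two log phrases quoted in this PR are the readiness markers, would be `ps -eo pid=,etimes=,comm=,args= | select_gateway_pid "$ready"`, with `ready` set to `1` when `grep -Eq 'http server listening|gateway ready' gateway.log` succeeds. Gating the fallback on the log is what keeps a plain `openclaw` process from being mistaken for a healthy gateway before it has started listening.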