
[Station][Recovery] openclaw gateway killed — no auto-respawn; parent daemon exits with child; recovery requires nemoclaw connect #2757

@zNeill

Description


After kill -9 on the openclaw-gateway process inside the sandbox, the parent openclaw daemon exits with the child and no auto-respawn occurs. Recovery requires the user to manually run "nemoclaw connect". The expected behavior is that the gateway process is supervised and auto-restarted within 60 seconds without user intervention.
Environment
Device:        Station
OS:            Ubuntu 24.04.4 (aarch64)
Architecture:  aarch64
OpenShell CLI: 0.0.36
NemoClaw:      v0.0.31
Steps to Reproduce
1. Onboard a sandbox and verify it is healthy:
   nemoclaw onboard --name my-assistant --non-interactive
   nemoclaw my-assistant status

2. Find the openclaw-gateway PID inside the sandbox:
   nemoclaw my-assistant connect
   # inside sandbox:
   pgrep -a openclaw

3. Kill the gateway with SIGKILL:
   sudo kill -9 <PID>    # <PID> is the openclaw-gateway PID reported in step 2

4. Exit the sandbox and wait 60 seconds. Do NOT run nemoclaw connect.

5. Check gateway status:
   nemoclaw my-assistant status
Expected Result
Within 60 seconds, the openclaw gateway auto-respawns without user intervention. nemoclaw status shows the sandbox healthy again.
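A small polling helper can make the 60-second check in step 5 repeatable. This is only a sketch: `wait_until` is a hypothetical helper name, and it assumes that `nemoclaw my-assistant status` exits non-zero while the gateway is down, which this report does not confirm. If the command always exits 0, the check would need to inspect its output instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of NemoClaw): poll a command until it
# succeeds or a timeout elapses.
# Usage: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS CMD [ARGS...]
wait_until() {
  local timeout=$1 interval=$2
  shift 2
  local deadline=$(( $(date +%s) + timeout ))
  while true; do
    # Success as soon as the probed command exits 0.
    if "$@"; then
      return 0
    fi
    # Give up once the deadline has passed.
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep "$interval"
  done
}

# Example for step 5 (ASSUMES the status command exits non-zero when unhealthy):
#   wait_until 60 5 nemoclaw my-assistant status \
#     && echo "gateway recovered" || echo "no respawn within 60 s"
```

With the current behavior, the example call would be expected to report no respawn; after a fix, it should report recovery without `nemoclaw connect` being run.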
Actual Result
After kill -9, the parent openclaw daemon exits alongside the gateway child process. No respawn occurs for 60+ seconds. nemoclaw status shows gateway is dead.

Recovery only works after running:
  nemoclaw my-assistant connect

This triggers a manual gateway restart; recovery is not automatic.
Root Cause
src/lib/agent-runtime.ts — gateway is launched with:
  nohup "$AGENT_BIN" gateway run --port ${port} >> /tmp/gateway.log 2>&1 &

There is no SIGCHLD handler, no watchdog timer, and no respawn loop monitoring the gateway PID. If the gateway exits for any reason, the parent daemon has no code to detect and restart it.
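For illustration, the missing respawn loop could look roughly like the sketch below. It is not the project's code or a proposed patch: `supervise` is a hypothetical function name, the restart cap and delay are arbitrary, and a real fix might instead live in the TypeScript runtime or in an external process supervisor.

```shell
#!/usr/bin/env bash
# Hypothetical respawn loop (not NemoClaw code): run a command in the
# foreground and restart it whenever it exits, up to a restart cap.
# Usage: supervise MAX_RESTARTS DELAY_SECONDS CMD [ARGS...]
supervise() {
  local max_restarts=$1 delay=$2
  shift 2
  local restarts=0 status
  while true; do
    status=0
    # Blocks until the child exits, including death by SIGKILL (status 137).
    "$@" || status=$?
    echo "supervise: '$1' exited with status $status" >&2
    if [ "$restarts" -ge "$max_restarts" ]; then
      echo "supervise: giving up after $restarts restarts" >&2
      return 1
    fi
    restarts=$(( restarts + 1 ))
    sleep "$delay"
  done
}

# Example, reusing the variables from the launch line above (illustrative only):
#   supervise 10 2 "$AGENT_BIN" gateway run --port "$port" >> /tmp/gateway.log 2>&1
```

A loop like this only helps if the supervising process itself survives the gateway's death. Since the parent daemon currently exits with the child, the supervisor would have to run outside that parent, or the parent's exit would need to be fixed as well.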
Logs
Confirmed via code analysis: src/lib/agent-runtime.ts (gateway launch, no supervisor logic).
Observed on galaxy-sku2-018 (NemoClaw v0.0.31, Ubuntu 24.04.4 aarch64).

Bug Details

Field        Value
Priority     Unprioritized
Action       Dev - Open - To fix
Disposition  Open issue
Module       Machine Learning - NemoClaw
Keyword      NemoClaw, NEMOCLAW_GH_SYNC_APPROVAL, NemoClaw_Sandbox

[NVB#6130437]

Labels

NV QA             Bugs found by the NVIDIA QA Team
UAT               Issues flagged for User Acceptance Testing
area: sandbox     OpenShell sandbox lifecycle, runtime, config, or recovery
platform: ubuntu  Affects Ubuntu Linux environments
