[Linux/systemd] Gateway restart from within agent session causes SIGTERM crash loop #32348

@mohfoda1982-create

Summary

On Linux with systemd, triggering a gateway restart from within a running agent session causes the session to be killed via SIGTERM, often leading to a crash loop.

Environment

  • OpenClaw: 2026.3.1
  • OS: Ubuntu 24.04 (Linux 6.17.0)
  • Install: npm global (sudo npm i -g openclaw)
  • Gateway: systemd user service (openclaw-gateway.service)
  • Gateway bind: loopback

Steps to Reproduce

  1. Run gateway as systemd user service
  2. From within an active agent session, run systemctl --user restart openclaw-gateway.service (any other restart mechanism, such as openclaw gateway restart or nohup bash restart.sh &, behaves the same way)
  3. The SIGTERM sent for the restart reaches every process in the gateway's cgroup, which includes the exec tool child process
  4. Session dies mid-execution
  5. Systemd auto-restarts the gateway, but the new instance cannot bind the port if the old process has not fully exited → crash loop (observed 284 restart attempts over ~2 hours)

Root Cause

The exec tool runs shell commands as child processes inside the gateway's systemd cgroup. When systemd stops the gateway service with KillMode=control-group (the systemd default), it sends SIGTERM to every process in that cgroup, including the exec subprocess. nohup does not help, because it only makes the process ignore SIGHUP and leaves it in the same cgroup. systemd-run --user did not fully escape this either, apparently because the command still ended up within the same user session cgroup.
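To check the cgroup claim on a given machine, a small helper like the following may be useful. It is a minimal sketch, assuming cgroup v2 (a single "0::/..." line in /proc/<pid>/cgroup, as on Ubuntu 24.04) and assuming the unit name is the last path component, which does not hold if the service uses sub-cgroups. The function names cgroup_unit and unit_of_pid are hypothetical and not part of OpenClaw.

```shell
# Print the last path component of a cgroup v2 line such as
# "0::/user.slice/.../openclaw-gateway.service".
cgroup_unit() {
    _path=${1##*:}               # drop the "0::" prefix, keep the cgroup path
    printf '%s\n' "${_path##*/}" # keep only the last path component
}

# Print the unit that a PID (default: the current shell) appears to run in.
unit_of_pid() {
    _pid=${1:-$$}
    cgroup_unit "$(grep '^0::' "/proc/$_pid/cgroup")"
}
```

If the analysis above is right, running unit_of_pid through the exec tool should print openclaw-gateway.service, while the same call from a separate terminal should print a different unit or scope.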

Impact

  • Gateway becomes unreachable requiring manual intervention
  • Crash loop can persist for hours (observed 284 restart attempts over ~2 hours)
  • Any in-flight agent sessions are lost

Workaround

  • Use kill -HUP <gateway-pid> for config reloads (non-disruptive)
  • Only do full restarts manually from outside the gateway session
  • Set KillMode=control-group and TimeoutStopSec=15 in the service file so that leftover processes are terminated within 15 seconds of a stop, which prevents port conflicts on restart
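The service-file settings from the last workaround could be applied as a systemd drop-in rather than by editing the unit itself. The following is a sketch only: the function name write_gateway_dropin and the file name override.conf are assumptions for illustration (systemd reads any *.conf file in the drop-in directory), and the two settings are exactly the ones listed above.

```shell
# Write a drop-in containing the workaround settings into the given directory.
write_gateway_dropin() {
    _dir=$1
    mkdir -p "$_dir" || return 1
    cat > "$_dir/override.conf" <<'EOF'
[Service]
KillMode=control-group
TimeoutStopSec=15
EOF
}
```

For a user service the directory would normally be "$HOME/.config/systemd/user/openclaw-gateway.service.d", followed by systemctl --user daemon-reload. Both steps, and the next restart, should be run from a terminal outside any agent session.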

Suggestion

  1. Provide a safe out-of-band restart mechanism (e.g. a dedicated restart endpoint or a pre-installed systemd oneshot service that runs in its own scope)
  2. Document clearly that openclaw gateway restart must never be called from within an agent session on Linux/systemd
  3. Consider making the gateway exec tool subprocess escape the cgroup via systemd-run --scope
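As a rough illustration of suggestion 3, a wrapper could launch a command in its own transient scope under the user manager. This is an untested sketch against OpenClaw: run_in_own_scope and the RUNNER override are hypothetical names, and only the systemd-run --user --scope invocation itself is standard systemd. Whether this is enough to survive a gateway stop would need to be verified, given the observation above that systemd-run --user did not fully escape.

```shell
# Run a command in its own transient scope instead of the caller's cgroup.
# RUNNER can be set to "echo" to print the command line without running it.
run_in_own_scope() {
    ${RUNNER:-systemd-run} --user --scope -- "$@"
}
```

A dry run such as RUNNER=echo run_in_own_scope systemctl --user restart openclaw-gateway.service prints the systemd-run arguments without touching the service. Even if the restart command survives, the agent session that issued it would still be terminated along with the gateway.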

Relevant Log

Mar 02 17:00:01 systemd: openclaw-gateway.service: Scheduled restart job, restart counter is at 7
Mar 02 17:00:03 systemd: openclaw-gateway.service: Main process exited, code=exited, status=1/FAILURE
[...repeated 284 times over ~2 hours...]
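To quantify a crash loop like this one, the restart lines can be counted from the journal. The helper below is a small sketch; the name count_restarts is hypothetical, and it simply counts lines containing the "Scheduled restart job" message shown in the log above.

```shell
# Count "Scheduled restart job" lines in journal text read from stdin.
count_restarts() {
    grep -c 'Scheduled restart job'
}
```

It could be fed with journalctl --user -u openclaw-gateway.service | count_restarts. On reasonably recent systemd versions, systemctl --user show -p NRestarts openclaw-gateway.service reports the restart counter directly.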

Labels: stale (marked as stale due to inactivity)