[Bug] Gateway accumulates zombie processes from MCP servers and subprocess calls #15012

@tudogit

Description

The Hermes gateway (running as PID 1 in Docker) accumulates zombie processes over time from:

  • MCP server processes (gbrain, bun)
  • Git operations
  • Browser automation subprocesses
  • Shell pipe commands (head, etc.)

Evidence

$ ps aux | awk '$8 ~ /Z/ {print}'
hermes      1689  0.0  0.0      0     0 ?        Zs   14:40   0:00 [agent-browser-l] <defunct>
hermes      1902  0.0  0.0      0     0 ?        Zs   14:50   0:00 [git] <defunct>
hermes      1984  0.0  0.0      0     0 ?        Zs   14:52   0:00 [git] <defunct>
hermes      1988  0.0  0.0      0     0 ?        Zs   14:52   0:00 [git] <defunct>
hermes      2861  0.0  0.0      0     0 ?        Z    14:59   0:00 [gbrain] <defunct>
hermes      2862  0.0  0.0      0     0 ?        Z    14:59   0:00 [head] <defunct>
hermes      2863  0.0  0.0      0     0 ?        Z    14:59   0:01 [bun] <defunct>
...

All zombies have PPID=1 (the gateway process), indicating the gateway is not reaping child processes.
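Since `ps aux` does not print the parent PID, the PPID claim can be checked separately. The following is a minimal sketch, assuming a Linux `/proc` filesystem; the helper names `parse_stat` and `list_zombies` are illustrative and not part of Hermes.

```python
import os


def parse_stat(text):
    """Parse the contents of /proc/<pid>/stat into (pid, comm, state, ppid)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left])
    comm = text[left + 1:right]
    fields = text[right + 1:].split()
    state, ppid = fields[0], int(fields[1])
    return pid, comm, state, ppid


def list_zombies(proc="/proc"):
    """Return (pid, comm, ppid) for every process currently in state Z."""
    zombies = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state, ppid = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # process exited (or stat was unreadable) mid-scan
        if state == "Z":
            zombies.append((pid, comm, ppid))
    return zombies


if __name__ == "__main__":
    for pid, comm, ppid in list_zombies():
        print(f"pid={pid} ppid={ppid} comm={comm}")
```

Run inside the container, every line printed with `ppid=1` is a zombie that the gateway itself is responsible for reaping.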

Root Cause

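When running as PID 1, the gateway does not reap terminated child processes: nothing handles SIGCHLD or otherwise calls wait() on children it is not explicitly tracking. On Unix, orphaned processes are reparented to PID 1, which is expected to call wait() on them. Inside the container the gateway is PID 1, so both its own unreaped children and any reparented orphans remain as zombies.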

Proposed Solutions

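  1. In gateway code: Add signal.signal(signal.SIGCHLD, signal.SIG_IGN) so the kernel reaps children automatically, or implement a proper SIGCHLD handler that loops on waitpid(-1, WNOHANG). Caveat: both approaches can discard exit statuses that subprocess-based code elsewhere in the gateway is waiting for, so return-code checks may need to be reviewed.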

  2. In Docker: Run the container with the --init flag (docker run --init), or add tini as PID 1, so that signal forwarding and zombie reaping are handled outside the gateway.

  3. In Dockerfile: Consider using tini as the entrypoint (adjust the path to wherever tini is installed in the image):

    ENTRYPOINT ["/sbin/tini", "--"]
    CMD ["hermes", "gateway", "run"]
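For option 1, a SIGCHLD handler could look roughly like the sketch below. It is only an illustration: `reap_children` and `install_sigchld_reaper` are hypothetical names, and the gateway's actual process management is not shown in this issue.

```python
import os
import signal


def reap_children():
    """Reap every child that has already exited, without blocking.

    Returns a list of (pid, wait_status) pairs.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, status))
    return reaped


def install_sigchld_reaper():
    """Install a SIGCHLD handler that reaps exited children.

    signal.signal() must be called from the main thread.
    """
    signal.signal(signal.SIGCHLD, lambda signum, frame: reap_children())
```

The loop matters because several child exits can be coalesced into a single SIGCHLD delivery. One trade-off to keep in mind: `waitpid(-1, ...)` also collects children started through `subprocess`, and when `Popen.wait()` later finds the child already gone it reports a return code of 0. If the gateway depends on real exit codes, an external init (options 2 and 3) avoids that interaction.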

Environment

  • Hermes version: v0.11.0
  • Deployment: Docker
  • OS: Linux (Debian-based)

Impact

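Zombie processes don't consume CPU and hold almost no memory, but each one occupies a PID slot until it is reaped. Long-running containers may eventually exhaust available PIDs (the kernel's pid_max or the container's pids limit), causing fork() to fail with errors such as "fork: Resource temporarily unavailable" or "fork: cannot allocate memory".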
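To see how close a container is to its limit, the PID counters can be read directly. This is a rough sketch that assumes cgroup v2 with the pids controller visible at `/sys/fs/cgroup` inside the container; the paths differ under cgroup v1, and the function names are illustrative.

```python
def read_limit(path):
    """Return the integer stored in a kernel counter file.

    Returns None if the file is missing or holds the literal 'max'
    (which means no limit is set).
    """
    try:
        with open(path) as f:
            value = f.read().strip()
    except OSError:
        return None
    return None if value == "max" else int(value)


def pid_headroom(current_path="/sys/fs/cgroup/pids.current",
                 max_path="/sys/fs/cgroup/pids.max"):
    """Return how many more PIDs the cgroup may create, or None if unknown."""
    current = read_limit(current_path)
    limit = read_limit(max_path)
    if current is None or limit is None:
        return None
    return limit - current


if __name__ == "__main__":
    print("pid_max:", read_limit("/proc/sys/kernel/pid_max"))
    print("cgroup headroom:", pid_headroom())
```

A headroom value that shrinks steadily while the workload stays constant is consistent with the zombie accumulation described here.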

Metadata

Labels

  • P2: Medium (degraded but workaround exists)
  • area/docker: Docker image, Compose, packaging
  • comp/gateway: Gateway runner, session dispatch, delivery
  • type/bug: Something isn't working
