OpenClaw 2026.5.26 — Native hook relay bridge never spawns; all native/local tool calls return "Native hook relay unavailable"
Status: DRAFT — reporter Adam Houk, with AI agents ARC + Claude as investigation tools. Pending post.
Target repo: https://github.com/openclaw/openclaw/issues
Suggested labels: bug, native-hook-relay, codex, 2026.5.26
Reporter: Adam Houk, Applied AI Solutions — adam@appliedai.solutions
Summary
After a fresh gateway start on OpenClaw 2026.5.26, the native hook relay bridge HTTP server is never created and no registration file is written to /tmp/openclaw-native-hook-relays-<uid>/. Every subsequent native/local tool call (shell, file read, MCP memory) fails immediately with "Native hook relay unavailable". OpenClaw dynamic tools (cron, web_fetch, MCP-only surfaces) continue to work because they don't depend on the native hook relay path.
The bug is reproducible across both a main gateway profile (openclaw-gateway.service) and a custom profile (openclaw-beacon-gateway.service). Gateway restarts do not recover. Cleaning stale registration files does not help — the issue is that no new registration is ever written for the current PID.
Environment
- OpenClaw: 2026.5.26
- Codex plugin: @openclaw/codex (bundled), transport: stdio, mode: yolo, sandbox: danger-full-access, requestTimeoutMs: 60000, turnCompletionIdleTimeoutMs: 60000
- Discord plugin enabled (channel surface in use)
- AgentMemory plugin v0.9.21 (/agentmemory/* MCP routes)
- Ubuntu 22.04 in WSL2 on Windows 11
- Node.js bundled with the gateway runtime
- Gateway run as a systemd user service (linger=yes)
Repro steps
- Fresh gateway start (systemctl --user restart openclaw-gateway.service).
- Wait ~30s for [gateway] ready in the log.
- From any channel (Discord DM, web chat, etc.), trigger any native/local tool call: date, file read, MCP memory_sessions, anything that goes through the native hook relay.
- Result: tool call fails with "Native hook relay unavailable". Behavior persists indefinitely.
Expected behavior
On gateway start, the native hook relay bridge HTTP server is created on a random 127.0.0.1 port, and a registration file is written to /tmp/openclaw-native-hook-relays-<uid>/<key>.json with the gateway's PID, port, bearer token, and expiry. Subsequent invocations from the Codex app-server (via pre_tool_use, permission_request, etc.) find this file, POST to the port with the bearer token, and the relay handles the request.
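To make the expected contract concrete, here is a minimal sketch of the shape and expiry check a consumer of the registration file might perform before using it. Only the `pid` field name is confirmed by the files on disk; `port`, `token`, and `expiresAtMs` are hypothetical names for the port, bearer token, and expiry described above, and `isRecordWellFormedAndFresh` is an illustrative helper written for this report, not OpenClaw API. PID liveness would be a separate check.

```javascript
// Sketch only: `pid` is the one field name confirmed by the files on disk.
// `port`, `token`, and `expiresAtMs` are hypothetical names for this report.
function isRecordWellFormedAndFresh(record, nowMs = Date.now()) {
  if (record === null || typeof record !== "object") return false;
  if (!Number.isInteger(record.pid) || record.pid <= 0) return false;
  if (!Number.isInteger(record.port) || record.port < 1 || record.port > 65535) return false;
  if (typeof record.token !== "string" || record.token.length === 0) return false;
  // An expired record must be ignored even if everything else looks valid.
  return typeof record.expiresAtMs === "number" && record.expiresAtMs > nowMs;
}
```

Under these assumptions, every stale file in the evidence below would fail a liveness check even if it passed this one, which is why a fresh record for the current PID is required.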
Actual behavior
- /tmp/openclaw-native-hook-relays-<uid>/ contains only stale registration files from prior gateway sessions whose PIDs are no longer alive and whose ports are no longer listening.
- No registration file is ever written for the current gateway PID.
- Every native hook invocation falls through to renderNativeHookRelayUnavailableResponse and renders the "Native hook relay unavailable" message.
Evidence
1. Bridge directory snapshot during repro
$ ls -la /tmp/openclaw-native-hook-relays-1000/
drwx------ 2 user user 120 May 28 01:39 .
drwxrwxrwt 15 root root 3020 May 28 01:40 ..
-rw------- 1 user user 197 May 27 23:04 2413c88d....json
-rw------- 1 user user 196 May 27 14:29 5835f8d6....json
-rw------- 1 user user 196 May 27 15:05 654149....json
-rw------- 1 user user 196 May 27 19:51 90c38fd2....json
(Directory mtime updated by ExecStartPre cleanup at gateway start, but no new file created since.)
2. PIDs in registration files are all dead
Each *.json file's pid field corresponds to a process that no longer exists.
$ for f in /tmp/openclaw-native-hook-relays-1000/*.json; do
    pid=$(python3 -c "import json; print(json.load(open('$f'))['pid'])")
    echo "$f -> pid=$pid alive=$(kill -0 $pid 2>/dev/null && echo yes || echo no)"
  done
.../2413c88d....json -> pid=211925 alive=no
.../5835f8d6....json -> pid=11601 alive=no
.../654149....json -> pid=17733 alive=no
.../90c38fd2....json -> pid=90952 alive=no
3. Ports in registration files are not listening
$ ss -ltnp | grep 127.0.0.1
LISTEN 127.0.0.1:18789 openclaw-gateway pid=226303 # gateway HTTP server
LISTEN 127.0.0.1:18791 pid=226303 # browser control
# Ports 44387, 37109, 46593, 38385 (from registration files) — NOT LISTENING
4. Current gateway PID has no registration
The live gateway PID (226303) appears in ss -ltnp listening on 18789/18791, but is not the pid in any registration file. No file exists for the current relay key.
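The same check can be scripted so it is easy to re-run after each restart. This is a minimal sketch that relies only on the `pid` field seen in the existing files; `findRegistrationsForPid` is a helper name made up for this report.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Returns the names of registration files whose `pid` field equals `pid`.
// Unreadable or non-JSON files are skipped rather than treated as matches.
function findRegistrationsForPid(dir, pid) {
  let names;
  try {
    names = fs.readdirSync(dir);
  } catch (err) {
    if (err.code === "ENOENT") return []; // directory was never created
    throw err;
  }
  return names.filter((name) => {
    if (!name.endsWith(".json")) return false;
    try {
      const record = JSON.parse(fs.readFileSync(path.join(dir, name), "utf8"));
      return record !== null && record.pid === pid;
    } catch {
      return false;
    }
  });
}

// Usage, with the directory and PID from the evidence above:
// findRegistrationsForPid("/tmp/openclaw-native-hook-relays-1000", 226303)
```

An empty result for the live gateway PID matches what was observed by hand.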
Suspected root cause
In dist/native-hook-relay-AN6S_wz5.js, function registerNativeHookRelayBridge:
relayBridges.set(registration.relayId, bridge);
server.on("error", (error) => {
  log.debug("native hook relay bridge server error", { error, relayId: ... });
});
server.listen(0, "127.0.0.1", () => {
  if (relayBridges.get(registration.relayId) !== bridge) return;
  writeNativeHookRelayBridgeRecordForRegistration(registration, bridge);
});
if (relayBridges.get(registration.relayId) === bridge)
  writeNativeHookRelayBridgeRecordForRegistration(registration, bridge);
server.unref();
And writeNativeHookRelayBridgeRecordForRegistration:
const address = bridge.server.address();
if (!address || typeof address === "string") {
  log.debug("native hook relay bridge server address unavailable", { relayId });
  return;
}
// ... write the JSON file
The registration-file write is attempted in two places:
- Immediately after server.listen(): at this point the bind has not completed yet and server.address() returns null. writeNativeHookRelayBridgeRecordForRegistration correctly bails on null.
- Inside the listen() callback: after bind completes, when server.address() should return the bound port.
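Hypothesis: in our build, either (a) the listen() callback is never being invoked, or (b) by the time it fires, relayBridges.get(registration.relayId) !== bridge (i.e., the bridge has been replaced/unregistered between create and bind-complete). The ss output in evidence item 3 shows no extra loopback listener owned by PID 226303; if that listing is complete, then under (b) the replaced bridge's server must also have been closed, and (a), or a bind that is never started, becomes the more likely explanation.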
The server.on("error", ...) handler only logs at debug level — so any silent bind failure or premature unregistration is invisible without LOG_LEVEL=debug or equivalent.
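The two write sites can be exercised in isolation. The sketch below mirrors the pattern from the bundle excerpt using plain `node:http`; `relayBridges`, `startBridge`, and the result object are stand-ins written for this report, not the OpenClaw implementation. It is meant to show why the synchronous attempt sees no address, and how replacing the map entry before the bind completes (hypothesis b) would suppress the callback write as well.

```javascript
const http = require("node:http");

const relayBridges = new Map(); // stand-in for the module-level map in the bundle

// Mirrors the register pattern: one write attempt right after listen(),
// one inside the listen() callback. Resolves with what each attempt saw.
function startBridge(relayId, { replaceBeforeBind = false } = {}) {
  return new Promise((resolve, reject) => {
    const server = http.createServer();
    const bridge = { server };
    const result = { syncAddress: undefined, callbackWrote: false };
    relayBridges.set(relayId, bridge);

    server.on("error", reject); // surfaced here instead of logged at debug
    server.listen(0, "127.0.0.1", () => {
      // Same guard as the excerpt: skip the write if the bridge was replaced.
      if (relayBridges.get(relayId) === bridge) {
        result.callbackWrote = server.address() !== null;
      }
      server.close(() => resolve(result));
    });

    // Immediately after listen(): the bind has not completed yet.
    result.syncAddress = server.address();

    // Simulates another code path re-registering the same relayId.
    if (replaceBeforeBind) relayBridges.set(relayId, { server: null });
  });
}
```

With `replaceBeforeBind: true` the server still binds, but neither write site produces a record, which would look exactly like the observed symptom except that a listener would briefly exist.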
Mitigations attempted (all failed to recover)
- Cleaned stale registration files (rm /tmp/openclaw-native-hook-relays-<uid>/*.json) and restarted gateway → directory wiped, but no new file written by the restarted gateway.
- Added ExecStartPre to the systemd unit to wipe the dir before each gateway start → confirmed wipe happens, confirmed no new file appears.
- Mirrored codex.config.appServer knobs (transport: stdio, requestTimeoutMs: 60000, turnCompletionIdleTimeoutMs: 60000, codeModeOnly: false) from a known-good config → no effect on the bridge registration.
- Safe-restart via scripts/openclaw-safe-restart.sh (graceful supervisor restart) → no effect on the bridge registration.
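A narrower alternative to wiping the whole directory is to delete only records whose `pid` is no longer alive. This does not address the missing registration, but it is the behavior proposed under "Suggested fix paths". The sketch relies only on the `pid` field; `isPidAlive` and `pruneStaleRegistrations` are helper names made up for this report.

```javascript
const fs = require("node:fs");
const path = require("node:path");

function isPidAlive(pid) {
  try {
    process.kill(pid, 0); // signal 0 tests for existence without signalling
    return true;
  } catch (err) {
    return err.code === "EPERM"; // exists, but owned by another user
  }
}

// Deletes *.json records in `dir` whose `pid` is a valid integer for a dead
// process. Files that cannot be parsed, or lack a pid, are left untouched.
function pruneStaleRegistrations(dir) {
  const removed = [];
  for (const name of fs.readdirSync(dir)) {
    if (!name.endsWith(".json")) continue;
    const file = path.join(dir, name);
    let pid;
    try {
      pid = JSON.parse(fs.readFileSync(file, "utf8")).pid;
    } catch {
      continue;
    }
    if (!Number.isInteger(pid) || pid <= 0 || isPidAlive(pid)) continue;
    fs.unlinkSync(file);
    removed.push(name);
  }
  return removed;
}
```

A PID can be reused by an unrelated process, so a production version would probably also compare the record's expiry or probe the recorded port.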
Workaround
Channels and tools that don't go through the native hook relay continue to work:
- OpenClaw dynamic tools (cron, web_fetch)
- MCP-based surfaces (AgentMemory MCP, GitHub MCP)
- Direct shell access via channels outside OpenClaw
Agent productivity is severely degraded because the affected tool set includes file reads, native shell, and MCP memory queries from inside agent turns. Local-first agents cannot inspect their own workspace.
Suggested fix paths
- Surface the silent server.on("error", ...) log at WARN/ERROR level so silent bind failures are visible without DEBUG.
- Recover automatically when the listen() callback doesn't write a registration within a timeout: re-attempt registration, or surface a startup error that prevents the gateway from reporting ready.
- Document a config knob to disable the native hook relay so users can fall back to pure-MCP operation when the bridge fails.
- Auto-prune stale registration files at gateway start (currently only cleaned on graceful shutdown, not on crash/SIGUSR1).
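The first two fix paths could be combined along these lines. This is a minimal sketch using only `node:http` and core server events; `listenWithTimeout` is a hypothetical helper, not an existing OpenClaw function, and the timeout value is left to the caller.

```javascript
const http = require("node:http");

// Resolves with the bound address, or rejects if the bind errors or does not
// complete within `timeoutMs`. The caller decides whether to retry or abort.
function listenWithTimeout(server, timeoutMs, host = "127.0.0.1") {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => {
      cleanup();
      reject(new Error(`bind did not complete within ${timeoutMs} ms`));
    }, timeoutMs);

    function onError(error) {
      cleanup();
      reject(error); // surface bind failures instead of logging at debug only
    }
    function onListening() {
      cleanup();
      resolve(server.address());
    }
    function cleanup() {
      clearTimeout(timer);
      server.off("error", onError);
      server.off("listening", onListening);
    }

    server.once("error", onError);
    server.once("listening", onListening);
    server.listen(0, host); // port 0 asks the OS for a free port
  });
}
```

On rejection the caller would close the server, log at warn or error, and either retry the registration or fail startup so the gateway never reports ready without a working bridge.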
Reproducible test case
A standalone Node script that exercises registerNativeHookRelayBridge directly would isolate whether the bug is in the bridge module itself or in how the surrounding lifecycle code invokes it. We have not built this yet, but can if helpful.
Additional debug info to capture
- LOG_LEVEL=debug npx openclaw gateway run boot transcript to surface the silent server.on("error", ...) events.
- Output of strace -f -e trace=bind,listen against the gateway PID at startup to confirm whether listen(0, 127.0.0.1) is even being called.
- Comparison of the broken state vs a known-good prior release (last working version on this user's machine was [TO BE FILLED IN]; Adam, can you check npm-cache or shell history for the prior pinned version?).
Investigation team
Reporter: Adam Houk — Applied AI Solutions. Discovered the symptom on a production install, drove the investigation, owns the repro.
AI agents used to gather evidence:
- ARC: my custom OpenClaw agent ("Applied Responsive Companion") running on the affected install. Observed the symptom from inside the broken environment, identified that OpenClaw dynamic tools and MCP-based surfaces continue to work as a partial workaround.
- Claude (Anthropic, running in Cowork on my desktop): ran the diagnostic from outside OpenClaw via direct WSL shell, identified that no registration file exists for the current gateway PID, and traced the failure to the registerNativeHookRelayBridge lifecycle in the dist bundle.
Channel for follow-up: adam@appliedai.solutions or this GitHub issue thread.
Optional debug info we can add post-filing
- LOG_LEVEL=debug boot transcript to surface the silent server.on("error", ...) events.
- Last known-good OpenClaw version on this install (from npm cache / install history).
- Comparison to a fresh fork-clone build of HEAD if the regression is post-2026.5.26.
I'm happy to attach any of these to the issue thread once a maintainer responds.