Bug type
Bug / async completion routing
Summary
Async task completion reporting is unreliable when external task runners (for example Codex via exec) try to notify OpenClaw with:
openclaw system event --text "...done..." --mode now
In practice this can fail in two ways:
- the CLI call itself fails with local gateway websocket errors such as gateway closed (1006 abnormal closure), target ws://127.0.0.1:18789
- even when the call succeeds conceptually, system event / wake is not session-targeted, so completion reporting does not reliably route back to the originating user conversation
This makes long-running background tasks feel "done but never reported" unless the user asks again.
Environment
- OpenClaw: 2026.3.13 (61d171a)
- OS: macOS
- Gateway mode: local
- Gateway bind: loopback
- Messaging surface: Telegram direct chat
- Typical runner: exec background task launching Codex / external orchestrator
What I observed
I reproduced this with a simple background coding task:
- Start a long-ish Codex task via background exec
- Ask Codex to run this on completion:
openclaw system event --text "Done: Built a concise responsive login page in the temp project directory" --mode now
- The coding task finishes successfully
- The completion notification does not arrive in the originating Telegram conversation
In one captured run, Codex did execute the command, but it failed with:
gateway closed (1006 abnormal closure)
Gateway target: ws://127.0.0.1:18789
At the same time the actual task output/files were present, confirming the work completed.
Root cause analysis
After tracing the current implementation, this looks like a product/architecture gap rather than only a single transport glitch:
A. openclaw system event is implemented as a wake
The CLI path does not act like a reliable completion callback. It effectively does:
- enqueue system event text
- request heartbeat
B. wake is not session-targeted
The current wake path does not carry sessionKey in the relevant CLI flow.
That means the event is not reliably bound to the originating conversation that launched the async task.
C. heartbeat defaults to the agent's main session when no forced session is provided
So even if the wake/system event path works, it does not guarantee delivery back to the original user thread / DM that triggered the task.
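The behavior described in A to C can be illustrated with a toy model. This is only a sketch of the routing as I understand it, not OpenClaw's code: the Gateway class, the wake method, the "main" key, and the example session key are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

MAIN_SESSION = "main"  # hypothetical key for the agent's main session


@dataclass
class Gateway:
    """Toy model of the wake path described above; not OpenClaw's code."""

    queues: Dict[str, List[str]] = field(default_factory=dict)

    def wake(self, text: str, session_key: Optional[str] = None) -> str:
        # Enqueue the event text, then "request a heartbeat". Without a
        # forced session, the heartbeat targets the main session.
        target = session_key or MAIN_SESSION
        self.queues.setdefault(target, []).append(text)
        return target


gateway = Gateway()
origin = "telegram:dm:1234"  # hypothetical key of the chat that launched the task

# Current CLI flow: no session key travels with the event.
assert gateway.wake("Done: built login page") == MAIN_SESSION
assert origin not in gateway.queues

# With explicit targeting, the same event reaches the originating chat.
assert gateway.wake("Done: built login page", session_key=origin) == origin
```

In this model the first call "succeeds", yet nothing is ever queued for the conversation that started the task, which matches what I see in Telegram.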
D. local websocket fragility makes it worse
From external task runners, the local gateway websocket path can also fail with 1006 abnormal closure, so the fallback notification bridge is itself not reliable.
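As a runner-side stopgap, the notification call can be retried with backoff. The sketch below assumes the CLI exits non-zero when the gateway connection fails, which I have not verified. notify_once and notify_with_retry are hypothetical helper names; only the command line itself comes from this report. Retrying does nothing for the routing gap in A to C.

```python
import subprocess
import time
from typing import Callable


def notify_once(text: str, cli: str = "openclaw") -> bool:
    """Run the command from this report once; True if it exited with status 0."""
    cmd = [cli, "system", "event", "--text", text, "--mode", "now"]
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        # Binary missing, not executable, or the call hung.
        return False
    # Assumption: a gateway failure such as 1006 yields a non-zero exit code.
    return proc.returncode == 0


def notify_with_retry(
    text: str,
    attempts: int = 4,
    base_delay: float = 1.0,
    send: Callable[[str], bool] = notify_once,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry the notification with exponential backoff between attempts."""
    for attempt in range(attempts):
        if send(text):
            return True
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    return False
```

A runner would call notify_with_retry("Done: ...") in place of the bare command. The send and sleep parameters exist only so the loop can be exercised without a gateway.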
Why this matters
This creates a bad UX for background tasks:
- task actually completes
- OpenClaw may know something happened
- user still gets no completion report
- user has to manually ask "is it done?"
This is especially noticeable for:
- Codex / ACP tasks launched from chat
- background exec jobs
- external orchestrators like ClawTeam
Expected behavior
At least one of these should be true:
- openclaw system event / wake supports an explicit sessionKey and reliably wakes the originating session
- async exec completion events preserve originating session context automatically
- there is a first-class completion notification path for background tasks that can deliver to the originating channel/session without depending on main-session heartbeat inference
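To make the second and third options concrete, here is a rough sketch of a completion path that captures the originating session when the task is launched and always reports back to it. Every name here (Origin, CompletionEvent, run_background_task, the deliver callback, the example session key) is hypothetical and only illustrates the shape of the idea, not an existing OpenClaw API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Origin:
    session_key: str  # hypothetical identifier of the launching session
    channel: str      # for example "telegram"


@dataclass(frozen=True)
class CompletionEvent:
    task_id: str
    origin: Origin
    ok: bool
    summary: str


def run_background_task(
    task_id: str,
    origin: Origin,
    work: Callable[[], str],
    deliver: Callable[[CompletionEvent], None],
) -> None:
    """Run work and always report the outcome to the captured origin."""
    try:
        event = CompletionEvent(task_id, origin, True, work())
    except Exception as exc:
        event = CompletionEvent(task_id, origin, False, f"failed: {exc}")
    deliver(event)


# Usage: delivery is keyed by the origin captured at launch time,
# so no main-session inference is involved.
inbox: Dict[str, List[str]] = {}


def deliver_to_inbox(event: CompletionEvent) -> None:
    inbox.setdefault(event.origin.session_key, []).append(event.summary)


origin = Origin(session_key="telegram:dm:1234", channel="telegram")
run_background_task("task-1", origin, lambda: "Done: built login page", deliver_to_inbox)
assert inbox == {"telegram:dm:1234": ["Done: built login page"]}
```

The point of the design is that the session context travels with the task itself, so the runner never has to rediscover where to report.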
Related work already in the repo
This seems closely related to:
- --session-key support to system wake/event
- sessionKey in exec/hooks to fix async context loss
Suggestion
I suspect the real fix is not just transport retry. The bigger gap is that system event / wake is currently used as if it were a completion callback, but it is really an internal wake/heartbeat mechanism.
So the best fix is probably one or both of:
- explicit session targeting for wake/system-event entry points
- a first-class completion notification mechanism for async/background tasks
If useful, I can provide a more detailed repro timeline and the exact local logs / Codex transcript snippets that showed:
- successful task completion
- attempted openclaw system event
- failure with gateway closed (1006 abnormal closure)
- no proactive Telegram completion report