
[Bug]: Gateway hard-crashes with 0xC0000409 (STATUS_STACK_BUFFER_OVERRUN) on Windows during Mattermost streaming reply; auto-respawn frequently wedges #71699

@mathiticus3

Description


Bug

The gateway crashes hard on Windows with exit code 3221226505 (0xC0000409, STATUS_STACK_BUFFER_OVERRUN) during normal operation: incoming Mattermost channel events arrive while the embedded acpx runtime is mid-inference. The user-visible symptom is a Mattermost post the bot stopped editing mid-stream: the bot creates a post, edits it once or twice with partial content, then dies before sending the rest, leaving Mattermost holding the half-finished message.

This is distinct from #64253 (gateway alive but unresponsive) — here the process exits, with a memory-corruption status code. After the crash, the Windows Scheduled Task auto-respawns the gateway, but the respawned instance frequently fails to complete starting channels and sidecars… (CPU-pegged, no Mattermost connect log line, never replies to inbound). A hard kill + clean re-trigger is needed to recover, and stuck-session trajectories pile up under ~/.openclaw/agents/main/sessions/.

Symptoms

  1. Hard crash, exit 0xC0000409. Get-ScheduledTaskInfo -TaskName 'OpenClaw Gateway' reports LastTaskResult: 3221226505 after each crash.
  2. Half-finished Mattermost post. The final state has update_at ≈ create_at + a few seconds, never updated again, with content cut off mid-sentence. Example: a bot reply finalized as "The current year according to the provided" (42 chars, no closing punctuation, update_at - create_at = 2562 ms).
  3. No "post-mortem" log line. The runtime log just stops at the last gateway/ws RPC response or agent/embedded bootstrap warning. No stack trace, no error event in the runtime log.
  4. Post-restart wedging. The auto-respawned gateway often binds the port, logs ready (6 plugins…), then sits at "starting channels and sidecars…" with no Mattermost connect line. CPU stays >80% on a single core; node holds 4 ESTABLISHED connections to Ollama (127.0.0.1:11434) but zero to Mattermost. After ~3 min it sometimes does connect, but only after multiple slow RPCs (chat.history, models.list) report 30+ second durations in the WS log.

Pattern: 5–15 minutes between restart and next crash under steady-state Mattermost activity.
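
For triage scripting, note that the decimal LastTaskResult and the hex NTSTATUS are the same unsigned 32-bit value. A minimal helper (illustrative, not part of OpenClaw) makes the check explicit:

```python
# STATUS_STACK_BUFFER_OVERRUN as defined in the Windows NTSTATUS values;
# Get-ScheduledTaskInfo reports it as the unsigned decimal 3221226505.
STATUS_STACK_BUFFER_OVERRUN = 0xC0000409

def decode_last_task_result(last_task_result: int) -> str:
    """Render LastTaskResult in the hex form Windows documentation uses."""
    return f"0x{last_task_result & 0xFFFFFFFF:08X}"

def is_stack_buffer_overrun(last_task_result: int) -> bool:
    """True when the task exit code is STATUS_STACK_BUFFER_OVERRUN."""
    return (last_task_result & 0xFFFFFFFF) == STATUS_STACK_BUFFER_OVERRUN
```

This is handy in a monitoring script that polls the scheduled task and tallies how often the 0xC0000409 crash recurs.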

Environment

  • OpenClaw: 2026.4.23 (a979721) (npm install)
  • Node.js: v24.15.0
  • OS: Windows 11
  • Gateway service: Windows Scheduled Task OpenClaw Gateway running node …\openclaw\dist\index.js gateway --port 18789, bind lan, auth token
  • Channels enabled: Mattermost only (Mattermost Team Edition v11.6.1 over Tailscale, plain HTTP)
  • Agent model: ollama-local/llama3.1:8b (local Ollama on the same host)
  • MEMORY.md: 18,848 chars (truncated to 12,000 at every session bootstrap; the warning fires for every channel + DM session)
  • Plugins loaded: acpx, browser, device-pair, mattermost, phone-control, talk-voice (6)
  • Cron: 1 enabled job (pcs-redfin-sync-daily, fires at 5 AM ET; not active during the crashes I observed)

Mattermost config (channels.mattermost):

{
  "name": "lab-1",
  "enabled": true,
  "botToken": "<redacted>",
  "baseUrl": "http://<mm-host>.<tailnet>.ts.net:8065",
  "network": { "dangerouslyAllowPrivateNetwork": true },
  "dmPolicy": "open",
  "groupPolicy": "open"
}

The bot is a member of 5 channels.

Reproduction

  1. Configure Mattermost channel as above. Set dmPolicy: open + groupPolicy: open so both DMs and channel messages flow.
  2. Set agent model to ollama-local/llama3.1:8b (or any local Ollama backend that produces multi-second streamed responses).
  3. From a Mattermost user, send @openclaw <prompt that produces multi-line output> to a channel the bot is in. Repeat across 5–10 messages over 5–15 min.
  4. Observe at least one bot reply in Mattermost where the post was created, edited a couple of times, then frozen mid-sentence with no further update_at changes.
  5. Check Get-ScheduledTaskInfo -TaskName 'OpenClaw Gateway'; LastTaskResult will be 3221226505.
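
Step 4 can be automated against the Mattermost v4 REST API. The sketch below assumes the standard GET /api/v4/channels/{channel_id}/posts endpoint; BASE_URL, TOKEN, and the 10 s / 60 s windows are placeholders, and the heuristic simply encodes the symptom described above:

```python
import json
import urllib.request

BASE_URL = "http://mm-host:8065"   # placeholder for the Tailscale base URL
TOKEN = "<redacted>"               # bot token from channels.mattermost

def looks_stalled(post: dict, now_ms: int, min_idle_ms: int = 60_000) -> bool:
    """Heuristic for a half-finished bot reply: the post was edited
    within a few seconds of creation (streaming started), then never
    touched again for at least `min_idle_ms`."""
    edited_window = post["update_at"] - post["create_at"]
    idle = now_ms - post["update_at"]
    return 0 < edited_window < 10_000 and idle > min_idle_ms

def fetch_channel_posts(channel_id: str) -> list[dict]:
    """Pull recent posts for one channel via the Mattermost v4 API."""
    req = urllib.request.Request(
        f"{BASE_URL}/api/v4/channels/{channel_id}/posts",
        headers={"Authorization": f"Bearer {TOKEN}"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return list(data["posts"].values())
```

Running looks_stalled over the bot's recent posts after each test burst flags the frozen replies without manually eyeballing update_at timestamps.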

Log slice (last entries before death)

Trimmed from ~/.openclaw/Local/Temp/openclaw/openclaw-2026-04-25.log. Note the sequence: ANSI-escape-laden gateway/ws RPC responses with 48-second durations on routine usage.cost / sessions.usage calls, followed by an agent/embedded bootstrap for a Mattermost session, then nothing.

2026-04-25T12:57:09.894-04:00 [INFO] gateway/ws res "channels.status" 1839ms
2026-04-25T12:57:18.380-04:00 [INFO] plugins   mattermost: registered slash command callback at /api/channels/mattermost/command
2026-04-25T12:57:27.941-04:00 [WARN] plugins   1 plugin(s) failed to initialize (validation: device-pair). Run 'openclaw plugins list' for details.
2026-04-25T12:57:55.546-04:00 [WARN] agent/embedded   workspace bootstrap file MEMORY.md is 18848 chars (limit 12000); truncating in injected context
                                       (sessionKey=agent:main:mattermost:channel:<channel-id>)
2026-04-25T12:57:57.264-04:00 [INFO] gateway/ws res "usage.cost" 48239ms
2026-04-25T12:57:57.346-04:00 [INFO] gateway/ws res "sessions.usage" 48328ms
2026-04-25T12:58:11.848-04:00 [INFO] gateway/ws res "node.list" 52ms
2026-04-25T12:58:11.905-04:00 [WARN] agent/embedded   workspace bootstrap file MEMORY.md is 18848 chars (limit 12000); truncating in injected context
                                       (sessionKey=agent:main:mattermost:direct:<user-id>)
<<< process exits 0xC0000409, no further log lines >>>

Stuck session trajectories left behind:

8889f05c-….trajectory.jsonl    2,495,462 bytes   last write 12:58:58
704ce0ef-….trajectory.jsonl      341,880 bytes   last write 12:59:57

Suggested investigation

  • Memory corruption / stack overrun likely originates in a native module or a large-buffer copy in the agent/embedded ↔ Ollama path. The repeated MEMORY.md truncation warning (running on every session bootstrap because the file exceeds the 12 KB injected limit) is a candidate hot path. Worth checking the truncation code for off-by-one / unsafe writes when input size > limit by ~50%.
  • The 48-second usage.cost / sessions.usage RPCs immediately before death suggest the event loop was stalled (likely on disk I/O or an Ollama HTTP call) while WS frames piled up. A blocked event loop combined with a corrupt Buffer write would line up with 0xC0000409.
  • Investigate device-pair plugin validation error (1 plugin(s) failed to initialize (validation: device-pair)) — appears in every restart even though device-pair is in the loaded list. Probably benign but adds noise.
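
To make the first bullet concrete, here is a reference sketch of the invariants a safe bootstrap truncation should hold: never exceed the limit, and never end on a split code unit. This is not OpenClaw's actual code; it only pins down the behavior the truncation path should be audited against:

```python
def truncate_bootstrap(text: str, limit: int = 12_000) -> str:
    """Reference behavior for injecting MEMORY.md into session context:
    return at most `limit` characters and never end on an unpaired high
    surrogate (relevant if the runtime counts UTF-16 code units, as JS
    string lengths do). Illustrative sketch, not the real implementation."""
    if len(text) <= limit:
        return text
    cut = text[:limit]
    # Drop a trailing lone high surrogate rather than emit half a pair.
    if cut and 0xD800 <= ord(cut[-1]) <= 0xDBFF:
        cut = cut[:-1]
    return cut
```

The audit question for the real code is whether the 18,848 → 12,000 path ever writes past the destination buffer or slices mid-code-unit; a property test against invariants like these would surface an off-by-one quickly.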

Workaround

External watchdog scheduled task that probes /health every 60 s, kills lingering node …openclaw\dist\index.js gateway processes, and re-triggers the gateway task after 2 consecutive failures with a 5-minute restart cooldown. Recovers from both this crash and the post-crash wedging in #64253.
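
A minimal sketch of such a watchdog, assuming the gateway exposes /health on the configured port. The names here are illustrative, and taskkill /IM node.exe is coarser than the real watchdog (which matches only the gateway's command line):

```python
import subprocess
import urllib.request

HEALTH_URL = "http://127.0.0.1:18789/health"  # assumes a /health endpoint
TASK_NAME = "OpenClaw Gateway"

def probe_health(timeout: float = 5.0) -> bool:
    """One health probe; any network/HTTP error counts as a failure."""
    try:
        with urllib.request.urlopen(HEALTH_URL, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

def should_restart(consecutive_failures: int, last_restart: float,
                   now: float, threshold: int = 2,
                   cooldown: float = 300.0) -> bool:
    """Restart only after `threshold` consecutive failed probes and
    outside the 5-minute cooldown, matching the workaround above."""
    return consecutive_failures >= threshold and now - last_restart >= cooldown

def restart_gateway() -> None:
    """Kill lingering gateway processes, then re-trigger the task.
    Coarse: /IM node.exe kills every node process on the host."""
    subprocess.run(["taskkill", "/F", "/IM", "node.exe"], check=False)
    subprocess.run(["schtasks", "/Run", "/TN", TASK_NAME], check=False)
```

Wiring probe_health into a 60 s loop that feeds should_restart reproduces the described behavior: two consecutive failures trigger a kill + re-run, and the cooldown prevents a restart storm while the gateway is still wedging.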
