Authorship note: This report was prepared by Otti, Ruben's OpenClaw assistant, at Ruben's request and based on repeated failures in Ruben's OpenClaw instance.
Summary
OpenClaw 2026.5.19 repeatedly reports file lock stale for <session>.jsonl on active session transcripts during normal assistant/tool use. This looks related to #3092, but the current failure mode is more specific and still occurs on the modern sidecar-lock/session-transcript path:
- the failing path is an active session transcript JSONL, e.g. /home/node/.openclaw/agents/main/sessions/39f57588-0090-4386-af1b-7710d32ecfdc.jsonl
- after the run, there is no corresponding <session>.jsonl.lock left on disk
- lsof / fuser show no holder on the affected transcript file
- a direct lock test can later acquire/release the same transcript path successfully
- disabling Discord tool-progress preview did not eliminate the problem
This suggests a transient sidecar-lock lifecycle/race or stale-recovery edge case, not simply an abandoned lockfile that an operator can safely delete.
Environment
- OpenClaw CLI/Gateway: 2026.5.19
- OS: Linux container on Unraid host
- Gateway: custom bind, 0.0.0.0:18789; connectivity probe ok
- Channels/surfaces where this was seen: Discord #zentrale, #otti-log, Webchat/Dashboard child session, Heartbeat/Main sessions
- Relevant config test: channels.discord.streaming.preview.toolProgress=false was tried for ~24h; stale-lock errors still occurred
openclaw gateway status --deep excerpt:
Gateway: bind=custom (0.0.0.0), port=18789 (env/config)
Probe target: ws://192.168.178.57:18789
Dashboard: http://192.168.178.57:18789/
CLI version: 2026.5.19 (/usr/local/bin/openclaw)
Gateway version: 2026.5.19
Connectivity probe: ok
Capability: admin-capable
Listening: *:18789
Evidence
Recent local scan of session transcripts/trajectories shows repeated file lock stale occurrences, including currently active/recent sessions:
2026-05-27 10:11:14 44 39f57588-0090-4386-af1b-7710d32ecfdc.jsonl
2026-05-27 10:08:31 186 39f57588-0090-4386-af1b-7710d32ecfdc.trajectory.jsonl
2026-05-27 09:00:18 132 c22c5cce-908b-46e1-a951-556af482bbbf.trajectory.jsonl
2026-05-27 09:00:11 13 c22c5cce-908b-46e1-a951-556af482bbbf.jsonl
2026-05-27 03:30:49 7 554bc6bd-db15-4fc0-aea9-4008020e79d9.jsonl
2026-05-27 03:30:49 5 554bc6bd-db15-4fc0-aea9-4008020e79d9.trajectory.jsonl
Top historical counts include Discord-associated sessions and dashboard/webchat child sessions:
66 ee5dcd57-93c7-4b5a-8ccd-3a7cf4621b25.trajectory.jsonl
60 2026-05-26T16-19-09-019Z_b5c1158c-e710-498e-87ac-96850067cc0e.trajectory.jsonl
57 6eae195f-9a9b-47f0-8724-df6535620886.trajectory.jsonl
52 6eae195f-9a9b-47f0-8724-df6535620886.jsonl.reset.2026-05-26T07-50-31.677Z
49 2026-05-26T17-39-38-317Z_c277627e-0b39-4c59-926e-b17b40e92fac.trajectory.jsonl
45 bc3456e5-2f60-49f1-a11e-2ec4485f4747.trajectory.jsonl
42 fdb55c08-ae04-469a-9dbf-de6694907d70.trajectory.jsonl
33 bc3456e5-2f60-49f1-a11e-2ec4485f4747.jsonl
32 39f57588-0090-4386-af1b-7710d32ecfdc.trajectory.jsonl
26 6fbcf0fb-85ac-488c-a1e7-7d6e876e42c7.trajectory.jsonl
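For anyone who wants to reproduce a comparable scan, the following is a minimal sketch, not the exact command used for the counts above (the report does not state how they were produced). It counts lines containing the error text per file, similar to what `grep -c` reports. The function name `count_stale_lock_errors` is hypothetical, and the sketch assumes transcripts, trajectories, and reset copies all carry `.jsonl` somewhere in their filename, as the listings above suggest.

```python
from pathlib import Path

NEEDLE = "file lock stale"


def count_stale_lock_errors(sessions_dir: str, needle: str = NEEDLE) -> list[tuple[int, str]]:
    """Return (matching_line_count, filename) pairs, highest count first.

    Hypothetical helper: scans transcript-like files directly inside
    sessions_dir and skips sidecar .lock files.
    """
    results = []
    for path in Path(sessions_dir).iterdir():
        # Matches <id>.jsonl, <id>.trajectory.jsonl and <id>.jsonl.reset.<timestamp>
        if not path.is_file() or ".jsonl" not in path.name:
            continue
        if path.name.endswith(".lock"):
            continue
        with path.open("r", encoding="utf-8", errors="replace") as fh:
            count = sum(1 for line in fh if needle in line)
        if count:
            results.append((count, path.name))
    # Sort by count (then filename) in descending order
    return sorted(results, reverse=True)
```

Calling `count_stale_lock_errors("/home/node/.openclaw/agents/main/sessions")` would return the pairs in the same count-then-filename layout as the listing above.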
Only one sidecar lockfile currently remains, and it is unrelated to the failing session transcripts:
2026-05-24 14:17 /home/node/.openclaw/agents/main/sessions/.usage-cost-cache.json.lock
For the active/recent failing transcript path, lsof/fuser returned no holder and no <session>.jsonl.lock was present when checked.
Earlier investigation of a concrete affected session found:
OpenClaw version: 2026.5.19
Session: agent:main:dashboard:b18e21f6-e681-4510-8573-9eeb11e7fc01
Transcript: bc3456e5-2f60-49f1-a11e-2ec4485f4747.jsonl
Error: file lock stale for /home/node/.openclaw/agents/main/sessions/bc3456e5-2f60-49f1-a11e-2ec4485f4747.jsonl
No current lockfile: bc3456e5-...jsonl.lock did not exist
Direct SDK locktest on same path: LOCK_OK / RELEASE_OK
lsof/fuser: no process holder on the transcript file
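The direct lock test mentioned above was run through the SDK. As a rough, independent cross-check, a sidecar lock can be approximated with an exclusive file create. This is a sketch under assumptions: it is not the OpenClaw SDK call, the `<path>.lock` naming is taken from this report, and the JSON payload fields (`pid`, `createdAt`) are guesses rather than the real lockfile format. It should only be run against a test path or while the gateway is idle, since it creates a real lockfile.

```python
import json
import os
import time


def try_sidecar_lock(target_path: str) -> str:
    """Atomically create <target_path>.lock; raises FileExistsError if it is held."""
    lock_path = target_path + ".lock"
    # O_CREAT | O_EXCL makes creation atomic: it fails if the lockfile already exists.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    with os.fdopen(fd, "w") as fh:
        # Assumed payload shape, for illustration only.
        json.dump({"pid": os.getpid(), "createdAt": time.time()}, fh)
    return lock_path


def release_sidecar_lock(lock_path: str) -> None:
    """Release the lock by removing the sidecar file."""
    os.unlink(lock_path)


def lock_test(target_path: str) -> list[str]:
    """Acquire and release once, returning the steps that succeeded."""
    try:
        lock_path = try_sidecar_lock(target_path)
    except FileExistsError:
        return ["LOCK_HELD"]
    steps = ["LOCK_OK"]
    release_sidecar_lock(lock_path)
    steps.append("RELEASE_OK")
    return steps
```

A result of `["LOCK_OK", "RELEASE_OK"]` on a previously failing path matches the observation above that nothing persistent is holding the lock.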
What seems to trigger it
It appears most often during runs with overlapping transcript writes/tool-result persistence, especially when multiple tool calls are active or when Discord/Webchat/heartbeat surfaces are involved. The first investigation saw several parallel tool results fail early in a run, all around the same active transcript lock.
Discord was initially suspected because the highest counts appeared in Discord-channel sessions. Reducing Discord live tool-progress preview did not remove the issue, which suggests the problem is lower-level than Discord preview alone.
Expected behavior
If a session transcript is locked by active work, later writes should do one of the following:
- queue/retry with bounded backoff, or
- recover safely when the lock is genuinely stale, or
- emit a diagnostic that identifies the lock owner/payload and why recovery was denied.
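The first option, queue/retry with bounded backoff, could look roughly like the sketch below. All names here (`LockBusyError`, `with_bounded_retry`, the default delays) are hypothetical and chosen for illustration; nothing in this report confirms how OpenClaw structures its transcript writes internally.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class LockBusyError(Exception):
    """Hypothetical error raised when the transcript lock is held by active work."""


def with_bounded_retry(
    op: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.05,
    max_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run op, retrying on LockBusyError with capped exponential backoff and jitter."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(attempts):
        try:
            return op()
        except LockBusyError:
            if attempt == attempts - 1:
                # Retry budget exhausted: surface the failure to the caller.
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter keeps parallel tool-result writers from retrying in lockstep.
            sleep(delay * random.uniform(0.5, 1.0))
    raise AssertionError("unreachable")
```

The retry count and the delay cap bound the total wait, so a genuinely stuck lock still surfaces as an error instead of hanging the run.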
A user-visible assistant/tool run should not accumulate repeated file lock stale tool-result failures while the matching .jsonl.lock file is already gone and the target path is lockable afterwards.
Actual behavior
The session accumulates repeated tool-result failures containing:
file lock stale for /home/node/.openclaw/agents/main/sessions/<session>.jsonl
After the run, the operator cannot find a corresponding lockfile or process holder. Retrying later can succeed, which makes this hard to diagnose and impossible to fix safely from outside by deleting lockfiles.
Relation to #3092
#3092 described older channel-handler lock timeouts on sessions.json.lock during long operations. This report is likely in the same family, but the concrete failure has shifted:
- current version: 2026.5.19, not 2026.1.24-era Clawdbot
- target: per-session transcript JSONL/trajectory path, not only global sessions.json.lock
- failure text: file lock stale for <session>.jsonl
- post-run state: no matching lockfile or holder remains, so manual stale-lock cleanup is not an effective workaround
Suggested investigation direction
- Add lock payload/path diagnostics to the thrown file_lock_stale error: owner pid, createdAt, observed mtime, recovery mode, whether lockfile changed during recovery attempt.
- Audit whether parallel tool-result persistence can attempt competing sidecar locks on the same transcript path from the same process/run.
- Consider bounded retry/backoff for active transcript writes before surfacing file_lock_stale as a tool-result failure.
- If stale recovery intentionally fails closed, expose enough diagnostic context for maintainers/operators to distinguish unsafe third-party stale locks from internal races.
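To make the diagnostics suggestion concrete, here is a sketch of what an enriched error could carry. The class and field names are hypothetical and simply mirror the items listed above (owner pid, createdAt, observed mtime, recovery mode, whether the lockfile changed during recovery). The actual error type in OpenClaw may be structured quite differently.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class LockStaleDiagnostics:
    """Hypothetical context captured at the moment stale recovery is denied."""
    lock_path: str
    owner_pid: Optional[int]            # pid read from the lock payload, if any
    created_at: Optional[float]         # createdAt from the lock payload
    observed_mtime: Optional[float]     # lockfile mtime seen by the failing writer
    recovery_mode: str                  # e.g. "fail-closed" (assumed label)
    lockfile_changed_during_recovery: bool


class FileLockStaleError(Exception):
    """Illustrative stand-in for the file_lock_stale error, with diagnostics attached."""
    code = "file_lock_stale"

    def __init__(self, target_path: str, diagnostics: LockStaleDiagnostics):
        self.target_path = target_path
        self.diagnostics = diagnostics
        details = ", ".join(f"{k}={v!r}" for k, v in asdict(diagnostics).items())
        # Keep the existing message prefix so current log scans still match.
        super().__init__(f"file lock stale for {target_path} ({details})")
```

With a payload like this in the tool-result failure, an operator could tell a same-process race (owner pid equals the gateway pid, lockfile changed during recovery) from a lock left behind by a third party.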