[Bug]: Session write-lock timeouts block subagent delivery lanes #86538

@galiniliev

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

Session JSONL write-lock timeouts block the main, cron-nested, and subagent lanes, then surface as delivery and lifecycle failures without enough diagnostics about the lock owner.

Steps to reproduce

  1. Run OpenClaw with main-session and subagent delivery lanes writing to session JSONL files.
  2. Contend a session write lock long enough for lane tasks and final delivery to hit the write-lock timeout.
  3. Observe lane errors and direct announce failure logs.

Expected behavior

Write-lock timeouts should preserve enough evidence about the lock owner (liveness, process start time, lock staleness) for recovery triage, and subagent final output should remain durable even when direct announce cannot acquire the lock.
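
A minimal sketch of what such owner evidence could look like when attached to the timeout message. Every name here (`LockOwnerEvidence`, `classifyOwner`, `formatTimeoutMessage`, the `staleAfterMs` threshold) is hypothetical and not taken from `src/agents/session-write-lock.ts`; it only illustrates the liveness, start-time, and staleness fields requested above.

```typescript
// Hypothetical shape: none of these names come from the OpenClaw source.
interface LockOwnerEvidence {
  pid: number;
  alive: boolean; // was the owner pid still running when the timeout fired?
  ownerStartTimeMs: number | null; // owner process start time, if it could be read
  lockCreatedAtMs: number; // when the lock file was created
}

type OwnerVerdict = "dead-owner" | "pid-reused" | "stale-lock" | "live-owner";

function classifyOwner(
  e: LockOwnerEvidence,
  nowMs: number,
  staleAfterMs: number,
): OwnerVerdict {
  if (!e.alive) return "dead-owner";
  // A process that started after the lock was created cannot be the one that took it.
  if (e.ownerStartTimeMs !== null && e.ownerStartTimeMs > e.lockCreatedAtMs) {
    return "pid-reused";
  }
  if (nowMs - e.lockCreatedAtMs > staleAfterMs) return "stale-lock";
  return "live-owner";
}

function formatTimeoutMessage(
  e: LockOwnerEvidence,
  timeoutMs: number,
  nowMs: number,
  staleAfterMs: number,
): string {
  const verdict = classifyOwner(e, nowMs, staleAfterMs);
  const lockAgeMs = nowMs - e.lockCreatedAtMs;
  return (
    `session file locked (timeout ${timeoutMs}ms): pid=${e.pid} alive=${e.alive} ` +
    `ownerStartTimeMs=${e.ownerStartTimeMs ?? "unknown"} lockAgeMs=${lockAgeMs} verdict=${verdict}`
  );
}
```

With fields like these in the log line, a reader could tell a dead or recycled owner (safe to reclaim) from a live owner that is simply slow, which the current `pid=` output alone does not allow.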

Actual behavior

Gateway logs showed 161 SessionWriteLockTimeoutError lines, 41 subagent lane rejections, and repeated main-session lane errors in the 2026-05-23T14:42:08Z through 2026-05-25T14:42:08Z window.

OpenClaw version

2026.5.25 dev checkout

Operating system

Linux WSL2

Install method

pnpm dev

Model

NOT_ENOUGH_INFO

Provider / routing chain

NOT_ENOUGH_INFO

Additional provider/model setup details

NOT_ENOUGH_INFO

Logs, screenshots, and evidence

Window analyzed: 2026-05-23T14:42:08Z through 2026-05-25T14:42:08Z
Counts:
- 161 lines contained: SessionWriteLockTimeoutError
- 41 lines contained: lane task rejected after timeout
- 158 lines contained: lane task error

Redacted excerpts:
lane task rejected after timeout: lane=subagent timeoutMs=31000 error="SessionWriteLockTimeoutError: session file locked (timeout 60000ms): pid=[redacted pid] [redacted session lock path]"
lane task error: lane=cron-nested durationMs=380507 error="SessionWriteLockTimeoutError: session file locked (timeout 60000ms): pid=[redacted pid] [redacted session lock path]"
[warn] Subagent completion direct announce failed for run [redacted run id]: SessionWriteLockTimeoutError: session file locked (timeout 60000ms): pid=[redacted pid] [redacted session lock path]: code=OPENCLAW_SESSION_WRITE_LOCK_TIMEOUT
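
For anyone re-running the counts, the lane lines above can be tallied with a small parser. This is a sketch only: the regular expressions are derived from the redacted excerpts rather than from a documented log schema, and `parseLaneLine` and `countLockTimeoutsByLane` are hypothetical helper names.

```typescript
// Hypothetical helpers; the patterns mirror the redacted excerpts, not a documented schema.
interface LaneLine {
  kind: "rejected" | "error";
  lane: string;
  laneMs: number; // timeoutMs for rejections, durationMs for errors
  lockTimeoutMs: number | null; // timeout reported by SessionWriteLockTimeoutError, if present
}

const LANE_LINE =
  /lane task (rejected after timeout|error): lane=(\S+) (?:timeoutMs|durationMs)=(\d+)/;
const LOCK_TIMEOUT =
  /SessionWriteLockTimeoutError: session file locked \(timeout (\d+)ms\)/;

function parseLaneLine(line: string): LaneLine | null {
  const lane = LANE_LINE.exec(line);
  if (lane === null) return null;
  const lock = LOCK_TIMEOUT.exec(line);
  return {
    kind: lane[1] === "error" ? "error" : "rejected",
    lane: lane[2] ?? "",
    laneMs: Number(lane[3]),
    lockTimeoutMs: lock === null ? null : Number(lock[1]),
  };
}

// Count lane lines that carry a write-lock timeout, grouped by lane name.
function countLockTimeoutsByLane(lines: string[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const line of lines) {
    const parsed = parseLaneLine(line);
    if (parsed === null || parsed.lockTimeoutMs === null) continue;
    counts[parsed.lane] = (counts[parsed.lane] ?? 0) + 1;
  }
  return counts;
}
```

Parsing both numbers also makes it easy to see that, in the subagent excerpt, the lane's `timeoutMs` (31000) and the lock timeout in the error text (60000ms) are different values.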

Impact and severity

Affected: main session, cron-nested, and subagent delivery lanes.
Severity: High, because completion output can fail to be delivered after the child's work has otherwise finished.
Frequency: 161 timeout lines were observed in the analyzed two-day log window.
Consequence: lifecycle delivery and direct announce can fail behind a session write lock.

Additional information

The implicated code paths are the timeout reporting in src/agents/session-write-lock.ts and the subagent final-delivery persistence.
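
One possible shape for keeping final output durable is to fall back to a queue when direct announce hits the lock timeout. The sketch below is hedged accordingly: `deliverFinalOutput`, `drainOutbox`, and the in-memory `outbox` array are hypothetical stand-ins (a real fix would need a durable store that is not guarded by the same session lock, and would likely be async). It also assumes the error exposes a `code` property matching the `code=OPENCLAW_SESSION_WRITE_LOCK_TIMEOUT` value seen in the warn line.

```typescript
// Hypothetical sketch; not the OpenClaw implementation.
interface FinalOutput {
  runId: string;
  text: string;
}

type DeliveryResult = { status: "announced" } | { status: "queued"; reason: string };

const LOCK_TIMEOUT_CODE = "OPENCLAW_SESSION_WRITE_LOCK_TIMEOUT"; // value seen in the log excerpt

// Assumption: the thrown error carries a `code` property with the value above.
function isLockTimeout(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === LOCK_TIMEOUT_CODE
  );
}

// Try direct announce; on a write-lock timeout, keep the output instead of dropping it.
function deliverFinalOutput(
  output: FinalOutput,
  announce: (o: FinalOutput) => void,
  outbox: FinalOutput[], // stand-in for a durable store outside the session lock
): DeliveryResult {
  try {
    announce(output);
    return { status: "announced" };
  } catch (err) {
    if (!isLockTimeout(err)) throw err;
    outbox.push(output);
    return { status: "queued", reason: LOCK_TIMEOUT_CODE };
  }
}

// Retry queued outputs in order; stop at the first lock timeout and leave the rest queued.
function drainOutbox(announce: (o: FinalOutput) => void, outbox: FinalOutput[]): number {
  let delivered = 0;
  while (outbox.length > 0) {
    const next = outbox[0];
    if (next === undefined) break;
    try {
      announce(next);
    } catch (err) {
      if (isLockTimeout(err)) break;
      throw err;
    }
    outbox.shift();
    delivered += 1;
  }
  return delivered;
}
```

The design choice illustrated here is that only the lock-timeout error is treated as retryable; any other failure still propagates, so the fallback does not mask unrelated delivery bugs.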

Metadata

Assignees

No one assigned

    Labels

    - P1: High-priority user-facing bug, regression, or broken workflow.
    - clawsweeper:linked-pr-open: ClawSweeper found an open linked pull request for this issue.
    - clawsweeper:no-new-fix-pr: ClawSweeper does not recommend queueing a new automated fix PR for this issue.
    - clawsweeper:source-repro: ClawSweeper found a high-confidence source-level issue reproduction.
    - impact:message-loss: Channel message delivery can be lost, duplicated, or misrouted.
    - impact:session-state: Session, memory, transcript, context, or agent state can drift or corrupt.
    - issue-rating: 🦞 diamond lobster: Very strong issue quality with high-confidence source-level or clear reproduction.
