EmbeddedAttemptSessionTakeoverError: session file changed while embedded prompt lock was released #84059

@wuruofan

Bug Description

After upgrading OpenClaw from 03.13 to 05.18 (2026.5.18), every embedded agent run fails with:

EmbeddedAttemptSessionTakeoverError: session file changed while embedded prompt lock was released: /Users/meow/.openclaw/agents/main/sessions/<session-id>.jsonl

This error occurs on every message sent via the Feishu channel, across all sessions (including fresh sessions created with /new). The issue is 100% reproducible.

Environment

  • OpenClaw version: 2026.5.18 (50a2481)
  • macOS version: Darwin 21.6.0
  • Channel: Feishu (websocket mode)
  • Provider: minimax/MiniMax-M2.7
  • Node version: 22.22.1
  • File system: APFS
  • Compaction mode: safeguard (default)

Root Cause

The error originates in pi-agent-core@0.75.1 (the dependency was @mariozechner/pi-agent-core@0.58.0 in v03.13).

The embedded runner uses a session file fingerprint mechanism to detect session takeover:

  1. releaseForPrompt(): releases the write lock but saves the session file fingerprint (mtimeNs, size, ino, ctimeNs)
  2. [The LLM processes the prompt while the lock is released]
  3. withSessionWriteLock(): re-acquires the lock and calls assertSessionFileFence() to verify that the fingerprint still matches
  4. If the fingerprint has changed since step 1, it throws EmbeddedAttemptSessionTakeoverError
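
The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual pi-agent-core code: the names SessionFingerprint, SessionTakeoverError, captureFingerprint, and assertFingerprintUnchanged are hypothetical stand-ins, and the only assumption taken from this report is that the fingerprint consists of mtimeNs, size, ino, and ctimeNs.

```typescript
import * as fs from "node:fs";

// Hypothetical shape: field names follow the issue text, not the real source.
interface SessionFingerprint {
  mtimeNs: bigint;
  ctimeNs: bigint;
  size: bigint;
  ino: bigint;
}

// Hypothetical stand-in for EmbeddedAttemptSessionTakeoverError.
class SessionTakeoverError extends Error {
  constructor(sessionFile: string) {
    super(`session file changed while embedded prompt lock was released: ${sessionFile}`);
    this.name = "SessionTakeoverError";
  }
}

// Step 1: record the file's identity, size, and timestamps before the lock is released.
function captureFingerprint(sessionFile: string): SessionFingerprint {
  const st = fs.statSync(sessionFile, { bigint: true });
  return { mtimeNs: st.mtimeNs, ctimeNs: st.ctimeNs, size: st.size, ino: st.ino };
}

// Steps 3 and 4: after re-acquiring the lock, compare every field exactly
// and throw if any of them differs from the saved fingerprint.
function assertFingerprintUnchanged(sessionFile: string, saved: SessionFingerprint): void {
  const now = captureFingerprint(sessionFile);
  const same =
    now.mtimeNs === saved.mtimeNs &&
    now.ctimeNs === saved.ctimeNs &&
    now.size === saved.size &&
    now.ino === saved.ino;
  if (!same) {
    throw new SessionTakeoverError(sessionFile);
  }
}
```

With an exact comparison like this, any append to the session file between captureFingerprint() and assertFingerprintUnchanged() trips the check, regardless of which code path performed the write.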

The fingerprint check is overly sensitive: it compares mtimeNs at nanosecond precision, so any change to the file during the prompt-processing window (including internal writes from session hooks, trajectory writers, or filesystem metadata updates) triggers the error.
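
To check which kind of change is actually tripping the fence, two fingerprints taken before and after the prompt window can be compared field by field. The helper below is a hypothetical diagnostic sketch (StatFingerprint, ChangeKind, and classifyChange are not part of OpenClaw or pi-agent-core). It only classifies what changed in the stat data; it cannot tell which process or code path made the change.

```typescript
// Hypothetical fingerprint shape, using the four fields named in this report.
interface StatFingerprint {
  mtimeNs: bigint;
  ctimeNs: bigint;
  size: bigint;
  ino: bigint;
}

type ChangeKind = "unchanged" | "metadata-only" | "content-or-mtime" | "file-replaced";

// Classify the difference between two fingerprints of the same path.
function classifyChange(before: StatFingerprint, after: StatFingerprint): ChangeKind {
  // A different inode means the path now points at a different file
  // (for example after a write-to-temp-then-rename).
  if (after.ino !== before.ino) return "file-replaced";
  // A size or mtime change means the file was written to (or its mtime was set explicitly).
  if (after.size !== before.size || after.mtimeNs !== before.mtimeNs) return "content-or-mtime";
  // Only ctime moved: a metadata-only update such as a permission or ownership change.
  if (after.ctimeNs !== before.ctimeNs) return "metadata-only";
  return "unchanged";
}
```

If the result is "content-or-mtime" with a changed size, the file was genuinely written to, and rounding mtime to whole seconds would not by itself make the two fingerprints equal.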

Key observations

  • Compaction mode is safeguard (default), but no compaction is actually being triggered based on logs
  • No concurrent requests — only one user sending messages
  • Not filesystem time drift: the mtime updates on APFS come from genuine writes within the same process
  • The problem is in pi-agent-core@0.75.1; it was not present in pi-agent-core@0.58.0 (v03.13)

The fingerprint check was introduced in pi-agent-core@0.75.1 to detect when a session file is modified by a competing process during embedded prompt processing. However, it fails to distinguish between:

  • Legitimate internal writes (session hooks, trajectory writers within the same process)
  • External/takeover modifications (actual session hijacking)

Log Excerpt

2026-05-19T08:35:08.679Z info channels/feishu {"subsystem":"channels/feishu"} feishu[default]: dispatching to agent (session=agent:main:feishu:direct:ou_83a5b7cae4c04464194d61ea3613d837)
2026-05-19T08:36:36.842Z error diagnostic {"subsystem":"diagnostic"} lane task error: lane=main durationMs=87458 error="EmbeddedAttemptSessionTakeoverError: session file changed while embedded prompt lock was released: /Users/meow/.openclaw/agents/main/sessions/95b31266-88de-4b33-a1ba-0cb86b3c1303.jsonl"
2026-05-19T08:36:36.846Z error diagnostic {"subsystem":"diagnostic"} lane task error: lane=session:agent:main:feishu:direct:ou_83a5b7cae4c04464194d61ea3613d837 durationMs=87461 error="EmbeddedAttemptSessionTakeoverError: session file changed while embedded prompt lock was released: /Users/meow/.openclaw/agents/main/sessions/95b31266-88de-4b33-a1ba-0cb86b3c1303.jsonl"
2026-05-19T08:36:36.852Z error Embedded agent failed before reply: session file changed while embedded prompt lock was released: /Users/meow/.openclaw/agents/main/sessions/95b31266-88de-4b33-a1ba-0cb86b3c1303.jsonl

Affected Sessions

Every session file that was used after the upgrade triggers this error. Examples:

  • 7b4b64d4-4c3c-467a-a0f7-a0399f1cb009.jsonl
  • 35949831-37bb-40fb-8030-5da1a3454714.jsonl
  • 95ffa57c-87cc-4bf7-804d-5b257af42f01.jsonl
  • 9b997e88-4fb3-4154-a02c-e63e1dac219f.jsonl
  • 95b31266-88de-4b33-a1ba-0cb86b3c1303.jsonl

Attempted Workarounds

  • Running /new to create fresh sessions: does not fix the issue
  • Restarting the gateway: resolves the error temporarily, then it recurs
  • Setting agents.defaults.compaction.mode to off: no effect (the error is not caused by compaction)

Severity

Critical: OpenClaw is completely non-functional via the Feishu channel after the upgrade.

Suggested Fix Direction

The fingerprint check in pi-agent-core@0.75.1 should either:

  1. Exclude internal writes: Mark files written by the same embedded runner process as non-takeover changes
  2. Relax fingerprint precision: Use second-level mtime instead of nanosecond mtimeNs to ignore trivial filesystem metadata updates
  3. Reload fingerprint after internal hooks: If a session hook writes to the session file during prompt processing, refresh the saved fingerprint before assertSessionFileFence()
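
Options 1 and 3 could be combined along these lines. This is a sketch under stated assumptions, not a patch against pi-agent-core: the class SessionFileFence and its methods are hypothetical, and it assumes that all internal writers (hooks, trajectory writers) can be routed through a single helper in the same process.

```typescript
import * as fs from "node:fs";

// Hypothetical fence that treats writes made through appendInternal() as
// non-takeover changes by refreshing the saved fingerprint after each one.
class SessionFileFence {
  private saved: string;

  constructor(private readonly sessionFile: string) {
    this.saved = SessionFileFence.snapshot(sessionFile);
  }

  // Serialize the four fingerprint fields named in this report into one comparable string.
  private static snapshot(sessionFile: string): string {
    const st = fs.statSync(sessionFile, { bigint: true });
    return [st.ino, st.size, st.mtimeNs, st.ctimeNs].join(":");
  }

  // Internal write path: verify nobody else touched the file, write, then refresh.
  // Note: there is still a small window between the check and the write.
  appendInternal(line: string): void {
    this.assertUnchanged();
    fs.appendFileSync(this.sessionFile, line);
    this.saved = SessionFileFence.snapshot(this.sessionFile);
  }

  // Fence check: throws only if the file changed outside appendInternal().
  assertUnchanged(): void {
    if (SessionFileFence.snapshot(this.sessionFile) !== this.saved) {
      throw new Error(
        `session file changed while embedded prompt lock was released: ${this.sessionFile}`
      );
    }
  }
}
```

With this shape, a hook that appends through appendInternal() no longer trips the fence, while a write made directly to the file by another code path or process still does.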

Error class: EmbeddedAttemptSessionTakeoverError
Error message pattern: session file changed while embedded prompt lock was released: {sessionFile}
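
For anyone triaging their own gateway logs, the message pattern above can be matched with a small helper to list the affected session files. This is a hypothetical sketch (extractSessionFile and affectedSessions are illustrative names); it assumes the session path contains no whitespace and ends in .jsonl, as in the log excerpt in this report.

```typescript
// Matches the message pattern from this report and captures the session file path.
const TAKEOVER_MESSAGE =
  /session file changed while embedded prompt lock was released: (\S+\.jsonl)/;

// Return the session file named in a log line, or null if the line does not match.
function extractSessionFile(logLine: string): string | null {
  const match = TAKEOVER_MESSAGE.exec(logLine);
  return match?.[1] ?? null;
}

// Collect the distinct session files mentioned across many log lines.
function affectedSessions(logLines: string[]): string[] {
  const seen = new Set<string>();
  for (const line of logLines) {
    const file = extractSessionFile(line);
    if (file !== null) seen.add(file);
  }
  return Array.from(seen);
}
```

Matching on the message text rather than the class name also catches lines such as "Embedded agent failed before reply: ...", which omit the EmbeddedAttemptSessionTakeoverError prefix.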

Metadata

    Labels

    • P1 (High-priority user-facing bug, regression, or broken workflow.)
    • clawsweeper:needs-live-repro (ClawSweeper needs live local, crabbox, or manual validation to confirm this issue.)
    • impact:message-loss (Channel message delivery can be lost, duplicated, or misrouted.)
    • impact:session-state (Session, memory, transcript, context, or agent state can drift or corrupt.)
    • issue-rating: 🐚 platinum hermit (Good issue quality with a plausible reproduction path needing some confirmation.)
