Skip to content

fix(embedded-agent-runner): false positive EmbeddedAttemptSessionTakeoverError during session compaction#88348

Open
raymondjxj wants to merge 1 commit into
openclaw:mainfrom
raymondjxj:fix/session-fence-compaction-false-positive
Open

fix(embedded-agent-runner): false positive EmbeddedAttemptSessionTakeoverError during session compaction#88348
raymondjxj wants to merge 1 commit into
openclaw:mainfrom
raymondjxj:fix/session-fence-compaction-false-positive

Conversation

@raymondjxj

Copy link
Copy Markdown

Bug Summary

Error: EmbeddedAttemptSessionTakeoverError: session file changed while embedded prompt lock was released
Symptom: OpenClaw bot stops processing messages, repeating lane task error + Embedded agent failed before reply
Root cause location: src/agents/embedded-agent-runner/run/attempt.session-lock.ts, sessionFenceRewriteIsBenign()

Root Cause

When a prompt is being streamed, releaseForPrompt() releases the write lock on the session file and sets fenceActive = true. While the lock is released (during LLM streaming), OpenClaw's compaction process may rewrite the session file to compact old messages.

When reacquireAfterPrompt() re-acquires the lock and calls assertSessionFileFence(), the fence check fails because:

  1. sessionFenceRewriteIsBenign has sameSessionFileIdentity() (inode comparison) as its first guard condition (line 266)
  2. Compaction creates a new file with a new inode even though the content is valid
  3. sameSessionFileIdentity returns false → function returns false immediately without reaching content validation
  4. Both sessionFenceAdvanceIsBenign and sessionFenceRewriteIsBenign return false
  5. EmbeddedAttemptSessionTakeoverError is thrown (false positive)

The Fix

Remove sameSessionFileIdentity() from the early-return guard in sessionFenceRewriteIsBenign. The function already rigorously validates content via lineMatchesLinearTranscriptMigration + isTranscriptOnlyOpenClawAssistantLine. When content validation passes, the rewrite is safe regardless of whether the inode changed — a compaction rewrite with new inode is benign.

- !sameSessionFileIdentity(params.previous.fingerprint, params.current) ||
  params.current.size > BigInt(MAX_BENIGN_SESSION_FENCE_REWRITE_RESULT_BYTES) ||

And the terminal return statement becomes a content-check + true:

- return appendedLines.every(isTranscriptOnlyOpenClawAssistantLine);
+ if (!appendedLines.every(isTranscriptOnlyOpenClawAssistantLine)) {
+   return false;
+ }
+ // Content validation passed — benign rewrite regardless of inode.
+ return true;

Why This Is Safe

The lineMatchesLinearTranscriptMigration + isTranscriptOnlyOpenClawAssistantLine content validation rigorously confirms:

  • All previous transcript lines are preserved unchanged
  • Only legitimate OpenClaw assistant lines were appended

If a malicious process tried to fake a "compaction", it would need to produce content that passes this validation AND still trigger the takeover — which is impossible by design.

Reproduction Scenario

  1. User sends message to Telegram bot (ling agent)
  2. DeepSeek starts streaming response
  3. releaseForPrompt() releases write lock (to allow streaming)
  4. Compaction triggers and rewrites ~/.openclaw/agents/ling/sessions/{session-id}.jsonl
  5. reacquireAfterPrompt() detects file changed → throws EmbeddedAttemptSessionTakeoverError
  6. Bot enters error loop: lane task error + Embedded agent failed before reply

Testing

  • No existing unit tests cover the sessionFenceRewriteIsBenign rewrite-with-new-inode scenario
  • TypeScript compiles cleanly with the change
  • Manual verification: restart OpenClaw and observe bot processes messages without EmbeddedAttemptSessionTakeoverError

Fix branch: raymondjxj/fix/session-fence-compaction-false-positive
File changed: src/agents/embedded-agent-runner/run/attempt.session-lock.ts

…ce check

sessionFenceRewriteIsBenign rejected all rewrites where the file inode
changed, even when content validation passed. When compaction rewrites
the session file (new inode, smaller size), the early
sameSessionFileIdentity check caused both benign-check functions to
return false, triggering a false-positive EmbeddedAttemptSessionTakeoverError.

Fix: remove sameSessionFileIdentity from the early-return guard in
sessionFenceRewriteIsBenign. The function already rigorously validates
content via lineMatchesLinearTranscriptMigration. When content validation
passes, the rewrite is safe regardless of whether the inode changed —
a compaction rewrite with new inode is benign.
@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: XS triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 30, 2026
@clawsweeper

clawsweeper Bot commented May 30, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge. Reviewed May 30, 2026, 10:16 AM ET / 14:16 UTC.

Summary
The PR removes the same-inode guard from sessionFenceRewriteIsBenign so content-valid session-file rewrites with a different inode can be treated as benign.

PR surface: Source +9. Total +9 across 1 file.

Reproducibility: unclear. source inspection shows current main rejects different-inode rewrites before content validation, but I did not establish that OpenClaw's actual compaction path rewrites the same active session file with a new inode.

Review metrics: 1 noteworthy metric.

  • Session fence guard relaxed: 1 inode-identity check removed. This is the safety check that prevents a replacement file from being trusted solely because its JSONL contents look benign.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🧂 unranked krab
Patch quality: 🧂 unranked krab
Result: blocked until real behavior proof is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P1] Add redacted real behavior proof for the reported compaction/reacquire scenario, such as logs or terminal output showing the pre-fix failure and after-fix result.
  • Replace the broad content-only trust with a narrow owned/trusted compaction path and focused regression tests.

Proof guidance:

  • [P1] Needs real behavior proof before merge: The PR body states manual verification, but it does not include logs, terminal output, screenshots, recordings, or linked artifacts showing the after-fix compaction scenario; contributor-provided redacted proof is still needed. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Mantis proof suggestion
The reported failure is a Telegram bot stop-processing loop, so a live Telegram proof would materially improve confidence after the session-fence repair is narrowed. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

telegram live: verify a Telegram agent continues replying after embedded session compaction while a prompt stream is active.

Risk before merge

  • [P1] The relaxed rewrite path could accept an unowned replacement file that preserves prior lines and appends OpenClaw-looking assistant entries, weakening takeover detection.
  • [P1] The PR does not show after-fix real behavior proof for the Telegram/DeepSeek compaction scenario, and the inspected compaction code does not clearly match the claimed same-path new-inode rewrite.
  • [P1] No focused regression test proves that actual compaction is benign while arbitrary replacement files are still rejected.

Maintainer options:

  1. Preserve ownership proof before merge (recommended)
    Keep replacement-file acceptance tied to an owned/trusted compaction write or another narrow provenance signal, and add a regression test that still rejects unowned replacements.
  2. Accept content-only trust explicitly
    Maintainers could intentionally decide that content-valid replacement files are safe, but that should be an explicit session-fence policy choice with tests and documented threat assumptions.
  3. Pause for a real reproduction
    If the actual compaction path cannot be reproduced, pause this branch until logs or a live run show which write path needs to be trusted.

Next step before merge

  • [P1] Maintainer review is needed before repair because the patch changes a session takeover safety boundary and still needs proof of the actual compaction failure path.

Security
Needs attention: The patch weakens a session takeover safety check by trusting replacement-file contents without ownership or inode provenance.

Review findings

  • [P1] Keep replacement writes tied to owned compaction — src/agents/embedded-agent-runner/run/attempt.session-lock.ts:262-269
Review details

Best possible solution:

Keep the takeover fence strict by tying any accepted replacement to an owned or trusted compaction write, or to a narrowly validated compaction-specific path, with focused session-lock coverage and real run proof.

Do we have a high-confidence way to reproduce the issue?

Unclear: source inspection shows current main rejects different-inode rewrites before content validation, but I did not establish that OpenClaw's actual compaction path rewrites the same active session file with a new inode.

Is this the best way to solve the issue?

No: accepting all content-valid replacement files is broader than the compaction problem; the safer solution is an owned/trusted compaction path plus regression coverage.

Full review comments:

  • [P1] Keep replacement writes tied to owned compaction — src/agents/embedded-agent-runner/run/attempt.session-lock.ts:262-269
    By removing the inode identity guard for every rewrite, an unowned process can replace the session file with one that preserves the previous lines and appends OpenClaw-looking assistant entries, and assertSessionFileFence() will trust it as benign. The inspected compaction code acquires the session lock and appends compaction entries, while rotation writes a successor file, so this should be fixed by proving and trusting the actual compaction write path rather than accepting arbitrary replacement files by content alone.
    Confidence: 0.86

Overall correctness: patch is incorrect
Overall confidence: 0.86

AGENTS.md: found and applied where relevant.

Codex review notes: reasoning high; reviewed against d5be702f86e7.

Label changes

Label justifications:

  • P1: The PR targets an agent/channel workflow that can stop processing messages, and the proposed fix touches the session takeover guard for active runs.
  • merge-risk: 🚨 security-boundary: Removing the inode identity requirement can weaken the boundary between owned OpenClaw writes and external session-file replacement.
  • merge-risk: 🚨 session-state: A trusted replacement file can change the transcript state observed after prompt-lock reacquisition.
  • rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🧂 unranked krab.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR body states manual verification, but it does not include logs, terminal output, screenshots, recordings, or linked artifacts showing the after-fix compaction scenario; contributor-provided redacted proof is still needed. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.
Evidence reviewed

PR surface:

Source +9. Total +9 across 1 file.

View PR surface stats
Area Files Added Removed Net
Source 1 11 2 +9
Tests 0 0 0 0
Docs 0 0 0 0
Config 0 0 0 0
Generated 0 0 0 0
Other 0 0 0 0
Total 1 11 2 +9

Security concerns:

  • [medium] Replacement file can bypass takeover detection — src/agents/embedded-agent-runner/run/attempt.session-lock.ts:262
    Dropping the identity guard means a different inode can be accepted if its content matches the helper's limited benign-rewrite checks, which weakens protection against unowned session transcript replacement.
    Confidence: 0.84

What I checked:

Likely related people:

  • Vincent Koc: Blame and symbol history show the current embedded session-lock fence helper and identity guard were introduced in commit 27dce6c6bb191141a23b8a2c80573f890f4195f5. (role: recent area contributor; confidence: high; commits: 27dce6c6bb19; files: src/agents/embedded-agent-runner/run/attempt.session-lock.ts)
  • Peter Steinberger: Recent compaction transcript-rotation work touched the adjacent compaction successor path relevant to the PR's claimed replacement/new-inode behavior. (role: recent adjacent contributor; confidence: medium; commits: 912f66317334; files: src/agents/embedded-agent-runner/compaction-successor-transcript.ts, src/agents/embedded-agent-runner/compaction-successor-transcript.test.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P1 High-priority user-facing bug, regression, or broken workflow. merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. merge-risk: 🚨 security-boundary 🚨 May affect sandboxing, authorization, credentials, or sensitive data. labels May 30, 2026
@byungskers

Copy link
Copy Markdown

I don't think the new-inode acceptance is safe in this shape. Dropping sameSessionFileIdentity(...) means any replacement file that preserves the prior prefix and appends OpenClaw-looking assistant lines can now pass the benign-rewrite path, even if it was not produced by the owned compaction/write flow. If the real issue is a compaction-specific rewrite, I'd be more comfortable proving that exact path and tying the exception to owned/trusted compaction provenance rather than accepting all content-valid replacement files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling merge-risk: 🚨 security-boundary 🚨 May affect sandboxing, authorization, credentials, or sensitive data. merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. P1 High-priority user-facing bug, regression, or broken workflow. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. size: XS status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants