fix(agent-core): allow benign session rewrites with different inode by raymondjxj · Pull Request #88653 · openclaw/openclaw

raymondjxj · 2026-05-31T14:16:45Z

Summary

Compaction rewrites session files via writeFile+rename, which always creates a new inode. Previously, sessionFenceRewriteIsBenign rejected such rewrites at the inode identity gate (sameSessionFileIdentity), so the content validation that follows never ran, causing EmbeddedAttemptSessionTakeoverError false positives.

Root Cause

In sessionFenceRewriteIsBenign (line 266), the sameSessionFileIdentity check compares dev + ino. When compaction runs, it replaces the session file with a new inode, causing this check to return false and the function to exit early — without ever performing content validation.

The actual error path:

releaseForPrompt() records fingerprint (inode A)
Compaction runs, renames new file (inode B)
assertSessionFileFence() compares fingerprint: inode B ≠ inode A
sessionFenceRewriteIsBenign called, fails sameSessionFileIdentity check
EmbeddedAttemptSessionTakeoverError thrown — false positive
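
The inode change in the steps above can be reproduced in isolation. The following is a minimal sketch, assuming a POSIX-style filesystem and Node's `fs` module; `identityOf` and `sameIdentity` are hypothetical stand-ins for the dev + ino comparison described in this PR, not the actual `sameSessionFileIdentity` implementation.

```typescript
import { appendFileSync, mkdtempSync, renameSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical identity record: the PR text says the real check compares dev + ino.
interface FileIdentity {
  dev: number;
  ino: number;
}

function identityOf(path: string): FileIdentity {
  const stats = statSync(path);
  return { dev: stats.dev, ino: stats.ino };
}

function sameIdentity(a: FileIdentity, b: FileIdentity): boolean {
  return a.dev === b.dev && a.ino === b.ino;
}

// Create a scratch session file (path and contents are illustrative only).
const dir = mkdtempSync(join(tmpdir(), "fence-demo-"));
const sessionFile = join(dir, "session.jsonl");
writeFileSync(sessionFile, '{"n":1}\n');
const before = identityOf(sessionFile);

// An in-place append keeps the same inode.
appendFileSync(sessionFile, '{"n":2}\n');
const appendKeepsIdentity = sameIdentity(before, identityOf(sessionFile));

// writeFile + rename: the temp file exists alongside the original, so it has
// its own inode, and the rename makes the session path point at that inode.
const tempFile = join(dir, "session.jsonl.tmp");
writeFileSync(tempFile, '{"n":1}\n{"n":2}\n{"n":3}\n');
renameSync(tempFile, sessionFile);
const renameKeepsIdentity = sameIdentity(before, identityOf(sessionFile));

console.log({ appendKeepsIdentity, renameKeepsIdentity });
```

Whether the real compaction path actually rewrites the fenced file this way is the open question raised in the review below; the sketch only shows why a dev + ino gate would reject such a rewrite if it did.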

Fix

Remove the inode identity gate from sessionFenceRewriteIsBenign while preserving all content validation. The content checks remain fully in place: file must exist, end with a newline, append only valid transcript lines, and stay within size limits. A file with a different inode but valid compaction-style content is indistinguishable from a compaction from the agent's perspective.
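
The content checks listed above can be sketched as a pure function over the previous and current file text. This is an illustration under stated assumptions, not the code in `attempt.session-lock.ts`: the name `rewriteLooksBenign`, the size limit, and the rule that a "valid transcript line" is any JSON object are all hypothetical.

```typescript
// Hypothetical size cap; the PR mentions size limits but not their value.
const MAX_REWRITE_CHARS = 1_000_000;

// Assumption: a valid transcript line is a single JSON object.
// The real helper reportedly accepts a narrower set of lines.
function isValidTranscriptLine(line: string): boolean {
  try {
    const parsed: unknown = JSON.parse(line);
    return typeof parsed === "object" && parsed !== null && !Array.isArray(parsed);
  } catch {
    return false;
  }
}

function rewriteLooksBenign(previousText: string, currentText: string | undefined): boolean {
  // The file must exist (undefined models a missing file).
  if (currentText === undefined) return false;
  // The file must end with a newline so no partial line is trusted.
  if (!currentText.endsWith("\n")) return false;
  // Stay within the size limit (string length used as a rough size proxy).
  if (currentText.length > MAX_REWRITE_CHARS) return false;
  // Prior content must be preserved verbatim and end on a line boundary.
  if (previousText !== "" && !previousText.endsWith("\n")) return false;
  if (!currentText.startsWith(previousText)) return false;
  // Everything after the preserved prefix must be whole, valid lines.
  const appended = currentText.slice(previousText.length);
  const appendedLines = appended.split("\n").slice(0, -1);
  return appendedLines.every(isValidTranscriptLine);
}
```

Note that nothing in this function looks at who wrote the file, which is the property the review below objects to: any writer that preserves the prefix and appends well-formed lines passes.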

Source

src/agents/embedded-agent-runner/run/attempt.session-lock.ts, function sessionFenceRewriteIsBenign, line 266.

Testing

Unit test: verify sessionFenceRewriteIsBenign returns true for same-content file with different inode
Integration test: run compaction concurrently with message processing, verify no EmbeddedAttemptSessionTakeoverError

Compaction rewrites session files via writeFile+rename, which always creates a new inode. Previously, sessionFenceRewriteIsBenign rejected such rewrites at the identity check (sameSessionFileIdentity), so the content validation that follows never ran, causing EmbeddedAttemptSessionTakeoverError false positives. This change removes the inode identity gate from sessionFenceRewriteIsBenign while preserving all content validation: file must exist, end with a newline, append only valid transcript lines. A file with a different inode but valid compaction-style content is indistinguishable from a compaction from the agent's perspective, so the inode check was both too strict and redundant with the content checks. Fixes EmbeddedAttemptSessionTakeoverError during concurrent compaction.

clawsweeper · 2026-05-31T14:23:48Z

Codex review: needs real behavior proof before merge. Reviewed May 31, 2026, 10:28 AM ET / 14:28 UTC.

Summary
The PR removes the sameSessionFileIdentity guard from sessionFenceRewriteIsBenign so different-inode session rewrites can pass content validation.

PR surface: Source -1. Total -1 across 1 file.

Reproducibility: unclear for the reported real-world compaction failure. Source inspection confirms current main rejects a different-inode rewrite before content validation, but I did not establish that the actual OpenClaw compaction path rewrites the active fenced file that way.

Review metrics: 1 noteworthy metric.

Session fence guard relaxed: 1 inode-identity check removed. This is the guard that currently prevents a replacement file from being trusted solely because its JSONL contents look benign.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🧂 unranked krab
Patch quality: 🧂 unranked krab
Result: blocked until real behavior proof is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

[P1] Add redacted real behavior proof for the reported compaction/reacquire scenario, such as terminal output or logs showing the failure path and after-fix result.
Replace the broad content-only trust with a narrow owned/trusted compaction path and focused regression tests.

Proof guidance:

[P1] Needs real behavior proof before merge: The external PR includes no after-fix logs, terminal output, recording, screenshot, or linked artifact for the compaction scenario; contributors should add redacted proof and then trigger a fresh review if needed. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Mantis proof suggestion
The reported failure manifests as a Telegram bot stop-processing loop, so live Telegram proof would materially improve confidence after the session-fence repair is narrowed. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

telegram live: verify a Telegram agent continues replying after embedded session compaction while a prompt stream is active.

Risk before merge

[P1] The relaxed rewrite path could accept an unowned replacement file that preserves prior lines and appends OpenClaw-looking assistant entries, weakening takeover detection.
[P1] The PR does not include after-fix real behavior proof for the reported compaction scenario, and the inspected compaction paths do not clearly match the same-path writeFile+rename claim.
[P1] No focused regression test proves that actual compaction is benign while arbitrary replacement files are still rejected.

Maintainer options:

Preserve ownership proof before merge (recommended): Tie any different-inode acceptance to an owned/trusted compaction write or equivalent provenance and add regression coverage for both accepted compaction and rejected unowned replacement.
Accept content-only trust explicitly: Maintainers can intentionally accept content-valid replacement files, but that should be a documented session-fence policy choice with tests covering the threat assumptions.
Pause for a real reproduction: If the actual compaction write path cannot be reproduced, pause this branch until logs or a live run show which writer needs to be trusted.

Next step before merge

[P1] Maintainer review is needed because the remaining blocker is a session takeover safety decision plus missing real behavior proof, not a narrow mechanical repair.

Security
Needs attention: The patch weakens a session takeover safety check by trusting replacement-file contents without ownership or inode provenance.

Review findings

[P1] Keep replacement writes tied to owned compaction — src/agents/embedded-agent-runner/run/attempt.session-lock.ts:263-268

Review details

Best possible solution:

Keep replacement-file acceptance tied to an owned or trusted compaction write, or another narrow provenance signal, with regression coverage and redacted real-run proof.

Do we have a high-confidence way to reproduce the issue?

Unclear for the reported real-world compaction failure: source inspection confirms current main rejects a different-inode rewrite before content validation, but I did not establish that the actual OpenClaw compaction path rewrites the active fenced file that way.

Is this the best way to solve the issue?

No: accepting every content-valid replacement file is broader than the claimed compaction false positive. The safer fix is to prove the actual writer and preserve ownership or trusted-compaction provenance.

Full review comments:

[P1] Keep replacement writes tied to owned compaction — src/agents/embedded-agent-runner/run/attempt.session-lock.ts:263-268
By removing the identity guard for every rewrite, an unowned process can replace the session file with one that preserves the previous lines and appends OpenClaw-looking assistant entries, and assertSessionFileFence() will trust it as benign. The inspected compaction paths acquire the session lock and append entries or write a successor file, so this should be fixed by proving and trusting the actual compaction write path rather than accepting arbitrary replacement files by content alone.
Confidence: 0.86

Overall correctness: patch is incorrect
Overall confidence: 0.86

AGENTS.md: found and applied where relevant.

Codex review notes: reasoning high; reviewed against 88c99ddf5f82.

Label changes

Label justifications:

P1: The PR targets an agent workflow that can stop processing messages and changes the active session takeover guard.
merge-risk: 🚨 session-state: Merging this patch could make an unowned replacement transcript become trusted session state after prompt-lock reacquisition.
merge-risk: 🚨 security-boundary: Removing inode provenance from the benign rewrite path weakens the boundary between owned OpenClaw writes and external file replacement.
rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🧂 unranked krab.
status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The external PR includes no after-fix logs, terminal output, recording, screenshot, or linked artifact for the compaction scenario; contributors should add redacted proof and then trigger a fresh review if needed. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Evidence reviewed

PR surface:

Source -1. Total -1 across 1 file.

Area	Files	Removed	Net
Source	1	1	-1
Tests	0	0	0
Docs	0	0	0
Config	0	0	0
Generated	0	0	0
Other	0	0	0
Total	1	1	-1

Security concerns:

[medium] Replacement file can bypass takeover detection — src/agents/embedded-agent-runner/run/attempt.session-lock.ts:263
Dropping the identity guard means a different inode can be accepted if its content matches the helper's limited benign-rewrite checks, which weakens protection against unowned session transcript replacement.
Confidence: 0.84

What I checked:

PR diff relaxes the rewrite identity gate: The proposed hunk removes the sameSessionFileIdentity(params.previous.fingerprint, params.current) early-return check from sessionFenceRewriteIsBenign, making content validation alone decide whether a replacement inode is trusted. (src/agents/embedded-agent-runner/run/attempt.session-lock.ts:263, 7e6995d0fbd5)
Current fence distinguishes identical state, owned writes, and benign helpers: Current assertSessionFileFence() first accepts unchanged fingerprints or recorded owned writes, then falls back to benign advance/rewrite checks before marking takeover and throwing EmbeddedAttemptSessionTakeoverError. (src/agents/embedded-agent-runner/run/attempt.session-lock.ts:702, 88c99ddf5f82)
Rewrite helper currently requires same inode before content checks: Current main requires previous text, current file existence, and same dev/ino before reading and validating rewritten content, which keeps arbitrary replacement files outside the benign rewrite path. (src/agents/embedded-agent-runner/run/attempt.session-lock.ts:257, 88c99ddf5f82)
OpenClaw compaction direct path takes the session write lock: The direct compaction path acquires the session write lock for params.sessionFile before opening the session manager, so the inspected path does not by itself prove an unowned same-path replacement should be trusted during prompt release. (src/agents/embedded-agent-runner/compact.ts:1061, 88c99ddf5f82)
Agent-core JSONL storage appends compaction/session entries: JsonlSessionStorage.appendEntry() persists entries with appendFile, and those compaction/session entries are not the same as the OpenClaw assistant mirror lines accepted by the benign rewrite helper. (packages/agent-core/src/harness/session/jsonl-storage.ts:231, 88c99ddf5f82)
Compaction transcript rotation writes a successor file: rotateTranscriptAfterCompaction() writes a new successor transcript path and returns that path, rather than replacing the active session file in place at the fenced path. (src/agents/embedded-agent-runner/compaction-successor-transcript.ts:33, 88c99ddf5f82)
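
The decision order described in these checks (unchanged fingerprint, then recorded owned writes, then benign helpers, then takeover) can be sketched as follows. This is a simplified model, assuming a fingerprint of dev, ino, size, and mtime and a set of recorded owned-write fingerprints; `Fingerprint`, `fingerprintKey`, `classifyFence`, and `ownedWrites` are hypothetical names, and the real `assertSessionFileFence()` throws `EmbeddedAttemptSessionTakeoverError` rather than returning a label.

```typescript
// Hypothetical fingerprint shape; the real fields are not shown in this PR.
interface Fingerprint {
  dev: number;
  ino: number;
  size: number;
  mtimeMs: number;
}

type FenceDecision = "unchanged" | "owned-write" | "benign" | "takeover";

function fingerprintKey(fp: Fingerprint): string {
  return `${fp.dev}:${fp.ino}:${fp.size}:${fp.mtimeMs}`;
}

function classifyFence(params: {
  previous: Fingerprint;
  current: Fingerprint;
  // Fingerprints recorded by writers that held the session write lock.
  ownedWrites: ReadonlySet<string>;
  // Stand-in for the benign advance/rewrite helpers.
  isBenign: () => boolean;
}): FenceDecision {
  const currentKey = fingerprintKey(params.current);
  // 1. Nothing changed since the prompt lock was released.
  if (fingerprintKey(params.previous) === currentKey) return "unchanged";
  // 2. The change was made by a writer whose result was recorded as owned.
  if (params.ownedWrites.has(currentKey)) return "owned-write";
  // 3. Fall back to the benign helpers.
  if (params.isBenign()) return "benign";
  // 4. Otherwise treat the file as taken over.
  return "takeover";
}
```

Under this model, the recommended option amounts to having a trusted compaction writer record the fingerprint of the file it produced, so a different-inode replacement is accepted at step 2 by provenance rather than at step 3 by content alone.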

Likely related people:

steipete: Recent path history shows steipete carried the large agent-runtime internalization, moved session write locking into the owned session runtime, and touched adjacent compaction successor code. (role: recent area contributor; confidence: high; commits: bb46b79d3c14, 5f68291f4f54, 912f66317334; files: src/agents/embedded-agent-runner/run/attempt.session-lock.ts, src/agents/embedded-agent-runner/compaction-successor-transcript.ts)
openperf: Recent merged history on the same session-lock file includes the timeout-abort lock release fix, which is adjacent to prompt-lock and takeover handling. (role: recent area contributor; confidence: medium; commits: 65fb56513fb2; files: src/agents/embedded-agent-runner/run/attempt.session-lock.ts)
luoyanglang: Recent merged history on the session-lock tests and implementation fixed session event queue self-wait behavior in the same embedded runner lock surface. (role: recent area contributor; confidence: medium; commits: b789e71e57d9; files: src/agents/embedded-agent-runner/run/attempt.session-lock.ts, src/agents/embedded-agent-runner/run/attempt.session-lock.test.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

openclaw-barnacle Bot added labels on May 31, 2026: agents (Agent runtime and tooling); size: XS; triage: needs-real-behavior-proof (Candidate: external PR needs after-fix proof from a real setup); triage: refactor-only (Candidate: refactor/cleanup-only PR without maintainer context).

lykeion-dev mentioned this pull request Jun 1, 2026: [Feature]: Suppress EmbeddedAttemptSessionTakeoverError from conversation context on user-initiated abort #88903 (Open)

jalehman mentioned this pull request Jun 5, 2026: fix: refresh prompt fence after compaction writes #90775 (Merged)

raymondjxj wants to merge 1 commit into openclaw:main from raymondjxj:fix/session-fence-inode-false-positive
