fix: refresh prompt fence after compaction writes by jalehman · Pull Request #90775 · openclaw/openclaw

jalehman · 2026-06-05T22:43:32Z

Summary

Fixes embedded attempts that falsely report session takeover after OpenClaw-owned auto-compaction writes a type: "compaction" session entry while the prompt fence is released.
Threads compaction persistence through the existing guarded SessionManager owned-write callback path, matching the existing message-persisted fence refresh behavior.
Keeps external raw compaction-looking JSONL edits rejected; this does not broaden the benign rewrite whitelist.
Reviewers should focus on the ownership boundary in guardSessionManager and the session-lock regression coverage.

Linked context

Closes #90729

Related #88348 and #88653 for prior benign rewrite handling; those do not cover same-file owned compaction appends.

Requested by Josh in this maintainer thread.

Real behavior proof (required for external PRs)

Behavior or issue addressed: OpenClaw-owned compaction appends no longer trip EmbeddedAttemptSessionTakeoverError when the embedded prompt fence is released.
Real environment tested: Blacksmith Testbox through Crabbox, provider blacksmith-testbox.
Exact steps or command run after this patch: node scripts/crabbox-wrapper.mjs run --provider blacksmith-testbox -- env OPENCLAW_CHECK_CHANGED_REMOTE_CHILD=1 OPENCLAW_CHANGED_LANES_RAW_SYNC=1 corepack pnpm check:changed
Evidence after fix: Testbox tbx_01ktcyq66htgrbs2yer89p0s7f, Actions run https://github.com/openclaw/openclaw/actions/runs/27043607767, exit 0.
Observed result after fix: changed gate selected core, coreTests; core typecheck, core test typecheck, lint on changed files, import-cycle guard, and related runtime guards passed.
What was not tested: no live provider turn was run after the patch; this change is covered by focused session-lock regression tests and remote type/lint/guard proof.
Proof limitations or environment constraints: direct AWS Crabbox was blocked earlier by coordinator auth, so the broad remote proof used delegated Blacksmith Testbox through the Crabbox wrapper.
Before evidence: the original issue was reproduced before the patch on Testbox tbx_01ktcnjft8yk11w7ar9mdnv1yr, Actions run https://github.com/openclaw/openclaw/actions/runs/27036822896, with EmbeddedAttemptSessionTakeoverError.

Tests and validation

node scripts/run-vitest.mjs src/agents/embedded-agent-runner/run/attempt.session-lock.test.ts -- --reporter=verbose
node scripts/run-vitest.mjs src/agents/session-tool-result-guard.test.ts src/agents/session-tool-result-guard.transcript-events.test.ts -- --reporter=verbose
git diff --check
node scripts/crabbox-wrapper.mjs run --provider blacksmith-testbox -- env OPENCLAW_CHECK_CHANGED_REMOTE_CHILD=1 OPENCLAW_CHANGED_LANES_RAW_SYNC=1 corepack pnpm check:changed
.agents/skills/autoreview/scripts/autoreview --mode local

Regression coverage added:

Owned SessionManager.appendCompaction(...) refreshes the prompt fence before post-prompt cleanup reacquires the session lock.
Unowned external raw type: "compaction" JSONL appends still throw EmbeddedAttemptSessionTakeoverError.

Failed before fix:

The RED regression failed with EmbeddedAttemptSessionTakeoverError before the compaction persistence hook was wired.
Autoreview initially found the regression fixture was too weak after a test simplification; the fixture now seeds a persisted assistant entry before compaction and the final autoreview is clean.

Risk checklist

Did user-visible behavior change? (Yes/No)

Yes. Auto-compaction during embedded attempts should no longer produce a false takeover error.

Did config, environment, or migration behavior change? (Yes/No)

No.

Did security, auth, secrets, network, or tool execution behavior change? (Yes/No)

No.

What is the highest-risk area?

The session-file ownership boundary while an embedded attempt has released the prompt lock.

How is that risk mitigated?

The hook only marks writes made through the guarded SessionManager.appendCompaction path as owned. The negative test proves raw external compaction-looking writes remain takeover errors.

Current review state

What is the next action?

Review and CI.

What is still waiting on author, maintainer, CI, or external proof?

Waiting on PR CI after creation.

Which bot or reviewer comments were addressed?

Local autoreview P3 finding about a weak compaction persistence fixture was accepted and fixed. Final autoreview reported no accepted/actionable findings.

clawsweeper · 2026-06-05T22:45:06Z

Codex review: needs real behavior proof before merge. Reviewed June 5, 2026, 7:59 PM ET / 23:59 UTC.

Summary
The branch threads SessionManager.appendCompaction(...) through the guarded session-manager path so owned compaction appends can refresh the embedded prompt fence, with positive and negative session-lock regression coverage.

PR surface: Source +97, Tests +270. Total +367 across 5 files.

Reproducibility: yes. source inspection gives a clear reproduction path: current main refreshes the prompt fence for message persistence but not for appendCompaction while the prompt lock is released. The linked report and PR regression describe the failure, but I did not run tests in this read-only review.

Review metrics: none identified.

Merge readiness
Overall: 🦪 silver shellfish
Proof: 🦪 silver shellfish
Patch quality: 🐚 platinum hermit
Result: blocked until real behavior proof from a real setup is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

[P1] Add redacted live provider, CLI, or transport proof that threshold auto-compaction completes without a takeover error.
Get an explicit maintainer proof override if the focused Testbox regression is considered sufficient.

Proof guidance:

[P1] Needs real behavior proof before merge: The PR body provides before/after Testbox and regression-test proof, but no live provider, CLI, or transport run after the patch; add redacted proof and update the PR body to trigger re-review, or ask a maintainer to comment @clawsweeper re-review.

Risk before merge

[P1] Contributor proof is still remote Testbox and focused regression-test proof; no live provider, CLI, or transport run after the patch shows threshold auto-compaction completing without EmbeddedAttemptSessionTakeoverError.
[P1] The diff changes the session takeover boundary, so maintainers should explicitly accept that only guarded SessionManager.appendCompaction(...) writes are recorded as owned.

Maintainer options:

Require live compaction proof or override (recommended)
Ask for redacted live provider, CLI, or transport output showing threshold auto-compaction completes without a takeover error, or have a maintainer explicitly override the proof gate.
Accept focused regression coverage
Maintainers can intentionally accept the Testbox before/after regression plus negative external-edit coverage as enough for this protected boundary.
Pause for owner review
If proof cannot be added, keep the PR paused until a session-boundary owner reviews the owned-write hook and validates the security boundary.

Next step before merge

[P1] No automated repair is needed; the remaining action is maintainer review and proof handling for a protected session-state/security-boundary PR with test-only real behavior proof.

Security
Cleared: No concrete security or supply-chain defect was found in the diff; the takeover-boundary sensitivity remains a maintainer merge-risk decision.

Review details

Best possible solution:

Land the narrow guarded compaction-owned write path after maintainer review plus redacted live compaction proof or an explicit proof override, while preserving rejection of raw external session edits.

Do we have a high-confidence way to reproduce the issue?

Yes, source inspection gives a clear reproduction path: current main refreshes the prompt fence for message persistence but not for appendCompaction while the prompt lock is released. The linked report and PR regression describe the failure, but I did not run tests in this read-only review.

Is this the best way to solve the issue?

Yes, the proposed shape is the best fix I found: it threads compaction persistence through the existing guarded SessionManager owned-write path and validates the exact append before trusting it. Broadening the benign rewrite whitelist would be a weaker security-boundary tradeoff.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against a4f7e4cbb946.

Label changes

Label justifications:

P1: The PR targets a false takeover error that can kill an active embedded agent turn during threshold auto-compaction.
merge-risk: 🚨 session-state: The diff changes how owned compaction writes refresh the prompt fence for session JSONL state during an embedded attempt.
merge-risk: 🚨 security-boundary: The guard distinguishes OpenClaw-owned session writes from external raw edits, so an overly broad whitelist would affect takeover protection.
rating: 🦪 silver shellfish: Overall readiness is 🦪 silver shellfish; proof is 🦪 silver shellfish and patch quality is 🐚 platinum hermit.
status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR body provides before/after Testbox and regression-test proof, but no live provider, CLI, or transport run after the patch; add redacted proof and update the PR body to trigger re-review, or ask a maintainer to comment @clawsweeper re-review.

Evidence reviewed

PR surface:

Source +97, Tests +270. Total +367 across 5 files.

View PR surface stats

Area	Files	Added	Removed	Net
Source	4	98	1	+97
Tests	1	270	0	+270
Docs	0	0	0	0
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	5	368	1	+367

What I checked:

Current main missing compaction hook: Current main wires guardSessionManager to refresh the fence after message persistence, but there is no equivalent compaction-persistence hook in the embedded attempt setup. (src/agents/embedded-agent-runner/run/attempt.ts:1986, a4f7e4cbb946)
Current main takeover fence: The prompt fence accepts same fingerprints, recorded owned writes, or narrow benign assistant-only rewrites before throwing EmbeddedAttemptSessionTakeoverError, which explains why an unrecorded compaction append can trip the fence. (src/agents/embedded-agent-runner/run/attempt.session-lock.ts:707, a4f7e4cbb946)
PR owned-write hook: The PR adds withOwnedSessionFileWrite, captures pre-write fingerprint/text, validates the appended JSONL, and records the post-write fingerprint only when the previous state was already trusted. (src/agents/embedded-agent-runner/run/attempt.session-lock.ts:958, 054ea49f5754)
PR compaction validator: The PR wraps appendCompaction and requires exactly one appended type: "compaction" line whose id matches the returned entry id before treating the write as owned. (src/agents/session-tool-result-guard.ts:67, 054ea49f5754)
PR regression coverage: The added tests cover owned compaction refresh, raw external compaction-looking appends, external edits before compaction, interleaved edits, and multi-controller owned-write cases. (src/agents/embedded-agent-runner/run/attempt.session-lock.test.ts:907, 054ea49f5754)
Codex compaction source checked: Sibling Codex source shows compaction replacing history and persisting a RolloutItem::Compacted, supporting that compaction is an owned session-state transition rather than a foreign edit. (../codex/codex-rs/core/src/session/mod.rs:2658, 4f655bc3b740)

Likely related people:

steipete: The available all-ref shortlog shows Peter Steinberger as the most frequent contributor across the touched session-lock and guard files. (role: heavy adjacent area contributor; confidence: medium; commits: 0c278bb93c2d, 5586b3fd19cd, ee03ade0d63f; files: src/agents/embedded-agent-runner/run/attempt.session-lock.ts, src/agents/embedded-agent-runner/run/attempt.ts, src/agents/session-tool-result-guard.ts)
obviyus: Current main blame for the session-lock fence, guarded session-manager call site, and session tool guard points to Ayaan Zaidi's recent commit. (role: recent area contributor; confidence: medium; commits: 7c885528ba25; files: src/agents/embedded-agent-runner/run/attempt.session-lock.ts, src/agents/embedded-agent-runner/run/attempt.ts, src/agents/session-tool-result-guard.ts)
Takhoffman: Recent history on the touched files includes context-window and tool-result truncation work that shares the session/context boundary involved here. (role: adjacent context-window contributor; confidence: medium; commits: 4f00b769251d, 7fc1a74ee9e3; files: src/agents/embedded-agent-runner/run/attempt.ts, src/agents/session-tool-result-guard.ts)
jalehman: Beyond authoring this PR, Josh Lehman has prior merged history on context engine transcript maintenance touching this area. (role: related prior feature contributor; confidence: medium; commits: 751d5b7849ca, 054ea49f5754; files: src/agents/embedded-agent-runner/run/attempt.ts, src/agents/session-tool-result-guard.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

jalehman · 2026-06-06T00:03:33Z

Updated proof for the current head SHA 054ea49f57548eeac9d213a7cbf1a0744b2fd2d2.

Behavior addressed: owned auto-compaction session-manager appends no longer trip embedded prompt-fence takeover detection after the prompt lock is released, while unowned or interleaved external session-file edits are still rejected.

Real environment tested: Blacksmith Testbox through Crabbox on Linux.

Exact steps or command run after this patch:

node scripts/run-vitest.mjs src/agents/embedded-agent-runner/run/attempt.session-lock.test.ts -- --reporter=verbose
node scripts/run-vitest.mjs src/agents/session-tool-result-guard.test.ts src/agents/session-tool-result-guard.transcript-events.test.ts -- --reporter=verbose
.agents/skills/autoreview/scripts/autoreview --mode local
.agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main
Crabbox/Testbox runtime proof using production modules: SessionManager.open, guardSessionManager, createEmbeddedAttemptSessionLockController, and acquireSessionWriteLock.

Evidence after fix:

Testbox provider/id: blacksmith-testbox, tbx_01ktd3ckct31kq04bxw1500chg
Actions run: https://github.com/openclaw/openclaw/actions/runs/27046304204
Remote proof SHA: 054ea49f57548eeac9d213a7cbf1a0744b2fd2d2
Remote proof output:
- LIVE_PROOF owned-compaction: pass
- LIVE_PROOF interleaved-external: pass
- LIVE_PROOF other-owned-writer: pass
- LIVE_PROOF result: pass

Observed result after fix: focused session-lock regression tests passed (49 passed), guard tests passed (35 passed and 4 passed), both autoreview passes reported no accepted/actionable findings, and the Testbox runtime proof exited 0.

What was not tested: this proof does not drive a full provider/model embedded agent run all the way through threshold-triggered auto-compaction. It directly exercises the production session manager, compaction guard, embedded lock controller, and session write-lock paths that own the issue invariant.

jalehman · 2026-06-06T00:05:23Z

Land-ready verification for head 054ea49f57548eeac9d213a7cbf1a0744b2fd2d2.

PR state: mergeable and clean against main.
GitHub checks: passing or intentionally skipped as of the latest gh pr checks watch.
Focused local tests:
- node scripts/run-vitest.mjs src/agents/embedded-agent-runner/run/attempt.session-lock.test.ts -- --reporter=verbose -> 49 passed
- node scripts/run-vitest.mjs src/agents/session-tool-result-guard.test.ts src/agents/session-tool-result-guard.transcript-events.test.ts -- --reporter=verbose -> 35 passed and 4 passed
Formatting sanity: git diff --check -> clean.
Structured review:
- .agents/skills/autoreview/scripts/autoreview --mode local -> clean, no accepted/actionable findings.
- .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main -> clean, no accepted/actionable findings.
Remote runtime proof: Blacksmith Testbox through Crabbox, tbx_01ktd3ckct31kq04bxw1500chg, Actions run https://github.com/openclaw/openclaw/actions/runs/27046304204, exit 0.

Known proof scope: the Testbox proof directly exercises the production session manager, compaction guard, embedded lock controller, and session write-lock paths. It does not run a full live provider/model turn through threshold auto-compaction.

jalehman · 2026-06-06T00:06:09Z

Landed via squash merge onto main.

Head commit before merge: 054ea49f57548eeac9d213a7cbf1a0744b2fd2d2
Merge commit: bbfe8ccaf60e87e84a3c695e56f4bee7c687c5b0
Gate: latest PR checks passed or were intentionally skipped before merge.
Focused proof: session-lock regression tests, guard tests, git diff --check, local + branch autoreview.
Remote runtime proof: Blacksmith Testbox through Crabbox tbx_01ktd3ckct31kq04bxw1500chg, Actions run https://github.com/openclaw/openclaw/actions/runs/27046304204.

Thanks @jalehman.

openclaw-barnacle Bot added agents Agent runtime and tooling size: S maintainer Maintainer-authored PR labels Jun 5, 2026

fix: refresh prompt fence after compaction writes

054ea49

jalehman force-pushed the openclaw-62b/issue-90729-compaction-fence branch from 327a3e1 to 054ea49 Compare June 5, 2026 23:52

openclaw-barnacle Bot added size: M and removed size: S labels Jun 5, 2026

jalehman merged commit bbfe8cc into main Jun 6, 2026
169 of 172 checks passed

jalehman deleted the openclaw-62b/issue-90729-compaction-fence branch June 6, 2026 00:05

Haderach-Ram mentioned this pull request Jun 6, 2026

Ecosystem Digest — 2026-06-06 Haderach-Ram/openclaw-radar#30

Open

ai-hpc mentioned this pull request Jun 6, 2026

fix(agents): keep auto-compaction appends owned during prompt lock release #90773

Closed

recruits mentioned this pull request Jun 11, 2026

[Bug]: EmbeddedAttemptSessionTakeoverError caused by Btrfs ctimeNs instability #92109

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: refresh prompt fence after compaction writes#90775

fix: refresh prompt fence after compaction writes#90775
jalehman merged 1 commit into
mainfrom
openclaw-62b/issue-90729-compaction-fence

jalehman commented Jun 5, 2026

Uh oh!

clawsweeper Bot commented Jun 5, 2026 •

edited

Loading

Uh oh!

jalehman commented Jun 6, 2026

Uh oh!

jalehman commented Jun 6, 2026

Uh oh!

Uh oh!

jalehman commented Jun 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

jalehman commented Jun 5, 2026

Summary

Linked context

Real behavior proof (required for external PRs)

Tests and validation

Risk checklist

Current review state

Uh oh!

clawsweeper Bot commented Jun 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jalehman commented Jun 6, 2026

Uh oh!

jalehman commented Jun 6, 2026

Uh oh!

Uh oh!

jalehman commented Jun 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

clawsweeper Bot commented Jun 5, 2026 •

edited

Loading