fix(session-lock): allow co-tenant writes during prompt I/O window#84149
fix(session-lock): allow co-tenant writes during prompt I/O window#84149Jerry-Xin wants to merge 2 commits into
Conversation
|
Thanks for the context here. I swept through the related work, and this is now duplicate or superseded. Close as superseded: a newer open PR now tracks the same in-process/co-tenant session-write problem with a stronger ownership-aware fence and real filesystem proof, while this branch still has a blocking session-integrity weakness and no real behavior proof. Canonical path: Close this branch and review or land #84250, or an equivalent ownership-aware fence that accepts trusted same-process locked writes while preserving external-edit negative tests. So I’m closing this here and keeping the remaining discussion on #84250. Review detailsBest possible solution: Close this branch and review or land #84250, or an equivalent ownership-aware fence that accepts trusted same-process locked writes while preserving external-edit negative tests. Do we have a high-confidence way to reproduce the issue? Yes at source level: current main snapshots the session-file fingerprint at prompt release and throws on any changed fingerprint after reacquire, while legitimate event/hook writers can use the same session-lock path. I did not run a live heartbeat, cron, or channel reproduction in this read-only review. Is this the best way to solve the issue? No. The epoch shortcut fixes the false positive by weakening the fence for the whole prompt window; the safer direction is the superseding ownership-aware fingerprint approach in #84250 or equivalent. Security review: Security review needs attention: The patch changes the session-integrity fence and can weaken detection of lock-bypassing session-file mutation in a mixed prompt window.
What I checked:
Likely related people:
Codex review notes: model gpt-5.5, reasoning high; reviewed against a002c416c7af. |
238dcef to
cac1b69
Compare
|
Rebased on latest upstream/main (b7ba7c3, clean rebase, no conflicts). |
|
Thanks for the thorough review. Regarding the P1 finding about epoch-based fence weakness: Design rationale for the co-tenant epoch shortcut: The epoch counter (
The concern is that a co-tenant's lock acquisition could mask an external (lock-bypassing) session file append in the same window. This is technically correct, but:
I'll add a code comment documenting this deliberate trade-off. Happy to discuss alternative approaches if maintainers prefer a tighter check — for example, tracking a fingerprint chain per epoch, though that would add significant complexity. |
4acf38e to
0a5c530
Compare
|
Rebased on latest upstream/main (48acdd3, clean rebase, no conflicts). |
0a5c530 to
7795ae9
Compare
|
Rebased on latest upstream/main (94d8391, clean rebase, no conflicts). |
7795ae9 to
36f42d2
Compare
5b9fc8d to
a5305a8
Compare
|
Rebased on latest upstream/main (79197b3, clean rebase, no conflicts). |
|
ClawSweeper PR egg 🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat. Where did the egg go?
|
Add a process-scoped write epoch counter to acquireSessionWriteLock so the embedded attempt controller can distinguish legitimate in-process co-tenant writes (heartbeat, cron, channel ingress) from true external session takeovers. The fingerprint fence now checks the epoch delta: delta > 1 means other in-process callers acquired the lock, so the file change is coordinated — update the fence baseline and continue. Fixes openclaw#84071
a5305a8 to
e14edf8
Compare
|
Rebased on latest upstream/main (a002c41, clean rebase, no conflicts). |
|
ClawSweeper applied the proposed close for this PR.
|
Summary
EmbeddedAttemptSessionTakeoverErrorfires on legitimate co-tenant writes (heartbeat, cron, channel ingress) to shared sessions during the prompt I/O window.Root Cause
attempt.session-lock.ts:284-291:releaseForPrompt()records a file stat fingerprint and releases the lock. Co-tenant writers acquire the same fs-level lock and append to the session file (changing the stat). When the controller reacquires,assertSessionFileFence()sees a fingerprint mismatch and sets the stickytakeoverDetectedflag — even though the write was coordinated through the same lock.Fix
Add a process-scoped write epoch counter to
acquireSessionWriteLock()that increments on each lock acquisition. The controller records the epoch at fence activation. On reacquisition, it checks the epoch delta:Cross-process writes don't advance the in-process epoch, so true external takeovers are still detected.
Tests
4 new tests: co-tenant write accepted, external write detected, mixed co-tenant + external detected, multiple co-tenant writes succeed. All 26 existing session-lock tests pass.
Fixes #84071