Hoist withOwnedSessionTranscriptWrites ALS scope to span agent.prompt() to fix vanilla-openclaw same-lane fence trip #86572
Open
Labels
- P1: High-priority user-facing bug, regression, or broken workflow.
- clawsweeper:linked-pr-open: ClawSweeper found an open linked pull request for this issue.
- clawsweeper:no-new-fix-pr: ClawSweeper does not recommend queueing a new automated fix PR for this issue.
- clawsweeper:source-repro: ClawSweeper found a high-confidence source-level issue reproduction.
- impact:message-loss: Channel message delivery can be lost, duplicated, or misrouted.
- impact:session-state: Session, memory, transcript, context, or agent state can drift or corrupt.
- issue-rating: 🦞 diamond lobster: Very strong issue quality with high-confidence source-level or clear reproduction.
Summary
Same-lane EmbeddedAttemptSessionTakeoverError race in vanilla openclaw / pi, caused by pi's listener-driven session writes happening outside the withOwnedSessionTranscriptWrites AsyncLocalStorage (ALS) scope. The current ALS context wraps only agent.streamFn(...), but pi's session-event listeners (_handleAgentEvent → _persist → appendFileSync) iterate the constructed stream after streamFn resolves, so their writes never publish to ownedSessionFileWrites. The next withSessionWriteLock call (e.g., a beforeToolCall hook) then trips assertSessionFileFence because the file changed without a matching owned-write record.
Why this is separate from #86067
#86067 closes two specific races:
- […] 65705d8c39)
- _agentEventQueue (commit 4254386508, reported by @kesslerio for AlphaClaw)
Neither change touches the ALS scope. In vanilla openclaw the two fixes are, respectively, narrow (case 1) and a no-op (case 2: _agentEventQueue isn't populated by the upstream session manager). The case described above is what vanilla openclaw users (e.g. @cby2k's deterministic repro on session UUID 4f181fe9-…) are likely hitting.
Proposed fix shape
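As a rough illustration of the scope change proposed in this section, the sketch below contrasts a scope that wraps only stream construction with one that wraps the whole prompt call. It uses Node's AsyncLocalStorage directly; ownedScope, persist, streamFn, prompt, runNarrow, and runHoisted are hypothetical stand-ins invented for the example, not the actual openclaw or pi functions.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical stand-in for the owned-write scope; not the real openclaw API.
const ownedScope = new AsyncLocalStorage<{ owned: string[] }>();

// Stand-in for the persist step: a write counts as owned only if an ALS
// store is visible at the moment the write happens.
function persist(entry: string, unowned: string[]): void {
  const store = ownedScope.getStore();
  if (store) store.owned.push(entry);
  else unowned.push(entry);
}

// Stand-in for streamFn: it only constructs the stream and returns it.
function streamFn(): AsyncIterable<string> {
  return (async function* () {
    yield "event-1";
    yield "event-2";
  })();
}

// Stand-in for the prompt loop: iteration, and the listener-driven persist,
// happen after streamFn has already returned.
async function prompt(
  getStream: () => AsyncIterable<string>,
  unowned: string[],
): Promise<void> {
  for await (const event of getStream()) persist(event, unowned);
}

// Narrow shape: the scope wraps only stream construction.
async function runNarrow() {
  const store = { owned: [] as string[] };
  const unowned: string[] = [];
  await prompt(() => ownedScope.run(store, streamFn), unowned);
  return { owned: store.owned, unowned };
}

// Hoisted shape: the scope wraps the whole prompt call.
async function runHoisted() {
  const store = { owned: [] as string[] };
  const unowned: string[] = [];
  await ownedScope.run(store, () => prompt(streamFn, unowned));
  return { owned: store.owned, unowned };
}
```

With these stand-ins, runNarrow() leaves both events in unowned, while runHoisted() records both as owned, which is the behavior the hoist is meant to produce for pi's listener writes.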
Hoist the withOwnedSessionTranscriptWrites wrapper from installPromptSubmissionLockRelease (currently src/agents/pi-embedded-runner/run/attempt.session-lock.ts:1015-1041, wrapping only originalStreamFn) up to span the entire agent.prompt(...) call. This requires composing the wrapper at src/agents/pi-embedded-runner/run/attempt.ts:3243-3249 (or equivalent) so that pi's listener-driven writes during the for-await loop also see the ALS context, are published as owned writes, and are accepted by the fence on subsequent checks.
Alternative shape: instrument pi's _handleAgentEvent listener wrapper to publish writes directly into ownedSessionFileWrites regardless of ALS state. Less elegant, but it avoids touching the call composition above.
What this would address
- […] 4f181fe9-… (long inbound turns hanging ~22s then dropping; gateway-fallback fresh-session takeover path)
Acceptance criteria (suggested)
- cuobjdump-equivalent for the embedded-runner path: writes during the for-await loop appear in ownedSessionFileWrites after the patch
- […] withSessionWriteLock
Risk
Hoisting the ALS scope means that anything pi's listener machinery writes during the prompt window will be classified as an "owned write." We need to verify that this produces no spurious classifications (e.g., from extension code that legitimately writes external data inside a listener). The test surface should cover the corner case where a session listener also touches a file outside the trusted scope.
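One way to make that corner case testable is to have the scope carry the session file path and classify each write by path instead of by scope membership alone. The sketch below is only an assumption about how such a check could look; OwnedScope, recordWrite, listener, and classifyDuringPrompt are hypothetical names, and the path filter is not something the current code is known to do.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical store shape: the scope knows the one file it is trusted for.
interface OwnedScope {
  sessionFile: string;
  owned: string[];
  external: string[];
}

const scope = new AsyncLocalStorage<OwnedScope>();

// Hypothetical write hook: classifies by path, so a write to any other file
// inside the hoisted scope is not treated as an owned session write.
function recordWrite(path: string): void {
  const store = scope.getStore();
  if (!store) return; // outside any scope: nothing is recorded
  if (path === store.sessionFile) store.owned.push(path);
  else store.external.push(path);
}

// Stand-in for a session listener that writes several files across async
// boundaries while the prompt is running.
async function listener(paths: string[]): Promise<void> {
  for (const p of paths) {
    await Promise.resolve(); // simulate asynchronous event delivery
    recordWrite(p);
  }
}

// Runs the listener inside a hoisted scope and returns the classification.
async function classifyDuringPrompt(
  sessionFile: string,
  writes: string[],
): Promise<OwnedScope> {
  const store: OwnedScope = { sessionFile, owned: [], external: [] };
  await scope.run(store, () => listener(writes));
  return store;
}
```

A test built on this shape could pass a session path plus one unrelated path and assert that only the session path ends up in owned.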
Related
EmbeddedAttemptSessionTakeoverError