Hoist withOwnedSessionTranscriptWrites ALS scope to span agent.prompt() to fix vanilla-openclaw same-lane fence trip #86572

@ubehera

Summary

A same-lane EmbeddedAttemptSessionTakeoverError race occurs in vanilla openclaw / pi because pi's listener-driven session writes happen outside the withOwnedSessionTranscriptWrites AsyncLocalStorage (ALS) scope. The current ALS context wraps only agent.streamFn(...), but pi's session-event listeners (_handleAgentEvent → _persist → appendFileSync) iterate the constructed stream after streamFn resolves, so their writes are never published to ownedSessionFileWrites. The next withSessionWriteLock call (e.g., a beforeToolCall hook) then trips assertSessionFileFence because the file changed without a matching owned-write record.
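The scope problem can be reproduced in isolation. The following is a minimal sketch, not the openclaw code: `ownedScope`, `recordWrite`, `streamFn`, `events`, and `runNarrowScopeDemo` are hypothetical stand-ins, and the only real API assumed is Node's `AsyncLocalStorage`. It wraps just the stream-constructing call in the ALS scope and then iterates the stream afterward, the way the listeners are described as doing.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical stand-in for the owned-write scope; not the real openclaw type.
const ownedScope = new AsyncLocalStorage<{ lane: string }>();

interface WriteLog {
  owned: string[];
  unowned: string[];
}

// Classify a write by whether an ALS store is visible at the call site.
function recordWrite(log: WriteLog, label: string): void {
  (ownedScope.getStore() ? log.owned : log.unowned).push(label);
}

async function* events(): AsyncGenerator<string> {
  yield "message_start";
  yield "message_end";
}

// Stand-in for agent.streamFn: builds the stream and returns before it is consumed.
async function streamFn(log: WriteLog): Promise<AsyncIterable<string>> {
  recordWrite(log, "streamFn");
  return events();
}

async function runNarrowScopeDemo(): Promise<WriteLog> {
  const log: WriteLog = { owned: [], unowned: [] };
  // Only the streamFn call runs inside the scope.
  const stream = await ownedScope.run({ lane: "main" }, () => streamFn(log));
  // The listener-style loop runs after run() has returned, so no store is visible.
  for await (const event of stream) {
    recordWrite(log, `listener:${event}`);
  }
  return log;
}
```

Calling `runNarrowScopeDemo()` should resolve with only the `streamFn` write in `owned` and both listener writes in `unowned`, which mirrors the unmatched file change that the fence then rejects.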

Why this is separate from #86067

#86067 closes two specific races:

  1. Cross-lane same-file race — file-scoped prompt-window guard (commit 65705d8c39)
  2. Same-lane post-stream drain — for deployments that populate _agentEventQueue (commit 4254386508, reported by @kesslerio for AlphaClaw)

Neither change touches the ALS scope. In vanilla openclaw, the first fix is narrow (case 1) and the second is a no-op (case 2: _agentEventQueue isn't populated by the upstream session manager). The case described above is what vanilla openclaw users (e.g. @cby2k's deterministic repro on session UUID 4f181fe9-…) are likely hitting.

Proposed fix shape

Hoist the withOwnedSessionTranscriptWrites wrapper from installPromptSubmissionLockRelease (currently src/agents/pi-embedded-runner/run/attempt.session-lock.ts:1015-1041, wrapping only originalStreamFn) up to span the entire agent.prompt(...) call. This requires composing the wrapper at src/agents/pi-embedded-runner/run/attempt.ts:3243-3249 (or equivalent) so pi's listener-driven writes during the for-await loop also see the ALS context, get published as owned writes, and the fence accepts them on subsequent checks.
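To illustrate why the hoist should change the fence outcome, here is a hedged sketch that compares the two compositions against a toy fence. Everything in it is an assumption for illustration: `FakeSessionFile`, `fenceAccepts`, `agentPrompt`, and `runFenceDemo` are hypothetical, and the fence logic is a simplification, not the behavior of assertSessionFileFence.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical model: the store collects the versions written inside the scope.
const ownedWrites = new AsyncLocalStorage<Set<number>>();

class FakeSessionFile {
  entries: string[] = [];
  // Every append advances the file; it is published as owned only inside the scope.
  append(entry: string): void {
    this.entries.push(entry);
    ownedWrites.getStore()?.add(this.entries.length);
  }
}

// Simplified fence: every version after lastSeen must have an owned-write record.
function fenceAccepts(file: FakeSessionFile, lastSeen: number, owned: Set<number>): boolean {
  for (let v = lastSeen + 1; v <= file.entries.length; v += 1) {
    if (!owned.has(v)) return false;
  }
  return true;
}

async function* events(): AsyncGenerator<string> {
  yield "message_start";
  yield "message_end";
}

type Wrap = <T>(fn: () => Promise<T>) => Promise<T>;

// Stand-in for agent.prompt: builds the stream, then listeners persist each event.
async function agentPrompt(file: FakeSessionFile, wrapStreamFn: Wrap): Promise<void> {
  const stream = await wrapStreamFn(async () => events());
  for await (const event of stream) file.append(event);
}

async function runFenceDemo(): Promise<{ narrow: boolean; hoisted: boolean }> {
  // Current shape: the scope wraps only the stream construction.
  const narrowOwned = new Set<number>();
  const narrowFile = new FakeSessionFile();
  await agentPrompt(narrowFile, (fn) => ownedWrites.run(narrowOwned, fn));

  // Proposed shape: the scope spans the whole prompt call.
  const hoistedOwned = new Set<number>();
  const hoistedFile = new FakeSessionFile();
  await ownedWrites.run(hoistedOwned, () => agentPrompt(hoistedFile, (fn) => fn()));

  return {
    narrow: fenceAccepts(narrowFile, 0, narrowOwned),
    hoisted: fenceAccepts(hoistedFile, 0, hoistedOwned),
  };
}
```

In this model the narrow composition fails the fence and the hoisted one passes, because the async continuations of a function started inside `run()` keep the store while the for-await loop executes.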

Alternative shape: instrument pi's _handleAgentEvent listener wrapper to publish writes directly into ownedSessionFileWrites regardless of ALS state. Less elegant but avoids touching the call composition above.
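The alternative can be sketched as a listener decorator that publishes explicitly. This is an illustration under stated assumptions, not pi's API: the `PersistingListener` shape (returning `true` when it appended to the session file), `ownedRecords` (standing in for ownedSessionFileWrites), `withExplicitPublish`, and the example path are all hypothetical.

```typescript
// Hypothetical listener shape: returns true when it appended to the session file.
type PersistingListener = (event: string) => boolean;

// Hypothetical registry standing in for the owned-write records, keyed by file path.
const ownedRecords = new Map<string, number>();

// Wrap a listener so each persisted write is published explicitly,
// without consulting any AsyncLocalStorage state.
function withExplicitPublish(
  sessionPath: string,
  listener: PersistingListener,
): PersistingListener {
  return (event) => {
    const wrote = listener(event);
    if (wrote) {
      ownedRecords.set(sessionPath, (ownedRecords.get(sessionPath) ?? 0) + 1);
    }
    return wrote;
  };
}

function runExplicitPublishDemo(): number {
  ownedRecords.clear();
  const sessionPath = "/sessions/example.jsonl"; // illustrative path only
  // This toy listener persists only on message_end events.
  const persistOnMessageEnd: PersistingListener = (event) => event === "message_end";
  const wrapped = withExplicitPublish(sessionPath, persistOnMessageEnd);
  for (const event of ["message_start", "message_end", "message_end"]) {
    wrapped(event);
  }
  return ownedRecords.get(sessionPath) ?? 0;
}
```

Here `runExplicitPublishDemo()` returns 2, one record per persisted event. The trade-off is that the wrapper has to know which listener calls actually wrote, which the ALS approach gets for free.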

What this would address

Acceptance criteria (suggested)

  • Reproduce one of @cby2k's deterministic cases on a patched build
  • Confirm with a cuobjdump-equivalent inspection of the embedded-runner path that writes during the for-await loop appear in ownedSessionFileWrites after the patch
  • A regression test that drives a synthetic prompt-emitted-during-streaming write and asserts no fence trip on the subsequent withSessionWriteLock

Risk

Hoisting the ALS scope means anything pi's listener machinery writes during the prompt window will be classified as an "owned write." Need to verify no spurious classifications (e.g., from extension code that legitimately writes external data inside a listener). Test surface should cover the corner case where a session listener also touches a file outside the trusted scope.
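The corner case could be exercised along these lines. This is a sketch of one possible guard, which is an assumption rather than something the proposal specifies: the scope store carries the session path, and only writes to that path are published as owned. `OwnedScope`, `publishIfOwned`, `runPathFilterDemo`, and both paths are hypothetical.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical store: the scope knows which file it owns.
interface OwnedScope {
  sessionPath: string;
  owned: string[];
}

const scope = new AsyncLocalStorage<OwnedScope>();

// Publish a write as owned only when it targets the scoped session file.
function publishIfOwned(path: string): boolean {
  const store = scope.getStore();
  if (!store || path !== store.sessionPath) return false;
  store.owned.push(path);
  return true;
}

async function runPathFilterDemo(): Promise<string[]> {
  const store: OwnedScope = { sessionPath: "/sessions/a.jsonl", owned: [] };
  await scope.run(store, async () => {
    await Promise.resolve(); // continue on a later tick, still inside the scope
    publishIfOwned("/sessions/a.jsonl"); // the listener's transcript write
    publishIfOwned("/tmp/extension-cache.json"); // an extension's external write
  });
  return store.owned;
}
```

With this guard, `runPathFilterDemo()` resolves to a list containing only the session path, so the external write made inside the hoisted scope is not classified as owned.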

Related

Metadata

    Labels

      • P1: High-priority user-facing bug, regression, or broken workflow.
      • clawsweeper:linked-pr-open: ClawSweeper found an open linked pull request for this issue.
      • clawsweeper:no-new-fix-pr: ClawSweeper does not recommend queueing a new automated fix PR for this issue.
      • clawsweeper:source-repro: ClawSweeper found a high-confidence source-level issue reproduction.
      • impact:message-loss: Channel message delivery can be lost, duplicated, or misrouted.
      • impact:session-state: Session, memory, transcript, context, or agent state can drift or corrupt.
      • issue-rating: 🦞 diamond lobster: Very strong issue quality with high-confidence source-level or clear reproduction.
