fix(agents): adopt rotated session id and file after embedded compaction rotation#88106
fix(agents): adopt rotated session id and file after embedded compaction rotation#88106coder999999999 wants to merge 2 commits into
Conversation
…ion rotation Embedded auto-compaction with truncateAfterCompaction rotates the transcript to a new session id + file mid-run. updateSessionStoreAfterAgentRun previously persisted the pre-rotation sessionId and never updated sessionFile, so sessions.json kept pointing at the stale transcript whose write lock the next turn blocked on (deadlock / crash loop). Adopt the rotated identity, gated on agentMeta.sessionFile being present so CLI runs (where agentMeta.sessionId is the external tool's id and sessionFile is unset) never have their openclaw session identity rewritten. Closes openclaw#88040 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR ensures the session store tracks embedded-run transcript rotation (session id + file) caused by auto-compaction, preventing sessions.json from referencing a stale, lock-held transcript file.
Changes:
- Detect embedded auto-compaction rotation from
agentMeta.sessionId+agentMeta.sessionFileand persist the rotated identity in the session store entry. - Reset
sessionStartedAtwhen the session identity changes due to rotation. - Add tests covering embedded rotation adoption and ensuring CLI runs don’t overwrite the OpenClaw session id.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/agents/command/session-store.ts | Adopts rotated session id/file for embedded-run transcript rotation and updates start timestamp behavior. |
| src/agents/command/session-store.test.ts | Adds test coverage for rotation adoption and verifies CLI runs keep the OpenClaw session identity. |
| // Embedded auto-compaction with `truncateAfterCompaction` rotates the transcript | ||
| // to a new session id + file mid-run. Adopt the rotated identity so sessions.json | ||
| // stops pointing at the stale pre-rotation file, whose write lock the next turn | ||
| // would otherwise block on. Only the embedded runner sets agentMeta.sessionFile; | ||
| // CLI runs leave it unset and reuse agentMeta.sessionId for the external tool's | ||
| // id, so this guard never rewrites a CLI session's openclaw identity. | ||
| const rotatedSessionId = normalizeOptionalString(result.meta.agentMeta?.sessionId); | ||
| const rotatedSessionFile = normalizeOptionalString(result.meta.agentMeta?.sessionFile); | ||
| const sessionRotated = Boolean( | ||
| rotatedSessionFile && rotatedSessionId && rotatedSessionId !== sessionId, | ||
| ); | ||
| const effectiveSessionId = sessionRotated && rotatedSessionId ? rotatedSessionId : sessionId; |
| if (sessionRotated && rotatedSessionFile) { | ||
| next.sessionFile = rotatedSessionFile; | ||
| } |
| const sessionRotated = Boolean( | ||
| rotatedSessionFile && rotatedSessionId && rotatedSessionId !== sessionId, | ||
| ); | ||
| const effectiveSessionId = sessionRotated && rotatedSessionId ? rotatedSessionId : sessionId; |
| await updateSessionStoreAfterAgentRun({ | ||
| cfg, | ||
| sessionId, | ||
| sessionKey, | ||
| storePath, | ||
| contextTokensOverride: 200_000, | ||
| sessionStore, | ||
| defaultProvider: "openai", | ||
| defaultModel: "gpt-5.5", | ||
| result, | ||
| }); |
|
Codex review: needs real behavior proof before merge. Reviewed May 29, 2026, 3:19 PM ET / 19:19 UTC. Summary PR surface: Source +27, Tests +173. Total +200 across 2 files. Reproducibility: yes. for source-level reproduction: current main rotates embedded compaction to a new id/file and reports both through agentMeta, while the existing session-store update path persists the caller id. I did not establish a live in-flight compaction repro in this read-only review. Review metrics: 2 noteworthy metrics.
Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Rank-up moves:
Proof guidance:
Risk before merge
Maintainer options:
Next step before merge
Security Review findings
Review detailsBest possible solution: Land a current-main patch that adopts the rotated id/file for normal and preserved-state runs, preserves usage family across the old and new session ids, and includes real diagnostic embedded-compaction proof. Do we have a high-confidence way to reproduce the issue? Yes for source-level reproduction: current main rotates embedded compaction to a new id/file and reports both through agentMeta, while the existing session-store update path persists the caller id. I did not establish a live in-flight compaction repro in this read-only review. Is this the best way to solve the issue? No: the PR is in the right helper and now covers preserved-state writes, but it should also preserve usage-family fields across the old and new session ids and still needs real embedded-run proof. Full review comments:
Overall correctness: patch is incorrect AGENTS.md: found and applied where relevant. Codex review notes: model gpt-5.5, reasoning high; reviewed against a39c2d784efd. Label changesLabel justifications:
Evidence reviewedPR surface: Source +27, Tests +173. Total +200 across 2 files. View PR surface stats
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
ClawSweeper review of openclaw#88106 flagged that the preserveUserFacingSessionModelState (heartbeat) path builds metadataPatch from only updatedAt/lastInteractionAt, so a rotated sessionId/sessionFile assigned on `next` never reached mergeSessionEntry. A heartbeat turn runs through the embedded runner and can hit a compaction rotation, so that path would reintroduce the openclaw#88040 deadlock. Carry the rotated identity (sessionId + sessionFile + sessionStartedAt) into the preserve-path patch too: a physical transcript rotation is filesystem identity, not user-facing model state. Add a preserved-state rotation regression test. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
@clawsweeper re-review Addressed the The rotated identity ( Validation: |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
|
Closing as a duplicate — #88040 was fixed on The landed fix covers the same rotated-identity adoption plus usage-family tracking, and handles the CLI-vs-embedded session-id distinction. No reason to carry a second, now-conflicting change for the same issue. Thanks @1052326311 — closing in favor of the merged fix. |
Summary
When the embedded agent runner hits its context limit mid-run, auto-compaction with
truncateAfterCompactionrotates the transcript to a brand-new session id + file (compaction-successor-transcript.tsmints a freshrandomUUID();run/attempt.tsupdatessessionIdUsed/sessionFileUsed;run.tssurfaces both onresult.meta.agentMeta).But
updateSessionStoreAfterAgentRunpersisted the pre-rotationsessionIdand never wrotesessionFile. So after a rotationsessions.jsonkept pointing at the stale pre-rotation transcript. BecauseresolveSessionTranscriptFileprefers the storedsessionEntry.sessionFile, the next turn reopened the stale file and blocked on its write lock — the deadlock / crash-loop reported in #88040.Fix
updateSessionStoreAfterAgentRunnow detects a rotation and adopts the new identity (id and file):agentMeta.sessionFileis present andagentMeta.sessionIddiffers from the run'ssessionId.sessionStartedAtresets when the id actually changes (a rotated transcript is a new session).Why the guard matters (CLI safety): only the embedded runner sets
agentMeta.sessionFile. For CLI providers,agentMeta.sessionIdis the external CLI tool's session id (seecli-runner.ts) andsessionFileis unset — so the guard never rewrites a CLI session's openclaw identity. A naiveagentMeta.sessionId ?? sessionIdadoption would corrupt every CLI run's session id and still leavesessionFilestale; this fix avoids both.Linked context
Closes #88040
Real behavior proof (required for external PRs)
Behavior addressed: after embedded auto-compaction rotated the transcript to a new session id/file, sessions.json kept the pre-rotation id and stale sessionFile, so the next turn blocked on the stale transcript's write lock (deadlock). After this patch the on-disk store adopts the rotated id + file; CLI session identity is left untouched.
Real environment tested: local macOS checkout, branch
fix/88040-session-rotationbased on currentupstream/main. The proof drives the real production functionupdateSessionStoreAfterAgentRunagainst a real on-disk session store (temp dir, no store mock) with aresult.meta.agentMetacarrying the rotatedsessionId+sessionFilethe embedded runner emits after rotation, then reloads sessions.json from disk to assert the persisted contents. A second case feeds a CLI-shapedagentMeta(externalsessionId, nosessionFile) to prove the openclaw id is preserved.Exact steps or command run after this patch: I isolated the fix by reverting only the production file to unpatched
upstream/mainwhile keeping the new tests, ran the production code path withnode, then restored the fix and re-ran. Commands:Evidence after fix: copied live console output from
node scripts/run-vitest.mjs. Against the unpatched production file the run reproduces the deadlock condition (the on-disk store keeps the stale identity); with the fix restored it resolves —Observed result after fix: with the fix restored, the
node scripts/run-vitest.mjsrun shows the reloaded on-disk sessions.json adopts the rotatedsessionIdandsessionFile(no stale pointer, so the next turn does not block on the stale transcript's write lock). The CLI case keepssessionIdequal to the openclaw id withsessionFileunchanged and records the external tool id undercliSessionIds.What was not tested: an end-to-end live agent run driven all the way to an in-flight context-limit compaction rotation (that requires a long real model run, infeasible offline). The proof instead exercises the identical post-rotation production path (
updateSessionStoreAfterAgentRunwriting/reloading a real sessions.json) with the exactagentMetashape the embedded runner emits after rotation.Tests and validation
node scripts/run-vitest.mjs src/agents/command/session-store.test.ts-> 70 passed (2 new: embedded-rotation adoption of id+file withsessionStartedAtreset; CLI run preserves openclaw id and leavessessionFileuntouched).oxfmt --checkclean,oxlintclean,git diff --checkclean,tsgo:coreexit 0,tsgo:core:testexit 0.Note on #88055
I opened this after noticing #88055 targets the same issue with a one-line
sessionId: result?.meta?.agentMeta?.sessionId ?? sessionIdat the call site. That approach has two problems this PR avoids: (1) for CLI providersagentMeta.sessionIdis the external tool's id, so it would overwrite the openclaw session id on every CLI run; (2) it never updatessessionFile, soresolveSessionTranscriptFilestill resolves the stale transcript and the deadlock persists. Happy to defer to maintainers on which to take.