Summary
Long-running WebChat main sessions can silently lose the live agent:main:main mapping after context overflow / compaction-path failures, causing the next visible user message to start a new session id even though the prior transcript still exists on disk.
From the user's perspective, this looks like the session "disappeared" or got wiped:
- the visible main WebChat chat appears to restart
- prior checkpoints/history are no longer attached to the active session row
- the agent behaves like it lost context
- old transcripts are still present on disk, but the active session store now points agent:main:main at a new sessionId
This seems related to #70330, but the trigger here is different: context overflow followed by a compaction checkpoint persist that could not find the session.
Environment
- OpenClaw CLI: 2026.4.21 (f788c88)
- Channel/surface: WebChat direct
- Session key: agent:main:main
- Model: openai-codex/gpt-5.4
- Host OS: Darwin 25.4.0 arm64
What happened
On Wednesday, April 22, 2026 (America/New_York), the main WebChat session hit context-overflow conditions multiple times during heavy tool use.
Relevant observed timeline:
- 10:19:21 PM EDT: log recorded a context overflow for agent:main:main tied to session file b8a5c854-4632-4954-ae09-fd507ada1e8a.jsonl
- 10:21:14 PM EDT: log recorded "skipping compaction checkpoint persist: session not found" for agent:main:main
- 10:30:51 PM EDT: another context overflow for agent:main:main, now tied to df065d9f-9445-4401-a94b-61042c7eff40.jsonl
- 11:17:42 PM EDT: another context overflow for that same later session file
- after that, later visible user messages landed in a fresh active session mapping with session id f36b8f4d-bdca-41a2-b13c-a6ec3016218a
Important detail: the earlier transcripts were still on disk. They were not actually deleted from the transcript folder. What changed was the active sessions.json entry for agent:main:main, which ended up pointing at a new session id.
Why this looks like a bug
The sequence suggests the session store entry for the active main session becomes missing/unavailable during overflow/compaction handling:
- main session grows very large
- provider/tool loop hits context overflow
- compaction checkpoint code tries to persist and logs "session not found"
- active agent:main:main mapping is no longer the old session
- next user message creates or uses a new session id
- user experiences this as spontaneous session loss
The log line below is especially suspicious because it shows the compaction/checkpoint path could not find the active session entry it expected:
skipping compaction checkpoint persist: session not found
Sanitized evidence
Observed log lines from /tmp/openclaw/openclaw-2026-04-22.log:
2026-04-22 22:19:21 EDT [context-overflow-diag] sessionKey=agent:main:main ... sessionFile=/Users/.../b8a5c854-4632-4954-ae09-fd507ada1e8a.jsonl
2026-04-22 22:21:14 EDT skipping compaction checkpoint persist: session not found { sessionKey: agent:main:main }
2026-04-22 22:30:51 EDT [context-overflow-diag] sessionKey=agent:main:main ... sessionFile=/Users/.../df065d9f-9445-4401-a94b-61042c7eff40.jsonl
2026-04-22 23:17:42 EDT [context-overflow-diag] sessionKey=agent:main:main ... sessionFile=/Users/.../df065d9f-9445-4401-a94b-61042c7eff40.jsonl
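The rotation can be spotted mechanically from these lines. Below is a minimal sketch that flags every point where the same sessionKey shows up with a different transcript file. The regex is an assumption derived only from the four sanitized lines above, and `find_rotations` is a hypothetical helper, not part of OpenClaw.

```python
import re

# Assumed line shape, inferred from the sanitized samples in this report only.
OVERFLOW_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \S+ "
    r"\[context-overflow-diag\] sessionKey=(?P<key>\S+) .*"
    r"sessionFile=\S*?(?P<sid>[0-9a-f-]{36})\.jsonl"
)

def find_rotations(lines):
    """Return (timestamp, key, old_id, new_id) whenever a key's transcript file changes."""
    last_seen = {}
    rotations = []
    for line in lines:
        m = OVERFLOW_RE.match(line)
        if not m:
            continue  # e.g. the "skipping compaction checkpoint persist" line
        key, sid = m["key"], m["sid"]
        prev = last_seen.get(key)
        if prev is not None and prev != sid:
            rotations.append((m["ts"], key, prev, sid))
        last_seen[key] = sid
    return rotations

SAMPLE = [
    "2026-04-22 22:19:21 EDT [context-overflow-diag] sessionKey=agent:main:main ... sessionFile=/Users/.../b8a5c854-4632-4954-ae09-fd507ada1e8a.jsonl",
    "2026-04-22 22:21:14 EDT skipping compaction checkpoint persist: session not found { sessionKey: agent:main:main }",
    "2026-04-22 22:30:51 EDT [context-overflow-diag] sessionKey=agent:main:main ... sessionFile=/Users/.../df065d9f-9445-4401-a94b-61042c7eff40.jsonl",
    "2026-04-22 23:17:42 EDT [context-overflow-diag] sessionKey=agent:main:main ... sessionFile=/Users/.../df065d9f-9445-4401-a94b-61042c7eff40.jsonl",
]

for rotation in find_rotations(SAMPLE):
    print(rotation)
```

On the sample this reports one change, at 22:30:51, from b8a5c854-... to df065d9f-.... The later move to f36b8f4d-... does not appear because no overflow line was logged for that session id.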
Session files observed afterward:
old transcript still present:
~/.openclaw/agents/main/sessions/df065d9f-9445-4401-a94b-61042c7eff40.jsonl
new active transcript:
~/.openclaw/agents/main/sessions/f36b8f4d-bdca-41a2-b13c-a6ec3016218a.jsonl
Current sessions.json afterward pointed agent:main:main at the new session id instead of the prior active transcript.
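For anyone trying to confirm the same state locally, here is a rough sketch that lists transcripts on disk that no store entry references. It assumes sessions.json is a JSON object keyed by session key whose entries carry a sessionId field. That shape is my guess from the behavior described above, not a documented schema, and `orphaned_transcripts` is a hypothetical helper.

```python
import json
import tempfile
from pathlib import Path

def orphaned_transcripts(store_path, sessions_dir):
    """Return sorted transcript ids present on disk but not referenced by the store."""
    store = json.loads(Path(store_path).read_text())
    # ASSUMED shape: {"agent:main:main": {"sessionId": "<uuid>"}, ...}
    active = {
        entry.get("sessionId")
        for entry in store.values()
        if isinstance(entry, dict)
    }
    on_disk = {p.stem for p in Path(sessions_dir).glob("*.jsonl")}
    return sorted(on_disk - active)

# Self-contained demo that recreates the state described in this report in a temp dir.
with tempfile.TemporaryDirectory() as tmp:
    sessions = Path(tmp)
    old_id = "df065d9f-9445-4401-a94b-61042c7eff40"
    new_id = "f36b8f4d-bdca-41a2-b13c-a6ec3016218a"
    for sid in (old_id, new_id):
        (sessions / f"{sid}.jsonl").write_text("")
    store_file = sessions / "sessions.json"
    store_file.write_text(json.dumps({"agent:main:main": {"sessionId": new_id}}))
    result = orphaned_transcripts(store_file, sessions)

print(result)
```

In the demo the old transcript id is reported as orphaned while the new one is not, which matches what was observed on the affected machine.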
Expected behavior
When a main WebChat session overflows context or compaction/checkpoint persistence has trouble:
- OpenClaw should not silently orphan the active agent:main:main mapping
- the existing session should remain the active logical session unless the user explicitly resets it
- checkpoint persistence failure should not cause a hidden session rotation
- if recovery/new-session behavior does happen, it should be explicit and auditable in the UI and store
Actual behavior
- overflow happened
- compaction/checkpoint path logged "session not found"
- active WebChat main mapping later pointed to a different session id
- user-visible effect was "we lost the session again"
- prior transcript remained on disk, but continuity in the active UI/session mapping was broken
Possible root-cause area
This log pair looks like the key clue:
- context overflow in the embedded runner
- compaction checkpoint persistence cannot find the current session entry
The checkpoint persistence code already logs this exact case:
skipping compaction checkpoint persist: session not found
So one plausible failure mode is:
- overflow/compaction or related recovery mutates/removes the active session-store entry unexpectedly
- checkpoint persistence races or arrives after the store no longer contains the expected canonical key
- later inbound WebChat traffic reinitializes agent:main:main onto a new session id
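To make the hypothesis concrete, here is a toy model of that sequence. It is not OpenClaw's code: `ToySessionStore` and its methods are hypothetical, and the assumption that recovery drops the entry is exactly the unverified part of this report.

```python
import uuid

class ToySessionStore:
    """Hypothetical stand-in for the session store, used only to illustrate the hypothesis."""

    def __init__(self):
        self.entries = {}  # session key -> session id
        self.log = []

    def get_or_create(self, key):
        # Inbound message path: reuse the mapping if present, else silently mint a new id.
        if key not in self.entries:
            self.entries[key] = str(uuid.uuid4())
        return self.entries[key]

    def overflow_recovery(self, key):
        # ASSUMED bug: overflow/compaction recovery removes the active entry.
        self.entries.pop(key, None)

    def persist_checkpoint(self, key):
        # Mirrors the observed log line when the entry is missing.
        if key not in self.entries:
            self.log.append("skipping compaction checkpoint persist: session not found")
            return False
        return True

store = ToySessionStore()
before = store.get_or_create("agent:main:main")
store.overflow_recovery("agent:main:main")
persisted = store.persist_checkpoint("agent:main:main")
after = store.get_or_create("agent:main:main")

print("rotated silently:", before != after)
print("checkpoint persisted:", persisted)
print(store.log)
```

The point of the model is that nothing in this path ever reports a rotation: the only trace is the skipped-checkpoint log line, which is what the real logs show.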
Suggested fixes
- Treat agent:main:main disappearance during overflow/compaction as a high-severity invariant violation and log the old/new session ids plus store path.
- Prevent checkpoint persistence failure from leaving the active main session unmapped.
- Add an explicit recovery path that preserves the active session id unless the user intentionally resets.
- If a fallback/new session must be created, emit a visible/auditable session-rotation event in the transcript/store/UI.
- Add regression coverage for:
  - large WebChat direct session
  - context overflow during tool-heavy turn
  - post-overflow compaction/checkpoint handling
  - subsequent user message should continue same active session id
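A sketch of what the guard and the regression check could look like, written against a hypothetical in-memory store rather than OpenClaw's real one. `GuardedSessionStore`, its event fields, and `failing_write` are all illustrative names. The idea is that the mapping changes only through one rotation function that records an event, and a failed checkpoint write never touches the mapping.

```python
import uuid

class GuardedSessionStore:
    """Hypothetical store: every mapping change goes through _rotate and is recorded."""

    def __init__(self):
        self._entries = {}
        self.events = []  # auditable trail of rotations and checkpoint failures

    def _rotate(self, key, reason):
        old = self._entries.get(key)
        new = str(uuid.uuid4())
        self._entries[key] = new
        self.events.append({"key": key, "old": old, "new": new, "reason": reason})
        return new

    def resolve(self, key):
        # Inbound messages reuse the active id; creation is recorded, never silent.
        if key not in self._entries:
            return self._rotate(key, "initial")
        return self._entries[key]

    def reset(self, key):
        # The only path that replaces an existing mapping: an explicit user reset.
        return self._rotate(key, "user-reset")

    def persist_checkpoint(self, key, write):
        session_id = self._entries.get(key)
        if session_id is None:
            # Invariant violation: surface it loudly instead of skipping.
            raise LookupError(f"no active session for {key}")
        try:
            write(session_id)
        except OSError as exc:
            # Failure is recorded, but the active mapping is left untouched.
            self.events.append({"key": key, "old": session_id, "new": session_id,
                                "reason": f"checkpoint-failed: {exc}"})
            return False
        return True

def failing_write(session_id):
    # Simulates checkpoint persistence failing after an overflow.
    raise OSError("disk full")

def test_checkpoint_failure_keeps_session():
    store = GuardedSessionStore()
    sid = store.resolve("agent:main:main")
    assert store.persist_checkpoint("agent:main:main", failing_write) is False
    # The next user message must continue on the same session id.
    assert store.resolve("agent:main:main") == sid
    assert [e["reason"] for e in store.events] == ["initial", "checkpoint-failed: disk full"]
    # Only an explicit reset rotates, and the event records old and new ids.
    new_sid = store.reset("agent:main:main")
    assert new_sid != sid
    assert store.events[-1]["old"] == sid and store.events[-1]["new"] == new_sid

test_checkpoint_failure_keeps_session()
```

A real regression test would need to drive an actual overflow in a large WebChat direct session; this only pins down the expected invariant.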
Severity / impact
This is risky for long-running operational sessions because the user can believe they are continuing the same stateful conversation when the agent has actually been remapped onto a fresh session.
That is especially dangerous for write-capable local-admin workflows because the agent may continue from incomplete or reconstructed context while the user thinks continuity was preserved.