Skip to content

fix(gateway): preserve chat history across compaction checkpoint chains#68765

Closed
cholaolu-boop wants to merge 12 commits into
openclaw:mainfrom
cholaolu-boop:fix/read-session-messages-compaction-chain
Closed

fix(gateway): preserve chat history across compaction checkpoint chains#68765
cholaolu-boop wants to merge 12 commits into
openclaw:mainfrom
cholaolu-boop:fix/read-session-messages-compaction-chain

Conversation

@cholaolu-boop

@cholaolu-boop cholaolu-boop commented Apr 19, 2026

Copy link
Copy Markdown

Problem

When readSessionMessages is called for the configured main session (as
returned by resolveMainSessionKey"global", "agent:main:main", or
any agent:<default>:<mainKey> your config resolves to) and that session's
sessions.json entry carries a non-empty compactionCheckpoints[] chain,
the function only reads the single transcript file resolved by
resolveSessionTranscriptCandidates(...).find(existsSync).

That single-file read discards every pre-compaction transcript referenced
by the chain, so downstream consumers of readSessionMessages (notably
the chat.history gateway handler and anything that feeds it) see a
truncated history that silently ends at the most recent compaction
checkpoint boundary. Sequence numbers get reassigned starting from 1
inside the last segment only, which makes chain-continuity checks (gap
detection, dedup, ordering) report a healthy chain that is in fact
missing all pre-compaction turns.

The bug is not visible in sessions that have never been compacted (their
compactionCheckpoints[] is empty or absent), which is why it survives
basic smoke tests on fresh sessions.

Repro

  1. Run the configured main session long enough to trigger at least one
    compaction checkpoint (verify via sessions.json → the entry has
    compactionCheckpoints: [{ checkpointId: "...", preCompaction: { sessionFile: "..." }, ... }, ...]).
  2. Call
    openclaw gateway call chat.history --params '{"sessionKey":"<main_session_key>","limit":200}'.
  3. Observe that the returned messages[] starts at the most recent
    post-compaction segment. All messages from transcripts listed in
    compactionCheckpoints[].preCompaction.sessionFile are missing.
    Sequence numbers begin at 1 inside the returned window instead of
    being continuous across the chain.

Expected: the returned history spans every transcript referenced by the
checkpoint chain plus the current primary transcript, in mtime order,
with a single continuous seq counter.

Fix

Introduce a chained resolver in resolveMainSessionTranscriptFiles that
activates only for the configured main session key and walks the stored
checkpoint chain using the authoritative file references recorded at
compaction time:

  1. Look the session up in sessions.json via a new
    resolveSessionMessagesStoreContext(sessionId, storePath, sessionFile)
    helper.
  2. Gate the chained-read path on
    context.key === resolveMainSessionKeyFromConfig(). This respects
    session.scope ("global" short-circuit), session.mainKey, and
    agents.default — no hardcoded session-key literal. For any other
    session (isolated, subagent, heartbeat, explicit agent session) the
    resolver returns null and the legacy single-file path applies.
  3. Walk entry.compactionCheckpoints[] in recorded order. For each
    entry read checkpoint.preCompaction.sessionFile (the
    SessionCompactionTranscriptReference written by
    session-compaction-checkpoints.ts for files named like
    <name>.checkpoint.<uuid>.jsonl) and push that exact path.
  4. Append entry.sessionFile as the newest segment so the chain ends
    with the live primary transcript.
  5. Deduplicate by resolved path and drop non-existent files. If nothing
    resolves, return null and let the legacy single-file fallback apply.
  6. Sort the resulting list by mtimeMs ascending, with a filename
    localeCompare tie-break. fs.statSync results are cached per
    path so each file is stat'd once, not O(N log N) times inside the
    comparator.

readSessionMessages iterates over that file list and keeps a single
shared messageSeq counter across all files, so the seq on each
reassembled message is continuous across the full chain.

Scope is intentionally narrow:

  • No readdirSync sweep of the sessions directory.
  • No inclusion of .jsonl.reset.*, .jsonl.bak, or .jsonl.deleted.*
    sibling files. .bak is a pre-compaction snapshot that the checkpoint
    chain already represents via predecessor entries (reading both would
    double-count compacted turns). .reset.* archives are content the
    user explicitly cleared via /reset; replaying them would resurface
    cleared turns and break reset semantics
    (session-reset-service.ts preserves compactionCheckpoints[] across
    a reset).
  • No UUID-shape filter on session ids. The authoritative path already
    comes from the stored preCompaction.sessionFile reference, so no
    path-safety validation on raw session ids is needed.
  • No change to the inner parse loop for message / compaction
    entries.
  • No change to the resolveSessionTranscriptCandidates module.
  • Non-main sessions, sessions without a compactionCheckpoints[]
    chain, and any config-read error all fall through to the existing
    single-file path — behavior byte-compatible with today's upstream.

All changes are contained in src/gateway/session-utils.fs.ts.

Test notes for reviewers

Suggested test cases:

  • Single-file session with no compactionCheckpoints[] → behavior
    unchanged, same messages, same seq assignment.
  • Configured main session (via resolveMainSessionKeyFromConfig) with
    a two-step checkpoint chain whose entries carry
    preCompaction.sessionFile paths → returned messages span both
    checkpoint transcripts plus entry.sessionFile in mtime order, seq
    strictly monotonically increasing from 1 to N with no gaps.
  • Session whose store entry's sessionFile differs from the arg →
    entry mismatch is detected in resolveSessionMessagesStoreContext
    and does not produce chained reads.
  • Session where context.key !== resolveMainSessionKeyFromConfig()
    (subagent, heartbeat, explicit agent, or reconfigured main key) →
    resolver returns null, legacy single-file path applies.
  • Missing or corrupt sessions.json → store lookup returns null or
    { sessionsDir } only, the chained resolver returns null, legacy
    path applies.
  • A checkpoint entry with missing preCompaction.sessionFile → entry
    is skipped, remaining chain entries are still used.
  • Config-read failure during resolveMainSessionKeyFromConfig → caught
    and treated as "not the main session"; legacy path applies.
  • Two files with identical mtimeMs → deterministic ordering via
    filename localeCompare tie-break.

Relationship to #60409

#60409 ("fix(session): fall back to reset archive in chat.history when
primary transcript is missing") addresses a different failure mode in
the same function: it triggers when the primary transcript file is
absent and falls back to the latest .reset.<timestamp> archive. Its
trigger condition is !filePath && sessionId.

This PR addresses a distinct scenario: the primary transcript is
present, but the returned history misses every pre-compaction
transcript referenced via compactionCheckpoints[]. Trigger condition
is the existence of a non-empty checkpoint chain on the configured
main session.

The two fixes handle disjoint failure paths and stack additively —
neither removes functionality the other needs. If #60409 merges first,
this PR rebases mechanically: the resolveTranscriptReadFiles
dispatcher keeps the existing single-file resolution plus #60409's
reset-archive fallback intact as the non-chained path, and layers the
compaction-chain walk in front of it for the configured main session.
Happy to coordinate during review.

Out of scope

  • Boundary-marker synthetic system messages between transcript
    segments (kept for a possible follow-up).
  • The reset-archive fallback for non-chained sessions — handled by
    fix(session): fall back to reset archive in chat.history when primary transcript is missing #60409.
  • Any sanitize-pipeline, threshold, or oversized-message-placeholder
    changes in chat.history.
  • Extending the chained read to sessions other than the configured
    main (subagent, heartbeat, explicit agent sessions) — possible
    follow-up, but requires a separate conversation about which session
    shapes should carry compaction checkpoints.
  • Upstream-main rebase of this branch (force-push required,
    reviewer-gated).

…hain for agent:main:main

For `agent:main:main` sessions whose `sessions.json` entry carries a
non-empty `compactionCheckpoints[]` chain, `readSessionMessages` only
read the single transcript resolved by `resolveSessionTranscriptCandidates`.
All predecessor transcripts were silently dropped, so consumers of
`chat.history` saw a history truncated at the last compaction boundary,
with `seq` reset to 1 inside the trailing segment.

Introduce a multi-file transcript resolver that, for `agent:main:main`
only, walks `compactionCheckpoints[]` plus the current session id and
collects every existing transcript variant (`.jsonl`, `.jsonl.reset.*`,
`.jsonl.deleted.*`, `.jsonl.bak`, excluding `.lock`), deduplicating and
returning them in `mtimeMs` order. `readSessionMessages` iterates this
file list with a single shared `messageSeq` counter so the reassembled
history is continuous across the full chain.

For non-`agent:main:main` sessions, and when the chained lookup yields
no files, `resolveTranscriptReadFiles` falls back to the existing
single-file path. Behavior byte-compatible with today's upstream there.
Inner parse loop for `message` / `compaction` entries unchanged. No UX
boundary markers, no sanitize-pipeline or threshold changes.
@openclaw-barnacle openclaw-barnacle Bot added gateway Gateway runtime size: M labels Apr 19, 2026
@greptile-apps

greptile-apps Bot commented Apr 19, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR adds multi-file transcript resolution to readSessionMessages so that agent:main:main sessions with compactionCheckpoints[] chains return a unified, continuously-sequenced history across all predecessor transcripts.

  • The resetFiles scan at lines 197–218 filters by .jsonl.reset. in any filename across the entire sessionsDir, not just files belonging to sessions in the checkpoint chain. In directories with multiple sessions this will inject a foreign session's reset archive into the current session's history — listExistingTranscriptVariantsForSessionId already handles reset variants for each chained session ID, so the block is redundant for the intended case and harmful for the general case.

Confidence Score: 3/5

Not safe to merge as-is: the unfiltered reset-file scan can corrupt session history with data from unrelated sessions.

The overall approach is architecturally sound and the compaction-chain walking logic is correct. However, the reset-file collection loop does not constrain itself to session IDs in the checkpoint chain, which is a present data-integrity defect — foreign messages will be returned in the history response. This needs to be fixed before merging.

src/gateway/session-utils.fs.ts — specifically the resetFiles IIFE in resolveMainSessionTranscriptFiles

Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/gateway/session-utils.fs.ts
Line: 197-218

Comment:
**Unfiltered reset-file scan includes files from unrelated sessions**

`resetFiles` is built by scanning the entire `sessionsDir` for any filename containing `.jsonl.reset.` — with no check that the name belongs to a session ID in the checkpoint chain. The latest such file is then pushed unconditionally into the transcript list.

In a directory that holds transcripts for multiple sessions (the typical case), this picks up reset archives from completely unrelated sessions. `pushFile` deduplicates by path, but it doesn't filter by session ownership, so a foreign session's archive ends up merged into the current chain. Messages from the wrong session are then re-sequenced and returned to callers of `readSessionMessages`.

The existing `listExistingTranscriptVariantsForSessionId` already collects `.jsonl.reset.*` variants for every session ID in `orderedSessionIds` (because `name.startsWith(`${sessionId}.jsonl.`)` matches them). The separate `latestReset` block is therefore both redundant for in-chain sessions and harmful for out-of-chain ones. The safest fix is to remove the `resetFiles` block entirely and rely on the per-session-ID collection loop below it.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/gateway/session-utils.fs.ts
Line: 248-254

Comment:
**`fs.statSync` called inside sort comparators causes repeated syscalls**

Both `listExistingTranscriptVariantsForSessionId` and the final `files.toSorted` in `resolveMainSessionTranscriptFiles` call `fs.statSync` inside their comparator lambdas. A sort over N files triggers O(N log N) comparisons, each issuing a live syscall. After `listExistingTranscriptVariantsForSessionId` pre-sorts its list, `resolveMainSessionTranscriptFiles` immediately re-sorts the same files — doubling the syscall budget unnecessarily.

Pre-compute mtimes into a `Map<string, number>` before sorting and reuse those values in both places, or stat each file once and store the result alongside the path.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix(gateway): read session transcripts a..." | Re-trigger Greptile

Comment thread src/gateway/session-utils.fs.ts Outdated
Comment on lines +197 to +218
const resetFiles: string[] = (() => {
try {
const parsed = JSON.parse(line);
if (parsed?.message) {
messageSeq += 1;
messages.push(
attachOpenClawTranscriptMeta(parsed.message, {
...(typeof parsed.id === "string" ? { id: parsed.id } : {}),
seq: messageSeq,
}),
);
return fs
.readdirSync(context.sessionsDir)
.filter((name) => name.includes(".jsonl.reset."))
.map((name) => path.join(context.sessionsDir, name))
.filter((filePath) => fs.existsSync(filePath))
.toSorted((a, b) => {
try {
return fs.statSync(a).mtimeMs - fs.statSync(b).mtimeMs;
} catch {
return a.localeCompare(b);
}
});
} catch {
return [];
}
})();
const latestReset = resetFiles[resetFiles.length - 1];
if (latestReset) {
pushFile(latestReset);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Unfiltered reset-file scan includes files from unrelated sessions

resetFiles is built by scanning the entire sessionsDir for any filename containing .jsonl.reset. — with no check that the name belongs to a session ID in the checkpoint chain. The latest such file is then pushed unconditionally into the transcript list.

In a directory that holds transcripts for multiple sessions (the typical case), this picks up reset archives from completely unrelated sessions. pushFile deduplicates by path, but it doesn't filter by session ownership, so a foreign session's archive ends up merged into the current chain. Messages from the wrong session are then re-sequenced and returned to callers of readSessionMessages.

The existing listExistingTranscriptVariantsForSessionId already collects .jsonl.reset.* variants for every session ID in orderedSessionIds (because name.startsWith(${sessionId}.jsonl.) matches them). The separate latestReset block is therefore both redundant for in-chain sessions and harmful for out-of-chain ones. The safest fix is to remove the resetFiles block entirely and rely on the per-session-ID collection loop below it.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/gateway/session-utils.fs.ts
Line: 197-218

Comment:
**Unfiltered reset-file scan includes files from unrelated sessions**

`resetFiles` is built by scanning the entire `sessionsDir` for any filename containing `.jsonl.reset.` — with no check that the name belongs to a session ID in the checkpoint chain. The latest such file is then pushed unconditionally into the transcript list.

In a directory that holds transcripts for multiple sessions (the typical case), this picks up reset archives from completely unrelated sessions. `pushFile` deduplicates by path, but it doesn't filter by session ownership, so a foreign session's archive ends up merged into the current chain. Messages from the wrong session are then re-sequenced and returned to callers of `readSessionMessages`.

The existing `listExistingTranscriptVariantsForSessionId` already collects `.jsonl.reset.*` variants for every session ID in `orderedSessionIds` (because `name.startsWith(`${sessionId}.jsonl.`)` matches them). The separate `latestReset` block is therefore both redundant for in-chain sessions and harmful for out-of-chain ones. The safest fix is to remove the `resetFiles` block entirely and rely on the per-session-ID collection loop below it.

How can I resolve this? If you propose a fix, please make it concise.

Comment thread src/gateway/session-utils.fs.ts Outdated
Comment on lines +248 to +254
return files.toSorted((a, b) => {
try {
return fs.statSync(a).mtimeMs - fs.statSync(b).mtimeMs;
} catch {
return a.localeCompare(b);
}
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 fs.statSync called inside sort comparators causes repeated syscalls

Both listExistingTranscriptVariantsForSessionId and the final files.toSorted in resolveMainSessionTranscriptFiles call fs.statSync inside their comparator lambdas. A sort over N files triggers O(N log N) comparisons, each issuing a live syscall. After listExistingTranscriptVariantsForSessionId pre-sorts its list, resolveMainSessionTranscriptFiles immediately re-sorts the same files — doubling the syscall budget unnecessarily.

Pre-compute mtimes into a Map<string, number> before sorting and reuse those values in both places, or stat each file once and store the result alongside the path.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/gateway/session-utils.fs.ts
Line: 248-254

Comment:
**`fs.statSync` called inside sort comparators causes repeated syscalls**

Both `listExistingTranscriptVariantsForSessionId` and the final `files.toSorted` in `resolveMainSessionTranscriptFiles` call `fs.statSync` inside their comparator lambdas. A sort over N files triggers O(N log N) comparisons, each issuing a live syscall. After `listExistingTranscriptVariantsForSessionId` pre-sorts its list, `resolveMainSessionTranscriptFiles` immediately re-sorts the same files — doubling the syscall budget unnecessarily.

Pre-compute mtimes into a `Map<string, number>` before sorting and reuse those values in both places, or stat each file once and store the result alongside the path.

How can I resolve this? If you propose a fix, please make it concise.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f90c0e0042

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/gateway/session-utils.fs.ts Outdated
);
return fs
.readdirSync(context.sessionsDir)
.filter((name) => name.includes(".jsonl.reset."))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Scope reset-archive fallback to the target session

This reset-file scan is global to sessionsDir, so resolveMainSessionTranscriptFiles can prepend the most recent .jsonl.reset.* from an unrelated (or previously reset) session when reading agent:main:main. In practice, once any reset archive exists in the store, readSessionMessages may merge stale/cross-session turns into the current chat history, which breaks reset semantics and can expose messages from the wrong session. The fallback should be constrained to the current session/checkpoint chain instead of using any reset file in the directory.

Useful? React with 👍 / 👎.

… cache stat for sort

Addresses review feedback on openclaw#68765 (greptile P1 + codex P1, greptile P2).

P1 (data integrity): remove the unscoped `.jsonl.reset.*` scan over
`sessionsDir` in `resolveMainSessionTranscriptFiles`. The previous
implementation picked the most recent reset archive in the directory
without filtering by session ownership, so reset files from unrelated
sessions could be merged into the returned `agent:main:main` history.
`listExistingTranscriptVariantsForSessionId` already collects
`<sessionId>.jsonl.reset.*` variants for every session id in the
checkpoint chain (including the current one), so the separate global
scan was redundant for in-chain sessions and actively harmful for
out-of-chain ones.

P2 (performance): pre-compute `fs.statSync().mtimeMs` per file into a
Map before sorting the merged transcript list, so each file is stat'd
once instead of O(N log N) times inside the comparator. Drop the
redundant inner sort in `listExistingTranscriptVariantsForSessionId`
now that the single outer sort in `resolveMainSessionTranscriptFiles`
is the authoritative ordering step across the full merged list.

Behavior: unchanged for any session that has a `.jsonl` or rotated
variant matching a session id in its checkpoint chain — which is the
designed and tested case. Behavior improvement: reset archives of
unrelated sessions are no longer merged into the response.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 619ed80284

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/gateway/session-utils.fs.ts Outdated
try {
return fs
.readdirSync(sessionsDir)
.filter((name) => name === `${sessionId}.jsonl` || name.startsWith(`${sessionId}.jsonl.`))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Stop reading .bak archives as live transcript segments

This variant matcher pulls in every file that starts with ${sessionId}.jsonl. so it will include .bak archives, but sessions.compact creates those by renaming the full transcript and then writing a truncated replacement (src/gateway/server-methods/sessions.ts:1631-1633). Reading both files replays the same turns twice (and can reintroduce lines intentionally compacted away), so readSessionMessages can return duplicated/stale history and inflated seq counts after compaction.

Useful? React with 👍 / 👎.

Comment thread src/gateway/session-utils.fs.ts Outdated
Comment on lines +205 to +206
[...checkpointSessionIds, sessionId].filter((value) => looksLikeGeneratedSessionId(value)),
),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Remove UUID-only filtering from checkpoint session traversal

The new chain resolver drops any session/checkpoint id that is not UUID-shaped, so those sessions immediately fall back to the old single-file read path. Session IDs are not UUID-only in this codebase (validateSessionId accepts broader IDs in src/config/sessions/paths.ts:61-69), so compacted main sessions with legacy/custom IDs will still lose pre-checkpoint history despite this fix.

Useful? React with 👍 / 👎.

…ID_RE

Addresses the second codex review pass on openclaw#68765 (P1 + P2 on commit
619ed80).

P1 (data integrity): restrict listExistingTranscriptVariantsForSessionId
to the primary `.jsonl` and `.jsonl.reset.*` files. The previous matcher
accepted every `${sessionId}.jsonl.*` filename, which included `.bak`
archives created by `sessions.compact` as pre-compaction snapshots. Read-
ing those alongside the truncated post-compaction `.jsonl` would double-
count the compacted turns and re-emit messages that compaction intention-
ally removed from the live transcript. Narrowing the matcher to the two
known-safe suffixes drops `.bak` (and `.deleted.*`) from the chain.

P2 (correctness of chain coverage): replace the UUID-only local
`looksLikeGeneratedSessionId` guard with the codebase's own
`SAFE_SESSION_ID_RE` from `src/config/sessions/paths.ts`. Session IDs in
this codebase are not UUID-only (e.g. `boot-<timestamp>-<shortid>` boot
sessions); the previous filter silently excluded them from the chain
walk and fell back to the legacy single-file path. Using the already-
exported SAFE_SESSION_ID_RE accepts the same set of IDs that
`validateSessionId` accepts, so every session id that can legitimately
appear in `compactionCheckpoints[]` or as the current sessionId is now
walked.

Behavior: for `agent:main:main` sessions with non-UUID session IDs, the
chain walk now activates (previously inactive, silently falling through
to single-file read); for UUID-based sessions, behavior is unchanged
except `.bak` / `.deleted.*` siblings are no longer merged. No reset-
archive behavior change — `.jsonl.reset.*` still participates in the
chain as before.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b5842d3a9f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/gateway/session-utils.fs.ts Outdated
try {
return fs
.readdirSync(sessionsDir)
.filter((name) => name === jsonlName || name.startsWith(resetPrefix))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Exclude reset archives from checkpoint-chain replay

This variant filter includes *.jsonl.reset.* for every checkpoint session id, so a main session that has been reset after compaction can replay pre-reset transcripts: reset preserves compactionCheckpoints (src/gateway/session-reset-service.ts), resolveMainSessionTranscriptFiles walks those old ids, and this line pulls their reset archives back into readSessionMessages. In that flow, chat.history shows messages that were explicitly cleared by reset, which breaks reset semantics.

Useful? React with 👍 / 👎.

const da = getMtime(a);
const db = getMtime(b);
if (da !== null && db !== null) {
return da - db;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Add deterministic tie-break for equal mtimes

The chain sort returns only da - db when both mtimes are present, so equal timestamps produce 0 and preserve the original readdirSync order (which is not guaranteed). On filesystems with coarse timestamp resolution, this can reorder transcript segments nondeterministically, changing replay order and assigned seq values for the same session history. Add a stable secondary key (for example filename or explicit checkpoint order) when da === db.

Useful? React with 👍 / 👎.

…ise sort ties

Addresses third codex review pass on openclaw#68765 (both P1 and P2 on commit
b5842d3).

P1 (reset-semantics correctness): drop .jsonl.reset.* siblings from the
compaction-chain replay. resolveMainSessionTranscriptFiles now reads
only ${sessionId}.jsonl per checkpoint session id. Upstream
session-reset-service.ts preserves compactionCheckpoints[] across a
/reset, so including per-session .reset.* archives would resurrect
content the user explicitly cleared via reset and violate reset
semantics. Reset-archive fallback for the non-chained path remains out
of scope for this PR and is handled by openclaw#60409.

P2 (determinism): when both compared file mtimes are equal, fall
through to a filename localeCompare tie-break instead of returning 0
and relying on readdir order. Filesystems with coarse timestamp
resolution (FAT32, some network mounts) or rapid writes can produce
identical mtimes; without a secondary key, the replay order and
assigned seq values are non-deterministic across runs and platforms.

Behavior: for sessions whose chained transcripts have distinct mtimes,
unchanged. For sessions with coincident mtimes, deterministic and
stable. Reset-after-compaction scenarios no longer resurrect cleared
content. No change to the non-chained fallback path.

Also removes the now-redundant listExistingTranscriptVariantsForSessionId
helper (its single call site used only the primary-jsonl lookup after
this change); inlined into resolveMainSessionTranscriptFiles for
clarity.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 148bfce6b4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/gateway/session-utils.fs.ts Outdated
sessionFile?: string,
): string[] | null {
const context = resolveSessionMessagesStoreContext(sessionId, storePath, sessionFile);
if (!context?.sessionsDir || context.key !== "agent:main:main") {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Support configured main-session keys in chain replay

Gate the chained-read path on the resolved session being the configured main key, not the hardcoded literal agent:main:main. OpenClaw supports non-default main keys/agents (for example agent:ops:work), and for those installs this condition is always false, so readSessionMessages falls back to single-file reads and still truncates history at compaction boundaries for the actual main session.

Useful? React with 👍 / 👎.

Comment thread src/gateway/session-utils.fs.ts Outdated
new Set([...checkpointSessionIds, sessionId].filter((value) => SAFE_SESSION_ID_RE.test(value))),
);
for (const chainedSessionId of orderedSessionIds) {
const filePath = path.join(context.sessionsDir, `${chainedSessionId}.jsonl`);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Load checkpoint snapshots instead of only <sessionId>.jsonl

Resolve chained transcript files from checkpoint file references, not just path.join(sessionsDir, ${id}.jsonl). Compaction checkpoints persist pre-compaction snapshots as *.checkpoint.*.jsonl, and post-compaction truncation can remove older turns from the live <sessionId>.jsonl; with the current lookup, those snapshot-only turns are never read, so the new chain logic can still return history that drops pre-compaction messages.

Useful? React with 👍 / 👎.

…onFile refs and configured main key

Rewrites resolveMainSessionTranscriptFiles in src/gateway/session-utils.fs.ts
to deliver the stated PR objective instead of being a de-facto no-op.

What was wrong in the previous revision
----------------------------------------

1. Hardcoded main-session-key literal: the chain walk was gated on
   `context.key !== "agent:main:main"`. The codebase resolves the main
   session key via `resolveMainSessionKey` / `resolveMainSessionKeyFromConfig`,
   which respects `session.scope` (`"global"` short-circuit), `session.mainKey`,
   and the `agents.default` selection. Any non-default config silently
   disabled the chain walk.

2. Wrong file-resolution for chained transcripts: each chained sessionId
   was looked up via `path.join(sessionsDir, `${chainedSessionId}.jsonl`)`.
   That does not match the on-disk convention. session-compaction-checkpoints.ts
   writes pre-compaction snapshots as `<name>.checkpoint.<uuid>.jsonl` and
   records the authoritative path on each checkpoint entry under
   `preCompaction.sessionFile`. In practice the previous resolver found no
   files, returned null, and left callers on the legacy single-file path —
   the PR delivered nothing for real compaction chains.

3. Broad variant enumeration: `.jsonl.reset.*`, `.jsonl.bak`, `.jsonl.deleted.*`
   siblings were pulled into the chain. `.bak` is a pre-compaction snapshot
   already represented by a predecessor checkpoint entry and double-counts
   compacted turns. `.reset.*` archives resurface content that the user
   explicitly cleared via /reset, breaking reset semantics
   (session-reset-service.ts preserves `compactionCheckpoints[]` across a
   reset, so the previous code could replay cleared turns for every chained
   session id).

What this commit does
---------------------

- Imports `resolveMainSessionKeyFromConfig` from
  `../config/sessions/main-session.runtime.js`. The chained read activates
  only when `context.key === resolveMainSessionKeyFromConfig()`; on a
  config read error the function returns null and the legacy path takes
  over. Non-main sessions (isolated, subagent, heartbeat, explicit agent
  sessions) stay on the legacy single-file path.

- Walks `entry.compactionCheckpoints` in recorded order. For each entry
  reads `preCompaction.sessionFile` and pushes that exact path. Guards
  against non-object entries and missing fields.

- Appends `entry.sessionFile` as the newest segment so the chain ends
  with live content.

- No readdir scan. No `.reset.*`, `.bak`, or `.deleted.*` inclusion.

- Deterministic sort: ascending mtime with filename localeCompare
  tie-break. fs.statSync cached per path (one syscall per file instead
  of O(N log N) inside the comparator).

- Dropped SAFE_SESSION_ID_RE import and the UUID-shape filter; no
  filenames are constructed from session ids anymore, so path-safety
  validation is no longer required.

Review cluster addressed
------------------------

- greptile P1 (unfiltered reset-file scan leaks foreign-session archives):
  resolved — no directory scan remains.
- greptile P2 (fs.statSync inside sort comparator): resolved — cached.
- codex P1 on 619ed80 (scope reset-archive fallback to target session):
  resolved — no reset fallback in chain path at all.
- codex P1 on b5842d3 (stop reading .bak as live segments): resolved.
- codex P2 on b5842d3 (remove UUID-only filter): resolved — filter
  dropped; non-UUID session ids no longer excluded.
- codex P1 on 148bfce (exclude reset archives from replay / reset
  semantics): resolved — no reset replay.
- codex P2 on 148bfce (deterministic tie-break): resolved — retained.
- codex P2 line 152 (gate on configured main key, not literal): resolved.
- codex P1 line 182 (resolve chained files from preCompaction.sessionFile,
  not ${sessionId}.jsonl): resolved.

Behavior
--------

- Pre-patch (no-op in practice): no regression, no real improvement.
- Post-patch: for the configured main session with a non-empty
  `compactionCheckpoints[]` chain, `chat.history` returns messages from
  every recorded pre-compaction snapshot plus the current primary
  transcript, in mtime order, with a single continuous seq counter.
  Reset-cleared content stays cleared. Fallback for non-chained and
  non-main sessions is byte-compatible with today's upstream.

Scope strictly limited to src/gateway/session-utils.fs.ts.
Refs openclaw#68765

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5c9e2108a5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +201 to +203
const reference = (preCompaction as Record<string, unknown>).sessionFile;
if (typeof reference === "string" && reference.trim()) {
pushFile(reference.trim());

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Skip pre-reset checkpoints when rebuilding transcript chain

This loop replays every compactionCheckpoints[*].preCompaction.sessionFile without checking whether the checkpoint belongs to the current post-reset session. Because reset preserves compactionCheckpoints on the new entry (src/gateway/session-reset-service.ts:612), a compacted session that is later reset will have readSessionMessages prepend old checkpoint snapshots and chat.history can show messages the user explicitly cleared. Restrict checkpoint replay to the active session/reset epoch (or otherwise suppress pre-reset checkpoints) before adding these files.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid concern. session-reset-service.ts:612 explicitly preserves compactionCheckpoints[] on the new entry after /reset, so this loop as-is will replay pre-reset checkpoint snapshots if the user resets a compacted session. I have not addressed this in the current PR because the right fix is a data-model decision, not a mechanical change:

  • Option A — filter checkpoints by comparing checkpoint.createdAt against a reset timestamp on the entry (requires a new resetAt / resetEpoch field on SessionEntry, since it doesn't currently carry one).
  • Option B — suppress the chain walk entirely when the entry shows reset markers (requires defining what a "reset marker" on the entry is).
  • Option C — change the reset service to drop compactionCheckpoints[] on reset (conflicts with the intent at session-reset-service.ts:612, which looks deliberate).

Each needs a maintainer call on what the intended semantics are for "chain history survives reset" vs "reset clears everything". Happy to open a dedicated follow-up PR for whichever direction you pick.

For this PR's narrow scope (restoring the chain read for the configured main session), the P2 canonicalization fix just landed as 6ab4646.

Comment thread src/gateway/session-utils.fs.ts Outdated
Comment on lines +166 to +167
if (context.key !== configuredMainKey) {
return null;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Canonicalize legacy main keys before enabling chain replay

Chained replay is currently enabled only when the raw store key exactly equals resolveMainSessionKeyFromConfig(). In stores that still use legacy/mixed-case aliases for the same main session key (a case the gateway explicitly supports elsewhere via case-insensitive legacy-key scans), this condition fails and readSessionMessages silently falls back to single-file reads, so compaction checkpoint history is still truncated for those users. Compare against canonicalized/alias-aware key forms instead of strict raw-key equality.

Useful? React with 👍 / 👎.

… replay

Addresses codex P2 review comment on 5c9e210 (openclaw#68765#discussion_r3111205457).

Chained transcript replay was gated on raw-key equality:
`context.key === resolveMainSessionKeyFromConfig()`. Stores with legacy or
mixed-case aliases of the configured main session key (`agent:main:main`
while the configured default agent is e.g. "ops", `agent:main:<configured
mainKey>`, `agent:<configured agent>:main`, the bare "main" / mainKey
forms) did not match and silently fell back to the legacy single-file
path, so compaction-chain history stayed truncated for those users.
Gateway write paths canonicalize the same way (see auto-reply/reply/
session.ts).

Replaces the raw comparison with `canonicalizeMainSessionAlias` from
`src/config/sessions/main-session.ts`, passed the resolved default agent
id and the full `OpenClawConfig`. The canonicalizer:

  - short-circuits to "global" when session.scope is "global" and the
    input is any main alias;
  - collapses the six legacy/aliased main-key shapes to the canonical
    `agent:<default>:<mainKey>` form;
  - returns the raw key unchanged for genuine non-main sessions
    (isolated, subagent, heartbeat, explicit agent sessions).

`canonicalContextKey === resolveMainSessionKey(cfg)` is now the gate.
Config-load failure still falls through to the legacy path.

Imports: +resolveDefaultAgentId from ../agents/agent-scope-config.js,
+loadConfig from ../config/config.js, +canonicalizeMainSessionAlias and
+resolveMainSessionKey from ../config/sessions/main-session.js, drops
the -resolveMainSessionKeyFromConfig shortcut (no longer needed).

Deliberately NOT addressed in this commit: codex P1 on the same review
(r3111205471, "Skip pre-reset checkpoints when rebuilding transcript
chain"). Pre-reset checkpoint filtering is a data-model question about
whether checkpoints should carry a reset-epoch tag or whether chain-walk
should be suppressed entirely when the entry shows reset markers, both
of which are beyond this PR's narrow scope of restoring chain-read for
the configured main session. Happy to coordinate a follow-up PR once
that approach is agreed.

Scope still limited to src/gateway/session-utils.fs.ts.
Refs openclaw#68765

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6ab4646526

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/gateway/session-utils.fs.ts Outdated
Comment on lines +136 to +137
if (targetSessionFile && entrySessionFile && entrySessionFile !== targetSessionFile) {
continue;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Require sessionFile match when target transcript is provided

When sessionFile is passed in, this filter still accepts store rows that have the same sessionId but no record.sessionFile, because the mismatch check only runs when both sides are present. In stores with duplicate/legacy rows for one session id, resolveSessionMessagesStoreContext can return a stale row without a transcript path; resolveMainSessionTranscriptFiles then cannot append the live transcript from entry.sessionFile and may replay only checkpoint snapshots, dropping the newest turns from readSessionMessages.

Useful? React with 👍 / 👎.

Comment on lines +261 to +262
const chained = resolveMainSessionTranscriptFiles(sessionId, storePath, sessionFile);
if (chained && chained.length > 0) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid unconditional chained-resolver I/O on every message read

resolveTranscriptReadFiles always invokes resolveMainSessionTranscriptFiles, which synchronously parses sessions.json and calls loadConfig() before it can decide that chaining is not applicable. Since readSessionMessages is used on hot request paths, this adds blocking disk I/O and JSON parsing to non-main session reads as well, creating avoidable latency that grows with session-store size.

Useful? React with 👍 / 👎.

PAAI added 2 commits April 20, 2026 16:40
…vided

Rows with no record.sessionFile bypassed the entrySessionFile guard
when targetSessionFile was set, allowing a stale/legacy row to be
returned as the store context. Drop the redundant entrySessionFile &&
guard so any row without a matching sessionFile is skipped when the
caller specifies a target file.

Refs openclaw#68765
…ckpoints

resolveTranscriptReadFiles is on the hot path for every message read,
but resolveMainSessionTranscriptFiles parsed sessions.json via the
store context loader AND then called loadConfig() + the main-key
canonicalizer before deciding whether chaining applied. The existing
compactionCheckpoints[] guard sat after the config work, so isolated,
subagent, heartbeat, and explicit-agent session reads all paid for
config I/O that was always thrown away.

Move the checkpoint-array check ahead of the config/canonicalize work.
Behavior is unchanged: main sessions with a checkpoint chain still walk
the chain, and everything else still falls through to the legacy
single-file path. The only difference is that the loadConfig() and
canonicalizeMainSessionAlias() costs are now gated on the cheap
"entry has compactionCheckpoints" field check, which is false for the
vast majority of reads.

Refs openclaw#68765

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c877c3ad4b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +220 to +223
const reference = (preCompaction as Record<string, unknown>).sessionFile;
if (typeof reference === "string" && reference.trim()) {
pushFile(reference.trim());
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Ignore pre-restore checkpoints when chaining transcripts

When a user runs sessions.compaction.restore, the store keeps prior compactionCheckpoints while switching entry.sessionFile to a new transcript forked from the selected checkpoint snapshot (src/gateway/server-methods/sessions.ts:1142-1162). This loop always replays every preCompaction.sessionFile, so readSessionMessages reads both the checkpoint snapshot and the restored transcript, duplicating the same turns and inflating chat.history sequence/message counts right after restore. Restrict checkpoint replay to the active restore/reset epoch (or otherwise suppress snapshots that are already represented by the current transcript).

Useful? React with 👍 / 👎.

Add two red-bar fixtures for readSessionMessages that pin the
compaction-chain epoch semantics we expect, but that the current
replay-everything logic in resolveMainSessionTranscriptFiles does not
honor:

1. Compact -> Reset -> Read: performGatewaySessionReset keeps
   compactionCheckpoints[] on the new entry, so the chain walk
   resurfaces pre-reset preCompaction.sessionFile snapshots in the
   replay. The user explicitly archived that content; it must not
   reappear.

2. Compact -> Restore -> Read: sessions.compaction.restore forks a new
   transcript from a chosen checkpoint's preCompaction.sessionFile and
   writes it as the entry.sessionFile while preserving
   compactionCheckpoints[] (preserveCompactionCheckpoints: true). The
   chain walk then reads the checkpoint snapshot AND the forked copy,
   and also replays checkpoints taken between the restore target and
   the restore point.

These tests are intentionally left failing. The fix requires a product
decision about reset/restore epoch semantics (prune checkpoints at
/reset? suppress post-target checkpoints after restore? stamp a
resetEpoch / restoreEpoch on the entry?) rather than another mechanical
patch to the chain walker. Once that decision is made, a targeted fix
should turn these red tests green.

Separate follow-up noted but not addressed here:
canonicalizeMainSessionAlias (src/config/sessions/main-session.ts) has
no dedicated test file. Legacy alias coverage (agent:main:<mainKey> vs.
agent:<default>:<mainKey>, scope=global) should be pinned in its own
small test before the chain-walker is hardened.

Refs openclaw#68765

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 451fed833e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +211 to +213
for (const checkpoint of rawCheckpoints) {
if (!checkpoint || typeof checkpoint !== "object" || Array.isArray(checkpoint)) {
continue;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Stop replaying checkpoints from prior reset/restore epochs

resolveMainSessionTranscriptFiles replays every compactionCheckpoints[*].preCompaction.sessionFile without filtering to the active reset/restore epoch, so after /reset or sessions.compaction.restore it can prepend snapshots that were intentionally discarded. In those flows readSessionMessages can resurface cleared history and double-count restored content (checkpoint snapshot + restored fork), which is a correctness/privacy regression in the new chain logic; the newly added tests in src/gateway/session-utils.fs.test.ts for reset/restore are fresh evidence of this failure mode.

Useful? React with 👍 / 👎.

PAAI added 2 commits April 20, 2026 21:30
resolveSessionMessagesStoreContext parsed sessions.json directly via
JSON.parse(fs.readFileSync(...)) on every call. Switch to the existing
loadSessionStore() helper from config/sessions/store.js so the chain
walker reuses the shared session-store cache instead of paying a fresh
parse for each readSessionMessages call.

Pure refactor: no behavior change for single reads; repeated reads now
reuse the cached parse.

Refs openclaw#68765
Add four green-bar tests that pin the current chain-walk semantics so
the behavior can't silently regress under future edits:

- skips stale duplicate rows without matching sessionFile: covers the
  sessionFile-match guard added by df6b239. A duplicate store row
  for the same sessionId without a matching record.sessionFile must be
  skipped; only the row whose sessionFile matches targetSessionFile
  provides the chain.
- chains checkpoints for configured main session even when the store
  key uses a legacy alias: exercises canonicalizeMainSessionAlias on
  the chain-activation gate. A store keyed "agent:main:main" under a
  config where the default agent is "ops" and session.mainKey is
  "work" must still canonicalize to the configured main key and walk
  the checkpoint chain.
- keeps non-main sessions on the legacy single-file fallback even when
  checkpoints exist: confirms the canonical-key mismatch branch. A
  non-main store entry with compactionCheckpoints[] must not trigger
  chain replay; only entry.sessionFile is read.
- reuses the cached session store on repeated chained reads: asserts
  that loadSessionStore's cache eliminates repeated fs.readFileSync
  calls for the same storePath inside the cache TTL.

Refs openclaw#68765

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4e26f54940

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +221 to +223
const reference = (preCompaction as Record<string, unknown>).sessionFile;
if (typeof reference === "string" && reference.trim()) {
pushFile(reference.trim());

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Trim checkpoint snapshots at compaction boundaries

This code appends every preCompaction.sessionFile as a full transcript segment, but compaction checkpoints already record firstKeptEntryId precisely because each next transcript keeps an unsummarized tail from the prior one (see src/agents/pi-embedded-runner/session-truncation.ts:84-93). With two or more normal compactions, replaying whole snapshots here duplicates retained turns, so readSessionMessages returns inflated history and incorrect seq progression even without reset/restore flows. The chain reader should cut each snapshot at its boundary instead of concatenating entire files.

Useful? React with 👍 / 👎.

@cholaolu-boop

Copy link
Copy Markdown
Author

Added production-derived main-session checkpoint-chain evidence, plus a focused readSessionMessages regression test for the checkpoint-chain read path. Targeted test is green on this branch. Reset/restore remains intentionally out of scope for this evidence block.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 711c0d1c23

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +140 to +144
return {
key,
entry: value,
sessionsDir,
};

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Prefer the canonical matching store row before returning

In resolveSessionMessagesStoreContext, the loop returns on the first row whose sessionId (and optional sessionFile) matches, so duplicate rows for the same session can select a stale/non-main entry and stop the scan early. If that first match has no compactionCheckpoints (or a non-main key), resolveMainSessionTranscriptFiles falls back to single-file reads and drops pre-compaction history even when a later canonical main entry has the full checkpoint chain. This regression is triggered by legacy alias duplicates in sessions.json and directly undermines this fix’s intended behavior.

Useful? React with 👍 / 👎.

Comment on lines +197 to +199
const normalized = path.resolve(filePath);
if (seen.has(normalized) || !fs.existsSync(normalized)) {
return;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Constrain chained transcript paths to the session store root

pushFile trusts compactionCheckpoints[*].preCompaction.sessionFile and entry.sessionFile via path.resolve(...) and only checks existence before reading, which bypasses the containment/sanitization logic used by resolveSessionTranscriptCandidates. With stale or corrupted sessions.json metadata, readSessionMessages can ingest unrelated files outside the active sessions directory and merge them into chat history, exposing wrong-session content instead of safely skipping invalid paths.

Useful? React with 👍 / 👎.

@clawsweeper

clawsweeper Bot commented Apr 30, 2026

Copy link
Copy Markdown
Contributor

Thanks for the context here. I swept through the related work, and this is now duplicate or superseded.

Close as superseded: current main has a merged maintainer/member product path that preserves bounded active-transcript chat.history and exposes compaction recovery through the chat divider and Sessions checkpoint controls, while this branch still has correctness and privacy blockers for default checkpoint replay.

So I’m closing this here and keeping the remaining discussion on the canonical linked item.

Review details

Best possible solution:

Keep the shipped bounded active-history behavior and checkpoint recovery affordance; revisit full checkpoint replay only as a new opt-in API or recovery design with explicit reset, restore, branch, boundary, and containment rules.

Do we have a high-confidence way to reproduce the issue?

Yes, source-reproducible: current chat.history reads the current active transcript through readRecentSessionMessagesAsync, while compaction checkpoint metadata stores separate preCompaction.sessionFile references. I did not run a live long-session compaction in this read-only review.

Is this the best way to solve the issue?

No. This PR is not the best current solution because the merged #76437 path preserves bounded active history and points users to checkpoint controls, while this branch adds default replay without the needed privacy and correctness contract.

Security review:

Security review needs attention: The diff adds default transcript replay from stored checkpoint paths without containment, reset/restore epoch guards, or active-branch selection.

  • [medium] Uncontained checkpoint transcript reads — src/gateway/session-utils.fs.ts:196
    Stored checkpoint paths are normalized and read if they exist, bypassing the normal transcript-candidate containment rules and risking unrelated file ingestion from stale or corrupted metadata.
    Confidence: 0.9
  • [medium] Cleared history can reappear by default — src/gateway/session-utils.fs.ts:212
    Reset and restore preserve checkpoint metadata, so replaying every checkpoint snapshot by default can expose content the user cleared or explicitly restored away.
    Confidence: 0.88
  • [medium] Abandoned branch content can reappear — src/gateway/session-utils.fs.ts:287
    The raw JSONL loop bypasses active-branch filtering, which can expose stale sibling branch content that current readers suppress.
    Confidence: 0.86

What I checked:

  • Superseding merged PR: fix(ui): surface compaction checkpoints in chat history #76437 was merged with the explicit product choice to preserve bounded active-transcript chat.history and add a chat-level checkpoint recovery affordance instead of stitching checkpoint bytes into default history. (2810f1219a62)
  • Current docs define active-branch behavior: Current WebChat docs say chat.history follows the active transcript branch and that compaction entries render as dividers linking to Sessions checkpoint controls. Public docs: docs/web/webchat.md. (docs/web/webchat.md:27, 686b93e5c710)
  • Current chat.history path remains bounded async active transcript: Current chat.history calls readRecentSessionMessagesAsync for the current session entry, so the accepted runtime path remains the bounded active transcript rather than default checkpoint-chain concatenation. (src/gateway/server-methods/chat.ts:1752, 686b93e5c710)
  • PR head only wires sync reader: At PR head, the new chain resolver is called from sync readSessionMessages; the reported gateway RPC uses the async recent reader on current main. (src/gateway/session-utils.fs.ts:281, 711c0d1c2357)
  • PR head bypasses current active-branch parser: Current main routes sync full reads through readSelectedTranscriptRecords, but PR head replaces that path with a raw per-file JSONL loop over every segment line. (src/gateway/session-utils.fs.ts:287, 711c0d1c2357)
  • Reset/restore semantics conflict with default replay: Current reset preserves compactionCheckpoints, and checkpoint restore preserves them too, so unfiltered default replay can resurface cleared or discarded restore-era content. (src/gateway/session-reset-service.ts:722, 686b93e5c710)

Likely related people:

  • steipete: Recent history on the central transcript and chat-history files includes async transcript IO, active-branch history reads, bounded transcript reads, and compaction duplicate fixes. (role: recent gateway/session area contributor; confidence: high; commits: 6147e1b91d3e, 6dc8bd893572, 4d9c658f4058; files: src/gateway/session-utils.fs.ts, src/gateway/server-methods/chat.ts, src/gateway/session-compaction-checkpoints.ts)
  • vincentkoc: Recent commits touch bounded chat-history transcript reads, session-list transcript reads, oversized transcript preservation, and current blame in this checkout points the active reader lines through this area. (role: recent adjacent owner; confidence: medium; commits: aec83af23d49, 23e0be355ac3, d122a3492d17; files: src/gateway/session-utils.fs.ts, src/gateway/server-methods/chat.ts)
  • scoootscooob: GitHub path history identifies the compaction checkpoint feature commit that added the checkpoint metadata and recovery surface this PR tries to replay. (role: introduced compaction checkpoint feature; confidence: medium; commits: f4fcaa09a358; files: src/gateway/session-compaction-checkpoints.ts, src/gateway/server-methods/sessions.ts)
  • BunsDev: Authored the merged PR that made the current product choice to keep bounded active history and expose checkpoint recovery through the chat/Sessions UI. (role: superseding UX/product implementer; confidence: medium; commits: 2810f1219a62; files: docs/web/webchat.md, ui/src/ui/chat/build-chat-items.ts, ui/src/ui/views/chat.ts)

Codex review notes: model gpt-5.5, reasoning high; reviewed against 686b93e5c710.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

gateway Gateway runtime size: L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant