Skip to content

feat(session): add includeArchived to chat.history and sessions_history#73883

Open
CadanHu wants to merge 3 commits into
openclaw:mainfrom
CadanHu:feat/sessions-history-include-archived
Open

feat(session): add includeArchived to chat.history and sessions_history#73883
CadanHu wants to merge 3 commits into
openclaw:mainfrom
CadanHu:feat/sessions-history-include-archived

Conversation

@CadanHu

@CadanHu CadanHu commented Apr 29, 2026

Copy link
Copy Markdown

Summary

Implements the read-only path of #45003's 3-PR roadmap: give agents and chat.history callers a way to recover context that lives in .reset.<ts> archives, without resurrecting those archives as live sessions.

This PR closes the three concrete upstream gaps identified by clawsweeper / Codex review on issue #45003 (2026-04-28T22:20:54Z):

  • sessions_history has no archive selector
  • chat.history reads current active transcript only
  • Protocol schema has no archived-history parameter

Changes

Protocol (additive, backwards-compatible)

  • src/gateway/protocol/schema/logs-chat.tsChatHistoryParamsSchema gains an optional includeArchived: boolean. Default false preserves all current behavior.

Gateway helpers

  • src/gateway/session-transcript-files.fs.ts — new listResetArchivesForSession(sessionId, searchDirs) lists every <sessionId>.jsonl.reset.<ts> archive sorted oldest-first.
  • src/gateway/session-utils.fs.ts — new readSessionMessagesIncludingArchives(sessionId, storePath, sessionFile) chains archive segments before the primary transcript with synthetic system messages tagged __openclaw.kind: "session-reset" so callers can render dividers. Archive segments are parsed line-by-line; the primary segment delegates to the existing readSessionMessages so it still gets SessionManager active-branch treatment for tree-format transcripts.

Gateway handler

  • src/gateway/server-methods/chat.tschat.history reads includeArchived from params and dispatches to the new aggregator only when explicitly true.

Agent tool

  • src/agents/tools/sessions-history-tool.tsSessionsHistoryToolSchema gains optional includeArchived. The tool forwards the flag verbatim to the chat.history call.

Tests

  • src/gateway/session-utils.fs.test.ts — 6 new tests:
    • default behavior unchanged when no archives exist
    • archive content surfaced when primary missing
    • multiple archives chained chronologically before primary, each with a session-reset boundary marker
    • empty result when neither primary nor any archive exists
    • other sessions' archives are not pulled in
    • empty sessionId short-circuits
  • src/agents/tools/sessions-history-tool.test.ts — 2 new tests verifying the param is plumbed through from tool → chat.history.

Design notes

This is the read-only path of the #45003 roadmap. It does not rename, move, or otherwise mutate archive files: archives stay archived, the user opts into visibility for one query at a time. That keeps it consistent with the existing "archive lifecycle is one-way" precedent (sessions.delete has no undo) and avoids the compaction-chain checkpoint and session-memory-hook double-summary risks that come with any unarchive/restore design.

includeArchived is also a user-explicit opt-in (not an automatic fallback), so it is consistent with /reset semantics: the user is asking for the recovery view, not having archived content silently re-injected into automated paths.

Verification

  • Scoped tests: pnpm test src/gateway/session-utils.fs.test.ts — 60/60 pass (54 originals + 6 new).
  • Type-check: pnpm tsgo:core introduces 0 new errors on the touched files (verified by stash-toggle baseline comparison).
  • Manual: production gateway built from this branch and started on :18789 against the local production data dir; chat.history RPC with includeArchived: true returns chained archive segments with session-reset boundary markers, while the default (false / omitted) returns the same set as before.

Related work

Implements PR 3 of the issue openclaw#45003 archive UX roadmap (read-only
path): give agents and the chat.history endpoint a way to recover
context that lives in .reset.<ts> archives, without resurrecting
those archives as live sessions.

Changes
-------

Protocol (additive, backwards-compatible):
- src/gateway/protocol/schema/logs-chat.ts:
  ChatHistoryParamsSchema gains an optional includeArchived: boolean.
  Default false preserves all current behavior.

Gateway helpers:
- src/gateway/session-transcript-files.fs.ts:
  New listResetArchivesForSession(sessionId, searchDirs) lists every
  <sessionId>.jsonl.reset.<ts> archive sorted oldest-first.
- src/gateway/session-utils.fs.ts:
  New readSessionMessagesIncludingArchives(sessionId, storePath, sessionFile)
  chains archive segments before the primary transcript with
  synthetic system messages tagged __openclaw.kind: "session-reset"
  so the Web UI / agent can render boundaries. Archive segments are
  parsed line-by-line; the primary segment delegates to the existing
  readSessionMessages so it still gets SessionManager active-branch
  treatment for tree-format transcripts.

Gateway handler:
- src/gateway/server-methods/chat.ts:
  chat.history reads includeArchived from params and dispatches to
  the new aggregator only when explicitly true.

Agent tool:
- src/agents/tools/sessions-history-tool.ts:
  SessionsHistoryToolSchema gains optional includeArchived. The tool
  forwards the flag verbatim to the chat.history call.

Tests
-----

src/gateway/session-utils.fs.test.ts (6 new tests):
- Default behavior unchanged when no archives exist
- Archive content surfaced when primary missing
- Multiple archives chained chronologically before primary, each with
  a session-reset boundary marker
- Empty result when neither primary nor any archive exists
- Other sessions' archives are not pulled in
- Empty sessionId short-circuits

src/agents/tools/sessions-history-tool.test.ts (2 new tests):
- includeArchived=true forwards through to chat.history
- includeArchived omitted forwards as false

Design notes
------------

This is the read-only path of the openclaw#45003 roadmap. It does not rename,
move, or otherwise mutate archive files: archives stay archived, the
user opts into visibility for one query at a time. That keeps it
consistent with existing "archive lifecycle is one-way" precedent
(sessions.delete has no undo) and removes the compaction-chain
checkpoint and session-memory-hook double-summary risks that come
with any unarchive/restore design.

Refs
----

Issue: openclaw#45003
Companion PRs in the same area:
- openclaw#60409 (chat.history fallback to most recent reset archive when
  primary is missing) — different code path, no overlap
- openclaw#71537 (session-memory hook + session-logs skill archive support) —
  different code path, no overlap

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@openclaw-barnacle openclaw-barnacle Bot added app: web-ui App: web-ui gateway Gateway runtime agents Agent runtime and tooling size: M labels Apr 29, 2026
@greptile-apps

greptile-apps Bot commented Apr 29, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

Adds optional includeArchived: boolean to chat.history and sessions_history so callers can opt into a chained view of all .reset.<ts> archive segments, with synthetic session-reset boundary markers between segments. The change is fully additive and backwards-compatible: the default false preserves existing behaviour. Implementation is clean and well-tested (6 new unit tests for the core aggregator, 2 for tool plumbing).

Confidence Score: 4/5

Safe to merge; the one flagged issue is a non-blocking P2 about seq uniqueness that won't affect existing callers.

No P0 or P1 issues found. The single P2 concern (non-unique __openclaw.seq across chained archive segments) is speculative and only matters if a consumer treats seq as a globally unique identifier, which is a new use-case introduced by this feature.

src/gateway/session-utils.fs.ts — readArchivedTranscriptMessages local seq counter

Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/gateway/session-utils.fs.ts
Line: 213-257

Comment:
**`seq` resets to 0 for every archive segment**

`messageSeq` is declared locally inside `readArchivedTranscriptMessages` and restarts at 1 for each archive file. When `readSessionMessagesIncludingArchives` chains N archives plus the primary, the combined result contains duplicate `__openclaw.seq` values — e.g., the first message from archive-1 and the first message from archive-2 both carry `seq: 1`. If any consumer (Web UI renderer, deduplication logic, React key prop) relies on `seq` being unique across the combined list, this will produce silent bugs or duplicate-key warnings.

Consider threading a shared offset into the helper so the combined output has monotonically increasing `seq` across all segments:

```typescript
function readArchivedTranscriptMessages(filePath: string, seqOffset = 0): { messages: unknown[]; nextSeq: number } {
  // ...
  let messageSeq = seqOffset;
  // ...
  return { messages, nextSeq: messageSeq };
}
```

And in `readSessionMessagesIncludingArchives`:
```typescript
let seqOffset = 0;
for (const archive of archives) {
  const { messages, nextSeq } = readArchivedTranscriptMessages(archive.path, seqOffset);
  combined.push(...messages);
  seqOffset = nextSeq;
  // push boundary marker ...
}
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "feat(session): add includeArchived to ch..." | Re-trigger Greptile

Comment on lines +213 to +257
function readArchivedTranscriptMessages(filePath: string): unknown[] {
let lines: string[];
try {
lines = fs.readFileSync(filePath, "utf-8").split(/\r?\n/);
} catch {
return [];
}
const messages: unknown[] = [];
let messageSeq = 0;
for (const line of lines) {
if (!line.trim()) {
continue;
}
try {
const parsed = JSON.parse(line);
if (parsed?.message) {
messageSeq += 1;
messages.push(
attachOpenClawTranscriptMeta(parsed.message, {
...(typeof parsed.id === "string" ? { id: parsed.id } : {}),
seq: messageSeq,
}),
);
continue;
}
if (parsed?.type === "compaction") {
const ts = typeof parsed.timestamp === "string" ? Date.parse(parsed.timestamp) : Number.NaN;
const timestamp = Number.isFinite(ts) ? ts : Date.now();
messageSeq += 1;
messages.push({
role: "system",
content: [{ type: "text", text: "Compaction" }],
timestamp,
__openclaw: {
kind: "compaction",
id: typeof parsed.id === "string" ? parsed.id : undefined,
seq: messageSeq,
},
});
}
} catch {
// ignore malformed lines
}
}
return messages;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 seq resets to 0 for every archive segment

messageSeq is declared locally inside readArchivedTranscriptMessages and restarts at 1 for each archive file. When readSessionMessagesIncludingArchives chains N archives plus the primary, the combined result contains duplicate __openclaw.seq values — e.g., the first message from archive-1 and the first message from archive-2 both carry seq: 1. If any consumer (Web UI renderer, deduplication logic, React key prop) relies on seq being unique across the combined list, this will produce silent bugs or duplicate-key warnings.

Consider threading a shared offset into the helper so the combined output has monotonically increasing seq across all segments:

function readArchivedTranscriptMessages(filePath: string, seqOffset = 0): { messages: unknown[]; nextSeq: number } {
  // ...
  let messageSeq = seqOffset;
  // ...
  return { messages, nextSeq: messageSeq };
}

And in readSessionMessagesIncludingArchives:

let seqOffset = 0;
for (const archive of archives) {
  const { messages, nextSeq } = readArchivedTranscriptMessages(archive.path, seqOffset);
  combined.push(...messages);
  seqOffset = nextSeq;
  // push boundary marker ...
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/gateway/session-utils.fs.ts
Line: 213-257

Comment:
**`seq` resets to 0 for every archive segment**

`messageSeq` is declared locally inside `readArchivedTranscriptMessages` and restarts at 1 for each archive file. When `readSessionMessagesIncludingArchives` chains N archives plus the primary, the combined result contains duplicate `__openclaw.seq` values — e.g., the first message from archive-1 and the first message from archive-2 both carry `seq: 1`. If any consumer (Web UI renderer, deduplication logic, React key prop) relies on `seq` being unique across the combined list, this will produce silent bugs or duplicate-key warnings.

Consider threading a shared offset into the helper so the combined output has monotonically increasing `seq` across all segments:

```typescript
function readArchivedTranscriptMessages(filePath: string, seqOffset = 0): { messages: unknown[]; nextSeq: number } {
  // ...
  let messageSeq = seqOffset;
  // ...
  return { messages, nextSeq: messageSeq };
}
```

And in `readSessionMessagesIncludingArchives`:
```typescript
let seqOffset = 0;
for (const archive of archives) {
  const { messages, nextSeq } = readArchivedTranscriptMessages(archive.path, seqOffset);
  combined.push(...messages);
  seqOffset = nextSeq;
  // push boundary marker ...
}
```

How can I resolve this? If you propose a fix, please make it concise.

@clawsweeper

clawsweeper Bot commented Apr 29, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge. Reviewed May 30, 2026, 12:55 AM ET / 04:55 UTC.

Summary
Review failed before ClawSweeper could summarize the requested change.

PR surface: Source +168, Tests +183, Other +8. Total +359 across 10 files.

Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path.

Review metrics: none identified.

Merge readiness
Overall: 🌊 off-meta tidepool
Proof: 🌊 off-meta tidepool
Patch quality: 🌊 off-meta tidepool
Result: rating does not apply to this item.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Risk before merge

  • [P1] No close action taken because the review did not complete.

Maintainer options:

  1. Decide the mitigation before merge
    Retry the Codex review after fixing the execution failure.
  2. Pause or close
    Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

  • [P1] Review did not complete, so no work-lane recommendation was made.
Review details

Best possible solution:

Retry the Codex review after fixing the execution failure.

Do we have a high-confidence way to reproduce the issue?

Unclear. The review failed before ClawSweeper could establish a reproduction path.

Is this the best way to solve the issue?

Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction.

AGENTS.md: unclear because the file could not be read completely.

Codex review notes: model gpt-5.5, reasoning high; reviewed against b9933b2ec119.

Label changes

Label justifications:

  • rating: 🌊 off-meta tidepool: Overall readiness is 🌊 off-meta tidepool; proof is 🌊 off-meta tidepool and patch quality is 🌊 off-meta tidepool.
Evidence reviewed

PR surface:

Source +168, Tests +183, Other +8. Total +359 across 10 files.

View PR surface stats
Area Files Added Removed Net
Source 6 171 3 +168
Tests 2 183 0 +183
Docs 0 0 0 0
Config 0 0 0 0
Generated 0 0 0 0
Other 2 10 2 +8
Total 10 364 5 +359

What I checked:

  • failure reason: codex execution failed.
  • codex failure detail: Codex review failed for this PR with exit 1.

Likely related people:

  • unknown: Codex failed before it could trace repository history. (role: review did not complete; confidence: low)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

…udeArchived

Updates the auto-generated GatewayModels.swift in both
apps/macos and apps/shared/OpenClawKit to reflect the new
optional includeArchived field added to ChatHistoryParamsSchema.

Run: pnpm protocol:gen && pnpm protocol:gen:swift

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@openclaw-barnacle openclaw-barnacle Bot added the app: macos App: macos label Apr 29, 2026
The previous run failed in command-execution-startup.test.ts:75
(banner-emit assertion) which passes locally 3/3 — flaky CI run.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
masatohoshino added a commit to masatohoshino/openclaw that referenced this pull request May 2, 2026
…reads when active is missing

When the active session transcript file is missing, async read paths used by
chat.history (`readSessionMessagesAsync`, `readRecentSessionMessagesAsync`)
and the session title/preview lookup
(`readSessionTitleFieldsFromTranscriptAsync`) returned an empty result instead
of any archived content. Reset rotates the active `<id>.jsonl` to a sibling
`<id>.jsonl.reset.<UTC timestamp>` and creates a new empty active file, so
sessions reset before the next read became invisible through the production
async API even when the archive still existed on disk.

This change adds a narrow auto-fallback on the async paths only:

- New `findLatestResetArchiveAsync(candidatePaths)` helper in
  `session-transcript-files.fs.ts` does an async `readdir` of each candidate
  parent directory, validates archive entries via the existing
  `parseSessionArchiveTimestamp(entry, "reset")` primitive, and returns only
  the path with the newest archive timestamp.
- A small internal helper `findActiveOrLatestResetArchiveAsync` in
  `session-utils.fs.ts` reuses the existing active-candidate `existsSync`
  check first; only when no active candidate exists does it consult the new
  helper.
- The three async functions above are re-routed through this helper.

Behavior boundaries (kept narrow on purpose):

- Active wins when present (no active+archive merging).
- Only the single newest archive is returned (no chain aggregation, no
  `previousFragments`, no rolled logical-history reconstruction).
- Sync paths, `findExistingTranscriptPath`, the protocol schema, the
  `includeArchived` opt-in, sessions_history schema, the session-memory hook,
  and the session-logs skill are untouched.

Refs openclaw#56131 openclaw#43929 openclaw#45003 openclaw#60409 openclaw#73883
masatohoshino added a commit to masatohoshino/openclaw that referenced this pull request May 2, 2026
…hive fallback as readers

Codex/clawsweeper review on PR openclaw#76134 found that
`readRecentSessionMessagesWithStatsAsync` still derived
`totalMessages` from `readSessionMessageCountAsync`, which kept the
active-only `findExistingTranscriptPath` resolution. After the new
async reader fallback, an archive-only transcript longer than the
bounded tail window therefore returned the newest messages with low
`seq` values and `hasMore = false`. The sessions-history HTTP first
page passes `boundedSnapshot.totalMessages` into
`buildSessionHistorySnapshot`, so a `nextCursor` follow-up could skip
the middle archive page.

This commit routes the async count and visit helpers
(`readSessionMessageCountAsync` and `visitSessionMessagesAsync`)
through the same `findActiveOrLatestResetArchiveAsync` resolution that
the message readers use, so an archive-only transcript reports a
consistent total, gives messages the right tail-end `seq` values, and
exposes the archive's earlier pages via `nextCursor`.

Behavior boundaries (kept narrow, same as the original PR):

- Active wins when present (no active+archive merging).
- Only the single newest archive is returned (no chain aggregation).
- Sync paths, `findExistingTranscriptPath`, the protocol schema, the
  `includeArchived` opt-in, sessions_history schema, the session-memory
  hook, and the session-logs skill are still untouched.

Tests: 4 new cases in `session-utils.fs.test.ts` cover the count and
stats helpers (active-only, archive-only with archive longer than the
tail window, both-exist active-priority, both-missing) and assert the
returned `seq` metadata matches the archive's tail position. One new
case in `sessions-history-http.test.ts` exercises the archive-only
HTTP pagination path: archive a 3-message transcript, request the
first `limit=2` page, follow `nextCursor`, and assert the older
archive message is visible without skipping middle pages.

Refs openclaw#56131 openclaw#43929 openclaw#45003 openclaw#60409 openclaw#73883
masatohoshino added a commit to masatohoshino/openclaw that referenced this pull request May 17, 2026
…reads when active is missing

When the active session transcript file is missing, async read paths used by
chat.history (`readSessionMessagesAsync`, `readRecentSessionMessagesAsync`)
and the session title/preview lookup
(`readSessionTitleFieldsFromTranscriptAsync`) returned an empty result instead
of any archived content. Reset rotates the active `<id>.jsonl` to a sibling
`<id>.jsonl.reset.<UTC timestamp>` and creates a new empty active file, so
sessions reset before the next read became invisible through the production
async API even when the archive still existed on disk.

This change adds a narrow auto-fallback on the async paths only:

- New `findLatestResetArchiveAsync(candidatePaths)` helper in
  `session-transcript-files.fs.ts` does an async `readdir` of each candidate
  parent directory, validates archive entries via the existing
  `parseSessionArchiveTimestamp(entry, "reset")` primitive, and returns only
  the path with the newest archive timestamp.
- A small internal helper `findActiveOrLatestResetArchiveAsync` in
  `session-utils.fs.ts` reuses the existing active-candidate `existsSync`
  check first; only when no active candidate exists does it consult the new
  helper.
- The three async functions above are re-routed through this helper.

Behavior boundaries (kept narrow on purpose):

- Active wins when present (no active+archive merging).
- Only the single newest archive is returned (no chain aggregation, no
  `previousFragments`, no rolled logical-history reconstruction).
- Sync paths, `findExistingTranscriptPath`, the protocol schema, the
  `includeArchived` opt-in, sessions_history schema, the session-memory hook,
  and the session-logs skill are untouched.

Refs openclaw#56131 openclaw#43929 openclaw#45003 openclaw#60409 openclaw#73883
masatohoshino added a commit to masatohoshino/openclaw that referenced this pull request May 17, 2026
…hive fallback as readers

Codex/clawsweeper review on PR openclaw#76134 found that
`readRecentSessionMessagesWithStatsAsync` still derived
`totalMessages` from `readSessionMessageCountAsync`, which kept the
active-only `findExistingTranscriptPath` resolution. After the new
async reader fallback, an archive-only transcript longer than the
bounded tail window therefore returned the newest messages with low
`seq` values and `hasMore = false`. The sessions-history HTTP first
page passes `boundedSnapshot.totalMessages` into
`buildSessionHistorySnapshot`, so a `nextCursor` follow-up could skip
the middle archive page.

This commit routes the async count and visit helpers
(`readSessionMessageCountAsync` and `visitSessionMessagesAsync`)
through the same `findActiveOrLatestResetArchiveAsync` resolution that
the message readers use, so an archive-only transcript reports a
consistent total, gives messages the right tail-end `seq` values, and
exposes the archive's earlier pages via `nextCursor`.

Behavior boundaries (kept narrow, same as the original PR):

- Active wins when present (no active+archive merging).
- Only the single newest archive is returned (no chain aggregation).
- Sync paths, `findExistingTranscriptPath`, the protocol schema, the
  `includeArchived` opt-in, sessions_history schema, the session-memory
  hook, and the session-logs skill are still untouched.

Tests: 4 new cases in `session-utils.fs.test.ts` cover the count and
stats helpers (active-only, archive-only with archive longer than the
tail window, both-exist active-priority, both-missing) and assert the
returned `seq` metadata matches the archive's tail position. One new
case in `sessions-history-http.test.ts` exercises the archive-only
HTTP pagination path: archive a 3-message transcript, request the
first `limit=2` page, follow `nextCursor`, and assert the older
archive message is visible without skipping middle pages.

Refs openclaw#56131 openclaw#43929 openclaw#45003 openclaw#60409 openclaw#73883
masatohoshino added a commit to masatohoshino/openclaw that referenced this pull request May 17, 2026
…reads when active is missing

When the active session transcript file is missing, async read paths used by
chat.history (`readSessionMessagesAsync`, `readRecentSessionMessagesAsync`)
and the session title/preview lookup
(`readSessionTitleFieldsFromTranscriptAsync`) returned an empty result instead
of any archived content. Reset rotates the active `<id>.jsonl` to a sibling
`<id>.jsonl.reset.<UTC timestamp>` and creates a new empty active file, so
sessions reset before the next read became invisible through the production
async API even when the archive still existed on disk.

This change adds a narrow auto-fallback on the async paths only:

- New `findLatestResetArchiveAsync(candidatePaths)` helper in
  `session-transcript-files.fs.ts` does an async `readdir` of each candidate
  parent directory, validates archive entries via the existing
  `parseSessionArchiveTimestamp(entry, "reset")` primitive, and returns only
  the path with the newest archive timestamp.
- A small internal helper `findActiveOrLatestResetArchiveAsync` in
  `session-utils.fs.ts` reuses the existing active-candidate `existsSync`
  check first; only when no active candidate exists does it consult the new
  helper.
- The three async functions above are re-routed through this helper.

Behavior boundaries (kept narrow on purpose):

- Active wins when present (no active+archive merging).
- Only the single newest archive is returned (no chain aggregation, no
  `previousFragments`, no rolled logical-history reconstruction).
- Sync paths, `findExistingTranscriptPath`, the protocol schema, the
  `includeArchived` opt-in, sessions_history schema, the session-memory hook,
  and the session-logs skill are untouched.

Refs openclaw#56131 openclaw#43929 openclaw#45003 openclaw#60409 openclaw#73883
masatohoshino added a commit to masatohoshino/openclaw that referenced this pull request May 17, 2026
…hive fallback as readers

Codex/clawsweeper review on PR openclaw#76134 found that
`readRecentSessionMessagesWithStatsAsync` still derived
`totalMessages` from `readSessionMessageCountAsync`, which kept the
active-only `findExistingTranscriptPath` resolution. After the new
async reader fallback, an archive-only transcript longer than the
bounded tail window therefore returned the newest messages with low
`seq` values and `hasMore = false`. The sessions-history HTTP first
page passes `boundedSnapshot.totalMessages` into
`buildSessionHistorySnapshot`, so a `nextCursor` follow-up could skip
the middle archive page.

This commit routes the async count and visit helpers
(`readSessionMessageCountAsync` and `visitSessionMessagesAsync`)
through the same `findActiveOrLatestResetArchiveAsync` resolution that
the message readers use, so an archive-only transcript reports a
consistent total, gives messages the right tail-end `seq` values, and
exposes the archive's earlier pages via `nextCursor`.

Behavior boundaries (kept narrow, same as the original PR):

- Active wins when present (no active+archive merging).
- Only the single newest archive is returned (no chain aggregation).
- Sync paths, `findExistingTranscriptPath`, the protocol schema, the
  `includeArchived` opt-in, sessions_history schema, the session-memory
  hook, and the session-logs skill are still untouched.

Tests: 4 new cases in `session-utils.fs.test.ts` cover the count and
stats helpers (active-only, archive-only with archive longer than the
tail window, both-exist active-priority, both-missing) and assert the
returned `seq` metadata matches the archive's tail position. One new
case in `sessions-history-http.test.ts` exercises the archive-only
HTTP pagination path: archive a 3-message transcript, request the
first `limit=2` page, follow `nextCursor`, and assert the older
archive message is visible without skipping middle pages.

Refs openclaw#56131 openclaw#43929 openclaw#45003 openclaw#60409 openclaw#73883
@openclaw-barnacle

Copy link
Copy Markdown

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling app: macos App: macos app: web-ui App: web-ui gateway Gateway runtime rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. size: M triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant