Skip to content

fix(gateway): async session transcript IO#75875

Merged
steipete merged 4 commits intomainfrom
fix/async-session-transcript-io
May 2, 2026
Merged

fix(gateway): async session transcript IO#75875
steipete merged 4 commits intomainfrom
fix/async-session-transcript-io

Conversation

@steipete
Copy link
Copy Markdown
Contributor

@steipete steipete commented May 2, 2026

No description provided.

@openclaw-barnacle openclaw-barnacle Bot added app: web-ui App: web-ui gateway Gateway runtime agents Agent runtime and tooling extensions: codex size: XL maintainer Maintainer-authored PR labels May 2, 2026
@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented May 2, 2026

Codex review: needs changes before merge.

What this changes:

The PR adds async/bounded session transcript readers and serialized transcript append helpers, rewires Gateway/session/TUI/agent callers to await them, adds transcript index/cache tests, and documents the Gateway/session performance fix in the changelog.

Required change before merge:

There is a narrow, actionable repair: add incomplete-chain fallback coverage for the new async bounded tree-tail reader, then run focused transcript/history tests. The protected label and failing CI still require maintainer review before merge.

Security review:

Security review cleared: The diff does not add dependencies, workflow execution, package resolution changes, secret handling, or new external code execution paths.

Review findings:

  • [P2] Fall back when the tail chain is incomplete — src/gateway/session-utils.fs.ts:395-396
Review details

Best possible solution:

Keep the async reader/append direction, but preserve SessionManager branch semantics for bounded tree reads by falling back to the async full index or expanding the tail when the active parent chain is incomplete; then rerun the focused Gateway/session transcript tests and resolve head CI before maintainer merge review.

Do we have a high-confidence way to reproduce the issue?

Yes. The underlying issue is reproducible from #75656 by creating large session JSONL histories and hitting chat.history, sessions.list, sessions.get, or session event paths on current main; the review finding is reproducible with a tree transcript whose latest parent-linked leaf is a non-message entry and whose parent is outside the bounded tail window.

Is this the best way to solve the issue?

No, not yet. The proposed async conversion is the right solution class, but the current bounded tree-tail implementation can drop valid history when the active branch cannot be reconstructed from the tail chunk alone.

Full review comments:

  • [P2] Fall back when the tail chain is incomplete — src/gateway/session-utils.fs.ts:395-396
    Tree transcripts can end with non-message entries such as session_info, labels, or model/thinking changes because SessionManager advances the leaf for every parent-linked entry. In the new bounded tail reader, if that leaf is present but its parent falls before the byte window, this break returns only the non-visible leaf, so readRecentSessionMessagesAsync() can return no history even though recent messages exist. Please fall back to the async full transcript index, or keep expanding the tail, when the active parent chain is incomplete or yields no visible messages.
    Confidence: 0.8

Overall correctness: patch is incorrect
Overall confidence: 0.78

Acceptance criteria:

  • pnpm test src/gateway/session-utils.fs.test.ts
  • pnpm test src/gateway/sessions-history-http.test.ts src/gateway/server-methods/sessions.send-followup-status.test.ts
  • pnpm exec oxfmt --check --threads=1 src/gateway/session-utils.fs.ts src/gateway/session-utils.fs.test.ts
  • pnpm check:changed

What I checked:

  • Live PR metadata: The PR is open, unmerged, based on main 5046cbc, has head aac97f9e156b437b01bbb80c6b5ca000844c3e17, and carries the protected maintainer label. (aac97f9e156b)
  • Related issue evidence: fix: synchronous session transcript reads block Gateway event loop (WS handshake timeouts, Telegram unresponsive) #75656 reports synchronous transcript reads blocking Gateway/WebSocket/Telegram paths on released 2026.4.29, with follow-up Linux/macOS confirmations and measured chat.history / sessions.list stalls.
  • Current-main target behavior: Current main still has the synchronous full transcript reader that this PR is replacing: readSessionMessages() calls fs.readFileSync(filePath, "utf-8").split(...) before parsing transcript lines. (src/gateway/session-utils.fs.ts:154, 5046cbc6f99a)
  • PR implementation surface: The PR adds readSessionTranscriptIndex(), async transcript message/count/title readers, and switches chat.history to readRecentSessionMessagesAsync(...). (src/gateway/server-methods/chat.ts:1679, aac97f9e156b)
  • Review finding evidence: The new bounded tree-tail reader stops when the active leaf's parent is outside the byte window, which can yield no visible messages for a valid tree transcript whose latest leaf is a non-message entry. (src/gateway/session-utils.fs.ts:395, aac97f9e156b)
  • Head checks: GitHub check runs for the PR head are failing; annotations include broad module-resolution failures, so merge readiness still needs CI triage after code review. (aac97f9e156b)

Likely related people:

  • vincentkoc: Recent current-main history shows multiple focused fixes in the exact transcript and session-list performance area, including bounded transcript hot paths and sessions-list usage caps. (role: recent Gateway/session hot-path maintainer; confidence: high; commits: ec59af33863a, ecf6cbf75d3d, aec83af23d49; files: src/gateway/session-utils.fs.ts, src/gateway/session-utils.ts, src/gateway/server-methods/sessions.ts)
  • amknight: Authored the recent merged async transcript I/O work that introduced listSessionsFromStoreAsync() and session compaction async copy behavior in the same performance cluster. (role: adjacent async transcript I/O contributor; confidence: medium; commits: 6f819280a302; files: src/gateway/session-utils.ts, src/gateway/server-methods/sessions.ts)
  • steipete: Beyond authoring this PR, prior merged history shows maintainer work in Gateway/session helpers and Codex app-server transcript boundaries touched by this patch. (role: recent maintainer and adjacent owner; confidence: high; commits: 47009dd71868, 079bf25fee2c, 662de55e0785; files: src/gateway/session-utils.fs.ts, src/gateway/session-utils.ts, extensions/codex/src/app-server/transcript-mirror.ts)

Remaining risk / open question:

  • The PR head currently has failing GitHub check runs, so the branch is not merge-ready even aside from the review finding.
  • This read-only pass did not execute targeted tests locally; the finding is source-level and should be covered by a focused regression test.
  • The patch intentionally reimplements some SessionManager tree behavior for async performance, so branch/leaf semantics need careful coverage before landing.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 5046cbc6f99a.

@steipete steipete force-pushed the fix/async-session-transcript-io branch from aac97f9 to ee50198 Compare May 2, 2026 00:56
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: aac97f9e15

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +486 to +492
const snapshot = readRecentSessionUsageFromTranscript(
entry.sessionId,
params.storePath,
entry.sessionFile,
agentId,
typeof params.maxTranscriptBytes === "number" ? params.maxTranscriptBytes : 256 * 1024,
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep uncapped usage fallback on full transcript scan

Switching the default path to readRecentSessionUsageFromTranscript(..., 256 * 1024) drops usage metadata whenever the newest assistant usage block falls outside the last 256 KiB (for example after very large recent user/tool content). Before this commit, the uncapped path read the full transcript and still surfaced totalTokens/estimatedCostUsd; now affected sessions return null usage and session rows/events lose these fields despite valid historical usage data.

Useful? React with 👍 / 👎.

@steipete steipete force-pushed the fix/async-session-transcript-io branch from aed2fda to 1feaafa Compare May 2, 2026 01:03
@steipete steipete merged commit 6147e1b into main May 2, 2026
97 of 101 checks passed
@steipete steipete deleted the fix/async-session-transcript-io branch May 2, 2026 01:06
lxe pushed a commit to lxe/openclaw that referenced this pull request May 6, 2026
* fix(gateway): async session transcript IO

* fix(plugins): restore jiti loader cache helper

* test(gateway): mock async artifact transcript reads

* chore(plugins): drop obsolete jiti loader shim
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
* fix(gateway): async session transcript IO

* fix(plugins): restore jiti loader cache helper

* test(gateway): mock async artifact transcript reads

* chore(plugins): drop obsolete jiti loader shim
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling app: web-ui App: web-ui extensions: codex gateway Gateway runtime maintainer Maintainer-authored PR size: XL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant