fix(memory): yield during session indexing by bitloi · Pull Request #76978 · openclaw/openclaw

bitloi · 2026-05-03T22:36:16Z

Summary

Problem: memory-core session transcript indexing can run during startup/session-start and starve the gateway event loop for large session corpora.
Why it matters: gateway WebSocket requests can time out while the process is alive but busy indexing sessions.
What changed: session transcript sync now yields to the event loop between indexing batches and stale-row pruning batches.
What did NOT change (scope boundary): this does not change memory provider selection, session source semantics, config behavior, or move indexing to a worker thread.

Change Type (select all)

Scope (select all touched areas)

Linked Issue/PR

Closes memory-core: synchronous indexing on session start blocks gateway event loop for 30+ seconds #76890
This PR fixes a bug or regression

Real behavior proof (required for external PRs)

External contributors must show after-fix evidence from a real OpenClaw setup. Unit tests, mocks, lint, typechecks, snapshots, and CI are supplemental only. Screenshots are encouraged even for CLI, console, text, or log changes; terminal screenshots and copied live output count.

Behavior or issue addressed: memory-core session sync yields back to the Node event loop during large session transcript indexing batches instead of monopolizing startup/session-start work until the full corpus finishes.
Real environment tested: local OpenClaw checkout at PR head, Linux container, Node v24.13.0, memory-core session sync path.
Exact steps or command run after this patch:
1. Rebased the branch onto current upstream/main so the PR includes the current test-type fixes.
2. Ran the focused memory-core session-yield verification command against the real PR checkout.
3. Ran the failing check:test-types shard locally.
Evidence after fix: copied terminal output from the local PR checkout:

$ PATH=/root/.nvm/versions/node/v24.13.0/bin:$PATH corepack pnpm test extensions/memory-core/src/memory/manager-sync-yield.test.ts

> openclaw@2026.5.5 test /root/74/openclaw
> node scripts/test-projects.mjs extensions/memory-core/src/memory/manager-sync-yield.test.ts

[test] starting test/vitest/vitest.extension-memory.config.ts
[test] passed 1 Vitest shard in 58.83s

$ PATH=/root/.nvm/versions/node/v24.13.0/bin:$PATH corepack pnpm check:test-types

> openclaw@2026.5.5 check:test-types /root/74/openclaw
> pnpm tsgo:test

> openclaw@2026.5.5 tsgo:core:test /root/74/openclaw
> node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.core.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core-test.tsbuildinfo

> openclaw@2026.5.5 tsgo:extensions:test /root/74/openclaw
> node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.extensions.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/extensions-test.tsbuildinfo

Observed result after fix: the session-yield verification completed successfully on the PR checkout, and the previously failing check:test-types command completed without the model-fallback.test.ts env type error after rebasing onto current upstream/main.
What was not tested: live gateway startup with a real multi-thousand-session corpus and end-to-end WebSocket timeout reproduction.
Before evidence (optional but encouraged): before this fix, the new regression observed the 11th session file indexing before the scheduled setImmediate ran, failing with expected [ false ] to deeply equal [ true ].

Root Cause (if applicable)

Root cause: memory-core session sync processed session transcript batches without a macrotask yield, so large startup/full-reindex work could monopolize the gateway event loop.
Missing detection / guardrail: there was no regression test proving session sync yields during large transcript batches.
Contributing context (if known): sessions.list already has a similar batching/yield pattern, but memory-core session indexing did not.

Regression Test Plan (if applicable)

Coverage level that should have caught this:
- Unit test
- Seam / integration test
- End-to-end test
- Existing coverage already sufficient
Target test or file: extensions/memory-core/src/memory/manager-sync-yield.test.ts
Scenario the test should lock in: a scheduled setImmediate runs before later session files continue indexing.
Why this is the smallest reliable guardrail: it exercises the real syncSessionFiles() loop while mocking transcript parsing and providers out of scope.
Existing test that already covers this (if any): none.
If no new test is added, why not: N/A.

User-visible / Behavior Changes

Large session corpora should no longer make gateway startup/session-start reindexing starve WebSocket handling for long stretches.

Diagram (if applicable)

N/A

Security Impact (required)

New permissions/capabilities? (Yes/No) No
Secrets/tokens handling changed? (Yes/No) No
New/changed network calls? (Yes/No) No
Command/tool execution surface changed? (Yes/No) No
Data access scope changed? (Yes/No) No
If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

OS: Linux container
Runtime/container: Node v24.13.0 for verification
Model/provider: N/A
Integration/channel (if any): memory-core session indexing
Relevant config (redacted): issue repro config with memory-core sessions source enabled

Steps

Trigger memory-core session sync with more than one session-indexing batch.
Schedule a setImmediate before sync runs.
Verify the event loop gets a chance to run before later files continue indexing.

Expected

Session sync yields between batches so gateway work is not starved indefinitely.

Actual

After this patch, the scheduled immediate runs before the 11th session file is indexed.

Evidence

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)

Before the fix, the new regression failed with expected [ false ] to deeply equal [ true ]. After the fix, it passes.

Human Verification (required)

What you personally verified (not just CI), and how:

Verified scenarios:
- Targeted memory-core regression test.
- Failing check:test-types shard after rebasing onto current upstream/main.
- Changelog attribution check.
Edge cases checked:
- targeted session files
- skipped/unchanged session files
- missing session entries
- stale session-row pruning
What you did not verify:
- live gateway/TUI reproduction with a real 30+ second corpus.

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

Backward compatible? (Yes/No) Yes
Config/env changes? (Yes/No) No
Migration needed? (Yes/No) No
If yes, exact upgrade steps: N/A

Risks and Mitigations

Risk: very small scheduling overhead during large session syncs.
- Mitigation: yielding is batched every 10 processed session items and preserves existing indexing order, source behavior, and error propagation.

clawsweeper · 2026-05-03T22:39:29Z

Codex review: needs real behavior proof before merge.

Summary
The PR adds event-loop yields while memory-core indexes session transcript files, prunes stale session rows, and parses large transcript files, with focused tests and a changelog entry.

Reproducibility: yes. at source level: current main has a clear session-start path into session transcript indexing without macrotask yields, and #76890 supplies matching event-loop-delay logs. I did not run a live gateway/TUI reproduction in this read-only review.

Real behavior proof
Needs real behavior proof before merge: The PR body provides copied terminal output for mocked Vitest checks and typecheck, but no live gateway/TUI or real large-corpus after-fix proof. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, ask a maintainer to comment @clawsweeper re-review.

Next step before merge
Needs contributor-supplied real runtime proof or a maintainer override; there is no narrow code defect for a ClawSweeper repair PR to fix.

Security
Cleared: The diff only changes memory scheduling/parsing, focused tests, and changelog text; it adds no dependency, workflow, permission, secret, network, or command-execution surface.

Review details

Best possible solution:

Land the focused memory-core yield fix after redacted real runtime proof or an explicit maintainer override, keeping the scope to responsiveness in session indexing and transcript parsing.

Do we have a high-confidence way to reproduce the issue?

Yes at source level: current main has a clear session-start path into session transcript indexing without macrotask yields, and #76890 supplies matching event-loop-delay logs. I did not run a live gateway/TUI reproduction in this read-only review.

Is this the best way to solve the issue?

Yes for the code direction: batched setImmediate yields match the existing sessions.list responsiveness pattern, and the maintainer head also yields inside single-file transcript parsing. The merge path still needs real runtime proof or a maintainer override.

Acceptance criteria:

pnpm test extensions/memory-core/src/memory/manager-sync-yield.test.ts packages/memory-host-sdk/src/host/session-files-yield.test.ts
pnpm test extensions/memory-core
pnpm check:changed
Live gateway/TUI proof with a real large session corpus and private details redacted

What I checked:

Current main session-start path: warmSession() still fire-and-forgets sync({ reason: "session-start" }) when on-session-start sync is enabled, matching the linked gateway startup/session-start report path. (extensions/memory-core/src/memory/manager.ts:288, b2bc1e9a5661)
Current main lacks session-sync yield points: syncSessionFiles maps transcript files through runWithConcurrency and then loops stale rows without a macrotask yield on current main. (extensions/memory-core/src/memory/manager-sync-ops.ts:873, b2bc1e9a5661)
Current main parser is synchronous after file read: buildSessionEntry() reads the transcript into memory, splits the whole raw string, and parses the line loop without yielding on current main. (packages/memory-host-sdk/src/host/session-files.ts:545, b2bc1e9a5661)
Existing responsiveness precedent: sessions.list already yields with setImmediate between batches to avoid starving WebSocket heartbeats, channel I/O, and concurrent RPC calls. (src/gateway/session-utils.ts:2232, b2bc1e9a5661)
Docs establish the bug boundary: The memory config docs say session indexing is opt-in and asynchronous, so a synchronous event-loop stall is broken existing behavior rather than a new capability request. Public docs: docs/reference/memory-config.md. (docs/reference/memory-config.md:449, b2bc1e9a5661)
PR adds the intended sync yields: The PR diff adds SESSION_SYNC_YIELD_EVERY, createSessionSyncYield, and finally-based awaits for session-file processing and stale-row pruning. (extensions/memory-core/src/memory/manager-sync-ops.ts:141, 58d60ba3db16)

Likely related people:

steipete: GitHub path history shows repeated recent work on memory-core sync and memory-host transcript code, and the current PR head includes their parser-yield commit. (role: recent area contributor; confidence: high; commits: 53a97fe0a76a, a38c2c233a44, 983fd775e2ca; files: extensions/memory-core/src/memory/manager-sync-ops.ts, extensions/memory-core/src/memory/manager.ts, packages/memory-host-sdk/src/host/session-files.ts)
buyitsydney: Recent commits changed archived session transcript visibility in both the memory sync path and memory-host session-file parser. (role: adjacent session transcript contributor; confidence: medium; commits: aba97a4c7cff, 2ffdb5d248a1; files: extensions/memory-core/src/memory/manager-sync-ops.ts, packages/memory-host-sdk/src/host/session-files.ts)
zqchris: Recent session-corpus parsing work touched the parser that builds transcript entries consumed by this indexing path. (role: adjacent session corpus contributor; confidence: medium; commits: 82e349a48ad9; files: packages/memory-host-sdk/src/host/session-files.ts)
Peetiegonzalez: Authored the recent sessions.list event-loop-blocking reduction that this PR mirrors as a scheduling precedent. (role: adjacent gateway responsiveness contributor; confidence: low; commits: 7fe4ba013ff0; files: src/gateway/session-utils.ts)

Remaining risk / open question:

The supplied proof does not exercise a live gateway/TUI or real large session corpus, so the end-to-end timeout improvement remains unproven.

Codex review notes: model gpt-5.5, reasoning high; reviewed against b2bc1e9a5661.

bitloi · 2026-05-03T22:51:58Z

Codex review: needs maintainer review before merge.

Summary The PR adds batched setImmediate yields to memory-core session transcript indexing and stale session-row pruning, with a new Vitest regression and changelog entry for #76890.

Reproducibility: no. high-confidence live reproduction was run. Source inspection shows current main processes full session-indexing batches and stale-row pruning without a macrotask yield, and the PR’s unit scenario captures the scheduling gap at the source level.

Next step before merge No repair lane is needed because the PR is mergeable with no actionable review finding; remaining work is final CI completion and normal maintainer merge review.

Security Cleared: The diff only changes memory sync scheduling, a focused test, and changelog text; it adds no dependencies, permissions, network calls, secret handling, or command execution surface.

Review details

Thanks. This is addressed on the current head c1168f9a3b.

GitHub now reports the PR as mergeable: MERGEABLE with mergeStateStatus: CLEAN.
I verified a clean local merge against latest upstream/main; CHANGELOG.md auto-merges cleanly.
Fixed the extension lint failure from the empty undici mock classes.
Ran the requested checks:
- pnpm test extensions/memory-core/src/memory/manager-sync-yield.test.ts extensions/memory-core/src/memory/manager.session-reindex.test.ts extensions/memory-core/src/memory/manager-targeted-sync.test.ts → PASS
- pnpm test extensions/memory-core → PASS
- pnpm check:changed → PASS
- pnpm lint → PASS

The earlier dirty-branch finding was for the old head 50971fb52d; the current branch is clean and validated.

steipete · 2026-05-10T06:41:35Z

Landed via rebase merge onto main.

Source head: 58d60ba
Landed head: 482af6d
Gate: focused local tests for session parsing/session-sync yield; Blacksmith Testbox pnpm check:changed passed; PR CI green on source head.

Thanks @bitloi!

bitloi · 2026-05-10T06:42:46Z

Landed via rebase merge onto main.

Source head: 58d60ba

Landed head: 482af6d

Gate: focused local tests for session parsing/session-sync yield; Blacksmith Testbox pnpm check:changed passed; PR CI green on source head.

Thanks @bitloi!

thank you sir

openclaw-barnacle Bot added extensions: memory-core Extension: memory-core size: M labels May 3, 2026

bitloi force-pushed the fix/issue-76890-memory-session-yield branch 2 times, most recently from 8072468 to c1168f9 Compare May 3, 2026 22:45

clawsweeper Bot mentioned this pull request May 3, 2026

memory-core: synchronous indexing on session start blocks gateway event loop for 30+ seconds #76890

Closed

openclaw-barnacle Bot added the triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. label May 6, 2026

bitloi force-pushed the fix/issue-76890-memory-session-yield branch from c40a3d4 to 7f56002 Compare May 6, 2026 05:47

openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 6, 2026

clawsweeper Bot mentioned this pull request May 7, 2026

fix/issue-76890-memory-session-yield #79001

Closed

bitloi and others added 3 commits May 10, 2026 07:21

fix(memory): yield during session indexing

f949e86

Add changelog credit for memory sync fix

3338bac

fix(memory): yield while parsing session transcripts

58d60ba

steipete force-pushed the fix/issue-76890-memory-session-yield branch from 4a3e1f2 to 58d60ba Compare May 10, 2026 06:26

steipete merged commit 482af6d into openclaw:main May 10, 2026
110 checks passed

nitinjwadhawan mentioned this pull request May 18, 2026

fix(memory-core): yield event loop during fallback vector search (#81172) #83758

Merged

10 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(memory): yield during session indexing#76978

fix(memory): yield during session indexing#76978
steipete merged 3 commits into
openclaw:mainfrom
bitloi:fix/issue-76890-memory-session-yield

bitloi commented May 3, 2026 •

edited

Loading

Uh oh!

clawsweeper Bot commented May 3, 2026 •

edited

Loading

Uh oh!

bitloi commented May 3, 2026

Uh oh!

Uh oh!

steipete commented May 10, 2026

Uh oh!

bitloi commented May 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

bitloi commented May 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Change Type (select all)

Scope (select all touched areas)

Linked Issue/PR

Real behavior proof (required for external PRs)

Root Cause (if applicable)

Regression Test Plan (if applicable)

User-visible / Behavior Changes

Diagram (if applicable)

Security Impact (required)

Repro + Verification

Environment

Steps

Expected

Actual

Evidence

Human Verification (required)

Review Conversations

Compatibility / Migration

Risks and Mitigations

Uh oh!

clawsweeper Bot commented May 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bitloi commented May 3, 2026

Uh oh!

Uh oh!

steipete commented May 10, 2026

Uh oh!

bitloi commented May 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

bitloi commented May 3, 2026 •

edited

Loading

clawsweeper Bot commented May 3, 2026 •

edited

Loading