Skip to content

fix(memory-core): stream embedding cache seed during reindex#73118

Merged
steipete merged 2 commits into
mainfrom
peter/land-73067-memory-cache-seed
Apr 28, 2026
Merged

fix(memory-core): stream embedding cache seed during reindex#73118
steipete merged 2 commits into
mainfrom
peter/land-73067-memory-cache-seed

Conversation

@steipete

Copy link
Copy Markdown
Contributor

Summary

  • Land fix(memory-core): stream embedding cache seed during reindex #73067 through a maintainer branch because GitHub rejected writes to the contributor fork despite maintainerCanModify=true.
  • Stream safe-reindex embedding_cache seeding with SQLite iterate() and avoid a no-op transaction for empty caches.
  • Keep the regression test explicit about spying on the original source DB while safe reindex writes to a temp DB.

Verification

  • Live runtime smoke: real manager + real SQLite, 5,000 seeded cache rows, .all() forced to throw, safe reindex copied rows through exactly one streamed SELECT.
  • pnpm test extensions/memory-core/src/memory/index.test.ts
  • pnpm test extensions/memory-core/src/memory
  • pnpm check:changed

Supersedes #73067. Thanks @parkertoddbrooks.

@cursor

cursor Bot commented Apr 28, 2026

Copy link
Copy Markdown

PR Summary

Medium Risk
Changes the safe-reindex path to stream SQLite embedding-cache rows and introduces async yielding/transaction gating, which could affect reindex correctness/performance if mishandled. Scoped to memory indexing, but touches core sync/reindex logic.

Overview
Memory safe reindex now seeds the embedding cache via streaming iteration instead of materializing all rows up front, and only starts/commits a transaction if at least one row is copied.

The cache-seed loop periodically yields (setImmediate) during large copies, and a new regression test asserts .iterate() is used (and .all() is not) when copying from the original DB into the temp DB used for atomic reindex.

Reviewed by Cursor Bugbot for commit 748f3d9. Bugbot is set up for automated code reviews on this repo. Configure here.

@openclaw-barnacle openclaw-barnacle Bot added extensions: memory-core Extension: memory-core size: S maintainer Maintainer-authored PR labels Apr 28, 2026
@greptile-apps

greptile-apps Bot commented Apr 28, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR replaces .all() with .iterate() in seedEmbeddingCache to avoid materialising every row of the embedding cache into the V8 heap during a safe reindex, and defers opening the SQLite transaction until the first row arrives so empty caches skip the no-op BEGIN/COMMIT. A setImmediate yield every 1 000 rows keeps the event loop available to health probes. The new test explicitly guards that the streaming path is taken by making the .all() overload throw.

Confidence Score: 5/5

Safe to merge — the logic change is minimal, error handling is correct, and the regression test directly verifies the streaming path.

No P0 or P1 issues found. The lazy transaction initialisation, streaming iteration, yield cadence, and ROLLBACK guard are all handled correctly. The test is well-scoped: it reads cached rows from the source DB before installing the spy, drives a forced reindex, and confirms no extra embedding calls were made.

No files require special attention.

Reviews (1): Last reviewed commit: "fix(memory-core): tighten cache seed str..." | Re-trigger Greptile

@steipete steipete merged commit 983fd77 into main Apr 28, 2026
68 of 69 checks passed
@steipete steipete deleted the peter/land-73067-memory-cache-seed branch April 28, 2026 00:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

extensions: memory-core Extension: memory-core maintainer Maintainer-authored PR size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants