fix(memory): index reset session transcripts#20183
Conversation
db9217e to
645a664
Compare
|
@vignesh07 you might be interested in this fix. I just rebased and squashed when I saw you pushed some changes and remembered your focus on memory enhancements |
c0e989c to
cb32759
Compare
5b930d3 to
cb32759
Compare
2b1f577 to
210db85
Compare
|
+1 — we hit this exact scenario tonight. After a session cleanup, dozens of This PR + #20148 together would fully resolve the issue. 🙏 |
|
+1 from a production deployment. We have 92 After a session cleanup cycle, Happy to help test if needed. |
|
Been actively using and debasing since I opened the PR with no issues. Hopefully someone takes a peek soon 🤞 |
|
Thanks all for the production feedback and +1s it's actually helpful in a place where there are literally thousands of PRs/issues. I pushed an update to scope this PR to reset archives only. This changes the scope to indexes Why the change?From the session lifecycle paths:
So ResultThis fixes missing memory recall from reset snapshots while intentionally keeping deleted transcripts out of indexing. If you're looking for
If there’s demand for deleted indexing, that should likely be a separate, explicit follow-up (probably as an opt-in and scoped by session type). |
|
@codex review |
|
@greptile-apps review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8115b99aea
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Greptile SummaryThis PR adds opt-in recall of reset-archive transcripts (
Confidence Score: 3/5
Prompt To Fix All With AIThis is a comment left during a code review.
Path: src/gateway/session-utils.fs.ts
Line: 269-273
Comment:
**Transcript update emitted even when `rm` fails silently**
`fs.promises.rm` errors are swallowed by `.catch(() => undefined)`, but `emitSessionTranscriptUpdate` is called unconditionally afterward. If the file removal fails, downstream subscribers (memory index manager) will receive a "file removed" signal for a file that still exists on disk. This can cause the index entry to be deleted and then re-added on the next sync cycle — at minimum, unnecessary churn.
Compare with `archiveFileOnDisk` where `renameSync` throws on failure so the emit is implicitly guarded. The cleanup path should apply the same guard:
```typescript
const removed_ = await fs.promises.rm(fullPath).then(() => true, () => false);
if (removed_) {
removed += 1;
if (reason === "reset") {
emitSessionTranscriptUpdate(fullPath);
}
}
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/memory/manager-sync-ops.ts
Line: 784-789
Comment:
**Stale session rows not cleaned up on full reindex**
`sessionRowsBefore` is set to `[]` when `needsFullReindex` is `true`, which means the stale-row cleanup loop at the bottom of `syncSessionFiles` becomes a no-op during full reindexes. Prior to this PR, the `staleRows` query ran *after* indexing regardless of `needsFullReindex`, so it always cleaned up paths no longer in `activePaths`.
The practical impact is visible in the toggle scenario: when `includeResetArchives` flips from `true` → `false`, a full reindex is triggered (via `metaSessionIncludeResetArchivesDiffers`), `activePaths` no longer includes reset archive paths, but because `sessionRowsBefore = []` the stale reset archive rows are never deleted from the `files`/`chunks` tables.
The test `triggers a full session reindex when includeResetArchives toggles` expects `withoutArchivesFiles` to equal `1` after the toggle, which would fail unless there is a pre-existing bulk-delete of session rows elsewhere in the parent sync path (not visible in this diff). If that bulk-delete exists, this is intentional; if not, it is a regression. Please confirm that the full reindex path clears session rows before calling `syncSessionFiles({ needsFullReindex: true })`, or restore the unconditional query for the cleanup pass:
```typescript
const sessionRowsBefore =
(this.db.prepare(`SELECT path FROM files WHERE source = ?`).all("sessions") as Array<{
path: string;
}>);
const knownPaths = params.needsFullReindex
? new Set<string>()
: new Set(sessionRowsBefore.map((row) => row.path));
```
This preserves the optimization (skip known unchanged paths on incremental sync) while still ensuring stale rows are cleaned up on full reindex.
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: cd22347 |
fb5961e to
386a672
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 386a672d9d
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
c5bdf44 to
35d782b
Compare
7b3a342 to
2f8de9b
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2f8de9bda8
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
@greptile-apps review |
|
Codex Review: Didn't find any major issues. What shall we delve into next? ℹ️ About Codex in GitHubCodex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback". |
|
@vignesh07 would you mind taking a look? |
|
@greptile-apps review |
|
Codex Review: Didn't find any major issues. 🎉 ℹ️ About Codex in GitHubCodex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback". |
c38b418 to
cd22347
Compare
|
@greptile-apps review |
|
Codex Review: Didn't find any major issues. Delightful! ℹ️ About Codex in GitHubCodex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback". |
|
@vignesh07 rebased with main and addressed additional bot reviews 🤞 |
…et-deleted # Conflicts: # src/agents/memory-search.test.ts # src/agents/memory-search.ts # src/config/schema.help.ts # src/config/schema.labels.ts # src/config/types.tools.ts # src/config/zod-schema.agent-runtime.ts # src/memory/manager-sync-ops.ts
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3dcaf0b236
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if (reason === "reset") { | ||
| emitSessionTranscriptUpdate(archived); | ||
| } |
There was a problem hiding this comment.
Update the session-archive reset path too
Fresh evidence: the message-driven reset flow in src/auto-reply/reply/session.ts:571-579 still calls archiveSessionTranscripts from src/gateway/session-archive.fs.ts, and that helper's archiveFileOnDisk never emits emitSessionTranscriptUpdate. When agents.defaults.memorySearch.sync.sessions.includeResetArchives is enabled, a normal /reset creates a new *.jsonl.reset.<ts> file, but the listener in src/memory/manager-sync-ops.ts:443-456 never sees it, so the reset archive is not indexed until some unrelated full/manual sync happens.
Useful? React with 👍 / 👎.
| if (reason === "reset") { | ||
| emitSessionTranscriptUpdate(fullPath); |
There was a problem hiding this comment.
Mirror reset-retention invalidation in session-archive.fs
Fresh evidence: regular session-store maintenance still imports cleanupArchivedSessionTranscripts from src/gateway/session-archive.fs.ts (src/config/sessions/store.ts:468-478), while only this copy now emits a transcript update after removing expired reset archives. In deployments that index reset archives, aging them out through the normal store maintenance path leaves stale sessions/*reset* rows behind, so deleted reset content remains searchable until another sync eventually rebuilds the session index.
Useful? React with 👍 / 👎.
|
+1 — this PR fixes a real pain point (92 .reset files / 37.9MB excluded from search). Codex review passed, rebased clean. Would love to see this merged! 🙏 |
|
Taking a look. |
Summary
*.jsonl, so reset archives (*.jsonl.reset.<timestamp>) were excluded from recall.Final Behavior
*.jsonl*.jsonl.reset.<timestamp>(only when explicitly enabled)*.jsonl.deleted.**.jsonl.bak.**.jsonl.reset.<ts>.deleted.<ts>/*.bak.<ts>Config (Opt-In)
memory.qmd.sessions.includeResetArchives: trueagents.defaults.memorySearch.sync.sessions.includeResetArchives: trueBoth default to
falseto preserve existing safety expectations.What Changed
Codex Review Notes
Branch Hygiene
main:7b3a34266Verification
src/memory/session-files.test.tssrc/memory/manager.session-delta-archived.test.tsLinked