Summary
In v2026.5.2, buildSessionEntry short-circuits archive files (.jsonl.reset.*, .jsonl.deleted.*, .jsonl.bak.*) to return an entry with content: "". The indexer then registers these files in the files table but produces zero chunks for them. As a result, memory_search cannot find any content from sessions that have been /reset'd or /new'd. From a user perspective, the assistant forgets the entire conversation the moment it gets archived.
This appears related to #77338 (which argues archive files shouldn't be indexed at all, to fix counter overflow) and #56609 (which establishes that archive files contain irreplaceable history). The behavior shipped in 2026.5.2 partially addresses #77338 but inadvertently breaks long-term recall.
Reproduction
- Have a Signal/web conversation in a session. Exchange a few messages.
- /reset the session (or anything else that archives it, e.g., a heartbeat-triggered cleanup).
- Run openclaw memory index --force.
- Search for distinctive keywords from the archived conversation via openclaw memory search "<keywords>" or via the memory_search tool.
Expected: Hits from the archived session, with sender/timestamp metadata.
Actual: No matches. The reset file is in sessions/ on disk with the original transcript intact, but the chunks table contains nothing for it.
Diagnosis
dist/engine-qmd-DpZ08KSc.js (the hash in the file name depends on the build; this is the name in 2026.5.2):
    function shouldSkipTranscriptFileForDreaming(absPath) {
      const fileName = path.basename(absPath);
      return isSessionArchiveArtifactName(fileName) || isCompactionCheckpointTranscriptFileName(fileName);
    }
    async function buildSessionEntry(absPath, opts = {}) {
      const stat = await fs$1.stat(absPath);
      if (shouldSkipTranscriptFileForDreaming(absPath)) return {
        path: sessionPathForFile(absPath),
        absPath,
        mtimeMs: stat.mtimeMs,
        size: stat.size,
        hash: hashText("\n\n"),
        content: "",
        lineMap: [],
        messageTimestampsMs: []
      };
      // ... actual content extraction
    }
The function is named …ForDreaming but is called unconditionally from the main memory sync path (manager-CgRVbrYO.js:1359 → buildSessionEntry). So the gate applies to all session indexing, not only dreaming/REM passes.
SQLite evidence (one of my containers, v2026.5.2)
    chunks total: 785
    files total:  922

    files by path-suffix:
      plain .jsonl     : 752
      .jsonl.reset.*   : 7
      .jsonl.deleted.* : 159
      *.md (memory)    : 4

    chunks by path-suffix:
      plain .jsonl     : 752
      .jsonl.reset.*   : 0
      .jsonl.deleted.* : 0
      *.md (memory)    : 4
166 archived files registered in files, zero chunks for any of them. The transcripts themselves are intact on disk (verified: grep "<keyword>" *.jsonl.reset.* finds the content; openclaw memory search "<keyword>" does not).
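For anyone who wants to check their own index, here is a rough sketch of how a breakdown like the one above can be tallied. It assumes the paths have already been read out of the files and chunks tables into plain arrays (I am not assuming any column names here), and the suffix patterns and function names are my own, not taken from the OpenClaw source.

```javascript
// Hypothetical helpers: names and patterns are illustrative, not OpenClaw APIs.
const ARCHIVE_BUCKETS = new Set([".jsonl.reset.*", ".jsonl.deleted.*", ".jsonl.bak.*"]);

// Map an indexed path to the suffix bucket used in the breakdown above.
function suffixBucket(p) {
  if (/\.jsonl\.reset\./.test(p)) return ".jsonl.reset.*";
  if (/\.jsonl\.deleted\./.test(p)) return ".jsonl.deleted.*";
  if (/\.jsonl\.bak\./.test(p)) return ".jsonl.bak.*";
  if (p.endsWith(".jsonl")) return "plain .jsonl";
  if (p.endsWith(".md")) return "*.md (memory)";
  return "other";
}

// Count how many paths fall into each bucket.
function tallyBySuffix(paths) {
  const counts = {};
  for (const p of paths) {
    const bucket = suffixBucket(p);
    counts[bucket] = (counts[bucket] || 0) + 1;
  }
  return counts;
}

// List archived files that are registered but have no chunk at all.
function archivedWithoutChunks(filePaths, chunkPaths) {
  const chunked = new Set(chunkPaths);
  return filePaths.filter(
    (p) => ARCHIVE_BUCKETS.has(suffixBucket(p)) && !chunked.has(p)
  );
}
```

Calling tallyBySuffix on the files paths and again on the chunks paths gives the two breakdowns; archivedWithoutChunks lists the files affected by the gate.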
Impact
For users relying on memory_search as durable long-term memory across sessions, this is invisible data loss. Sessions roll over (manual /reset, or whatever triggers automatic archival), and previously-searchable conversation history silently drops out of the index. The user only notices when the assistant claims "I have no memory of X" for a conversation they remember happening.
Questions
- Is the archive-file gate intentional in buildSessionEntry's main path, or did it leak in from a dreaming-specific change? The function name (shouldSkipTranscriptFileForDreaming) suggests the latter.
- If intentional: what's the recommended path for users who want recall across resets? A separate corpus, a config flag, a different command?
- If unintentional: would a fix make sense that scopes the gate to dreaming code paths and lets the regular indexer chunk archived files (perhaps tagged with a different source so chunks_fts can rank live > archived)?
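To make the last option concrete, here is a minimal sketch of the decision I have in mind. It is not a patch against the real code: the forDreaming option, the "sessions-archive" source tag, and the helper names are all hypothetical, the regex is only a stand-in for isSessionArchiveArtifactName, and compaction checkpoint files are ignored for brevity.

```javascript
// Hypothetical sketch, not OpenClaw code.
const path = require("node:path");

// Stand-in for isSessionArchiveArtifactName; the real matcher may differ.
const ARCHIVE_NAME = /\.jsonl\.(reset|deleted|bak)\./;

function isArchiveName(absPath) {
  return ARCHIVE_NAME.test(path.basename(absPath));
}

// Decide how a transcript file should be treated.
// `forDreaming` is an assumed option that only dreaming callers would set.
function planSessionEntry(absPath, opts = {}) {
  const archived = isArchiveName(absPath);
  if (archived && opts.forDreaming) {
    // Dreaming passes keep skipping archives, as they do today.
    return { index: false, source: null };
  }
  // The regular indexer chunks everything, tagging archives separately
  // so ranking can prefer live sessions.
  return { index: true, source: archived ? "sessions-archive" : "sessions" };
}
```

With this shape, the main sync path would call it without forDreaming and get chunks for archived transcripts, while dreaming callers would opt in to the skip explicitly.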
Happy to test a patch on my deployment.
Environment
- OpenClaw 2026.5.2 (8b2a6e5)
- Node 22.22.1, Ubuntu 24
- agents.defaults.memorySearch.{enabled:true, sources:["memory","sessions"], experimental.sessionMemory:true, store.vector.enabled:false, query.hybrid:{enabled:true, vectorWeight:0, textWeight:1}}