[memory] sessions source counter overflow: tombstones and reset snapshots indexed as live files #77338

@buyitsydney

[memory] sessions source counter overflow + tombstone/snapshot indexed as live files

Summary

openclaw memory status reports sessions · N/M where N > M (numerator exceeds denominator) due to inconsistent file enumeration filters between scanner (denominator) and indexer (numerator). The indexed set contains *.jsonl.reset.TIMESTAMP.Z snapshots and *.jsonl.deleted.TIMESTAMP.Z tombstones that the denominator scanner excludes.

Reproduction

Two independent fleet containers, one on v2026.4.24 and one on v2026.5.3, both reproduce the overflow:

| Container | OpenClaw ver | sessions | Overflow | trajectory.jsonl | normal .jsonl | .reset.* | .deleted.* |
|---|---|---|---|---|---|---|---|
| A | 2026.4.24 | 458/333 | +125 | 282 | 45 | 55 | 76 |
| B | 2026.5.3 | 60/43 | +17 | ~14 | ~29 | ~6 | ~11 |

Older containers show more overflow because /new and /reset events accumulate .reset.* and .deleted.* artifacts over time.

Root cause

In dist/cli.runtime-*.js:

Denominator (scanSessionFiles):

totalFiles: (await fs.readdir(sessionsDir, { withFileTypes: true }))
  .filter((entry) => entry.isFile() && entry.name.endsWith(".jsonl"))
  .length

Strictly matches filenames ending in .jsonl.

Numerator (indexer writing to files table):
Ingests:

  • *.jsonl (matches denominator) ✓
  • *.trajectory.jsonl (matches denominator) ✓
  • *.jsonl.reset.TIMESTAMP.Z (mismatch — snapshots after /reset) ✗
  • *.jsonl.deleted.TIMESTAMP.Z (mismatch — tombstones) ✗

Memory source memory/ avoids the mismatch because numerator and denominator share listMemoryFiles().
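A minimal sketch of the filter divergence, assuming the artifact names follow the suffix patterns listed above. Only the `endsWith(".jsonl")` check is taken from the quoted code; `classifySessionFile`, the regexes, and the sample file names (including their timestamp format) are hypothetical and exist only for illustration.

```typescript
// Hypothetical classifier; not taken from the OpenClaw source.
type SessionFileKind = "live" | "reset" | "deleted" | "other";

function classifySessionFile(name: string): SessionFileKind {
  // Strict suffix match, same as the denominator; also covers *.trajectory.jsonl.
  if (name.endsWith(".jsonl")) return "live";
  // Assumed pattern for snapshots written after /reset.
  if (/\.jsonl\.reset\..+$/.test(name)) return "reset";
  // Assumed pattern for tombstones.
  if (/\.jsonl\.deleted\..+$/.test(name)) return "deleted";
  return "other";
}

// Made-up file names for illustration only.
const names: string[] = [
  "a.jsonl",
  "a.trajectory.jsonl",
  "b.jsonl.reset.2026-05-01T00-00-00.000Z",
  "c.jsonl.deleted.2026-05-02T00-00-00.000Z",
];

// Denominator: the strict filter quoted from scanSessionFiles.
const denominator = names.filter((n) => n.endsWith(".jsonl")).length;

// Numerator: what the indexer appears to ingest (live + reset + deleted).
const numerator = names.filter((n) => classifySessionFile(n) !== "other").length;

console.log(`sessions · ${numerator}/${denominator}`); // prints "sessions · 4/2"
```

With two live files and two artifacts, the numerator (4) exceeds the denominator (2), which is the same shape as the counters reported in the reproduction table.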

SQLite evidence (container A)

SELECT source, COUNT(*) FROM files GROUP BY source;
-- memory  104
-- sessions 458

Indexed 458 = 282 trajectory + 45 normal + 55 reset snapshots + 76 deleted tombstones
Disk .jsonl strict match = 333 (282 trajectory + 45 normal + 6 other)

All 458 entries point to files that exist on disk (0 true orphans). The problem is filter divergence, not stale entries.
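The container A figures can be cross-checked with a few lines of arithmetic. The sketch below only restates the counts above; the variable names are illustrative. It also shows why the overflow is 125 rather than 131: the 6 "other" `.jsonl` files count toward the denominator but do not appear in the indexed breakdown.

```typescript
// Counts copied from the container A figures in this issue.
const indexed = { trajectory: 282, normal: 45, reset: 55, deleted: 76 };
const onDisk = { trajectory: 282, normal: 45, other: 6 };

// Numerator: rows in the files table with source = sessions.
const numerator =
  indexed.trajectory + indexed.normal + indexed.reset + indexed.deleted;

// Denominator: files on disk whose name ends in .jsonl.
const denominator = onDisk.trajectory + onDisk.normal + onDisk.other;

// Overflow as reported by the counter.
const overflow = numerator - denominator;

// Same overflow by cause: artifacts indexed but not scanned,
// minus files scanned but absent from the indexed breakdown.
const overflowByCause = indexed.reset + indexed.deleted - onDisk.other;

console.log({ numerator, denominator, overflow, overflowByCause });
// prints { numerator: 458, denominator: 333, overflow: 125, overflowByCause: 125 }
```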

Impact

  1. Counter reporting is misleading (60/43 suggests an indexing failure, when in fact indexing is over-broad).
  2. Index bloat — 131/458 = 29% of indexed file rows (and their vector chunks) are .reset.* snapshots + .deleted.* tombstones that logically should not be in active search.
  3. Semantic search pollution — memory_search may surface content from sessions the user explicitly reset or deleted.

Secondary bug: Dirty state machine inconsistent

After openclaw memory index --force:

  • Container A: Dirty=no consistently (pre and post)
  • Container B round 2: Dirty=yes immediately after --force success
  • Container B round 3: Dirty=no

Behavior is non-deterministic. Likely a race between index writer marking clean and a concurrent filesystem event marking dirty.
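If the race hypothesis is right, one possible way to make the flag deterministic is a generation counter: the index run marks clean only up to the filesystem events it actually observed before it started. This is a sketch of that general technique, not OpenClaw's implementation; `DirtyTracker` and all of its methods are hypothetical names.

```typescript
// Hypothetical dirty tracker based on generation counters.
class DirtyTracker {
  private generation = 0; // bumped on every filesystem event
  private cleanGeneration = 0; // highest generation a finished index run covered

  // Called by the file watcher on any change in the sessions directory.
  markDirty(): void {
    this.generation += 1;
  }

  // Called when an index run starts; returns the generation it will cover.
  beginIndex(): number {
    return this.generation;
  }

  // Called when the run finishes; events that arrived mid-run stay dirty.
  finishIndex(token: number): void {
    this.cleanGeneration = Math.max(this.cleanGeneration, token);
  }

  get dirty(): boolean {
    return this.generation > this.cleanGeneration;
  }
}

const tracker = new DirtyTracker();
tracker.markDirty(); // a file changed before indexing
const token = tracker.beginIndex();
tracker.markDirty(); // a file changed while the index run was in flight
tracker.finishIndex(token);
console.log(tracker.dirty); // prints true: the mid-run change is not lost

tracker.finishIndex(tracker.beginIndex()); // a second run with no new events
console.log(tracker.dirty); // prints false
```

Under this scheme a Dirty=yes immediately after `--force` would still be possible, but only when a change really did arrive during the run, so the result no longer depends on which writer touches the flag last.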

Suggested fix

Preferred: Indexer skips *.jsonl.reset.* and *.jsonl.deleted.* — tombstones have no user value in search, and reset snapshots are historical data that should live in an archive corpus, not active sessions index.

Alternative: Denominator and numerator share a single file-enumeration helper listSessionFiles() that returns the same set both sides use.
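A minimal sketch of what such a shared helper could look like. The name `listSessionFiles` comes from the suggestion above, but the signature is an assumption: here it takes entries shaped like the result of `fs.readdir(dir, { withFileTypes: true })` so that the filter itself stays pure and easy to test. `DirEntryLike` and `entry` are hypothetical helpers.

```typescript
// Minimal shape of an entry from fs.readdir(dir, { withFileTypes: true }).
interface DirEntryLike {
  name: string;
  isFile(): boolean;
}

// Hypothetical shared helper: the single place where "what is a session file"
// is decided. Uses the same strict filter as the quoted denominator code.
function listSessionFiles(entries: DirEntryLike[]): string[] {
  return entries
    .filter((e) => e.isFile() && e.name.endsWith(".jsonl"))
    .map((e) => e.name)
    .sort();
}

// Test double standing in for real directory entries.
const entry = (name: string, file = true): DirEntryLike => ({
  name,
  isFile: () => file,
});

const entries: DirEntryLike[] = [
  entry("a.jsonl"),
  entry("a.jsonl.reset.X"), // snapshot artifact, excluded
  entry("b.jsonl.deleted.Y"), // tombstone artifact, excluded
  entry("dir.jsonl", false), // directory, excluded
];

// Both sides call the same helper, so the counts cannot diverge.
const filesToIndex = listSessionFiles(entries); // numerator side
const totalFiles = listSessionFiles(entries).length; // denominator side

console.log(`sessions · ${filesToIndex.length}/${totalFiles}`); // prints "sessions · 1/1"
```

Because the indexer and the scanner consume the same list, any future change to the filter applies to both counters at once.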

Third: Garbage-collect index entries whose path matches *.deleted.* on next reconcile pass.

Bonus: this fix is independent of PR #76666 (which targets bug 1 — sessions 4/42 lazy-load race via eager ensureSessionListener()).

Not touched by existing PRs

PR #76666 (P7 reset-index series) addresses bug 1 (sessions underindexing). Bugs 2 (counter overflow / filter mismatch) and 3 (dirty state machine) are not covered by P1-P7.


Filed with two independent fleet reproductions (v2026.4.24 + v2026.5.3) and full sqlite table inspection.
