Skip to content

feat: directory-per-session store to eliminate monolithic JSON bottleneck#19528

Closed
binary64 wants to merge 3 commits intoopenclaw:mainfrom
binary64:feat/directory-session-store
Closed

feat: directory-per-session store to eliminate monolithic JSON bottleneck#19528
binary64 wants to merge 3 commits intoopenclaw:mainfrom
binary64:feat/directory-session-store

Conversation

@binary64
Copy link
Copy Markdown
Contributor

@binary64 binary64 commented Feb 17, 2026

Problem

The session store is a single monolithic JSON file (sessions.json) that grows to 80MB+ with 650+ sessions. Every read/write operation:

  • Parses the entire file (~80ms+ for JSON.parse alone)
  • Serializes and rewrites the entire file atomically
  • Requires global file locking, causing contention and orphaned .tmp files

This is O(n) for what should be O(1) operations.

Solution: Directory-per-session layout

sessions.d/
  agent--main--telegram--direct--james/
    meta.json    # ~200 bytes
  agent--main--cron--14bbd904/
    meta.json

Key changes in src/config/sessions/store.ts:

  • Auto-detection: If sessions.d/ directory exists, uses directory mode. Otherwise falls back to legacy JSON (backward compatible).
  • Diff-based writes: updateSessionStore() snapshots the store before mutation, then only writes changed entries and deletes removed ones — no more full rewrites.
  • Per-entry I/O: readSessionUpdatedAt() reads a single meta.json instead of parsing the entire store.
  • Atomic writes: Each meta.json uses temp-file + rename, same as the existing pattern.
  • Migration: migrateSessionStoreToDirectory(storePath) splits an existing JSON file into the directory layout, backs up the original file, and is safe to call multiple times.

Performance characteristics:

Operation Before (JSON) After (Directory)
Read one session O(n) parse entire file O(1) read single meta.json
Write one session O(n) serialize + write all O(1) write single meta.json
List sessions O(n) parse O(1) readdir
Locking contention Global file lock Per-session potential

Migration path:

  1. No breaking changes — existing sessions.json continues working as-is
  2. Call migrateSessionStoreToDirectory(storePath) to opt in
  3. After migration, original JSON is backed up as sessions.json.pre-directory-migration.<timestamp>
  4. All subsequent operations use directory mode automatically

New exports:

  • migrateSessionStoreToDirectory(storePath) — trigger migration
  • resolveSessionStoreDir(storePath) — get directory path
  • sanitizeSessionKey(key) / desanitizeSessionKey(dirName) — key ↔ dirname conversion

Tests

Added store.directory.test.ts with 395 lines covering:

  • Key sanitization round-trips
  • Directory store path resolution
  • Migration from JSON to directory (including backup verification)
  • Load/save/update in directory mode
  • Diff-based writes (only changed entries written)
  • Cache invalidation for directory stores
  • Maintenance (pruning, capping) in directory mode

Greptile Summary

Migrates session storage from monolithic JSON to directory-per-session layout, addressing O(n) bottleneck with 80MB+ files. Implementation adds auto-detection, diff-based writes, per-entry I/O, atomic operations, and backward-compatible migration.

Key changes:

  • New directory layout (sessions.d/) with per-session meta.json files (~200 bytes each)
  • Diff-based writes in updateSessionStore() - only changed/removed entries are written
  • readSessionUpdatedAt() now reads single file instead of parsing entire store
  • Cache invalidation properly handles directory stores via explicit clearing on writes
  • Migration function migrateSessionStoreToDirectory() with automatic backup
  • Comprehensive test coverage (395 lines) for migration, CRUD ops, concurrent writes, and cache behavior

Technical review:

  • Atomic writes use temp-file + rename pattern correctly
  • Lock-based serialization prevents concurrent write conflicts
  • Cache is explicitly invalidated on writes, avoiding stale data
  • Directory mtime caching is safe due to explicit invalidation
  • Key sanitization (:--) works for current session key patterns but could collide if keys naturally contained -- (not present in codebase)
  • One documentation mismatch found (comment claims percent-encoding, implementation doesn't)

Confidence Score: 4/5

  • Safe to merge - well-designed performance optimization with comprehensive tests and backward compatibility
  • Implementation is sound with proper atomicity, locking, and cache management. The only issue found is a minor documentation mismatch. The change is backward compatible and includes thorough test coverage. Performance improvement is significant (O(n) → O(1) for single-session operations) and addresses a real production bottleneck.
  • No files require special attention - both implementation and tests are well-structured

Last reviewed commit: bc69f6b

(2/5) Greptile learns from your feedback when you react with thumbs up/down!

Copy link
Copy Markdown
Contributor

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Comment thread src/config/sessions/store.ts Outdated
Comment on lines +42 to +43
* Sanitize a session key for safe use as a filesystem directory name.
* Colons are replaced with `--`, and other unsafe chars are percent-encoded.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment says "other unsafe chars are percent-encoded" but implementation only replaces colons

Suggested change
* Sanitize a session key for safe use as a filesystem directory name.
* Colons are replaced with `--`, and other unsafe chars are percent-encoded.
* Sanitize a session key for safe use as a filesystem directory name.
* Colons are replaced with `--` for reversible filesystem-safe encoding.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/config/sessions/store.ts
Line: 42:43

Comment:
comment says "other unsafe chars are percent-encoded" but implementation only replaces colons

```suggestion
 * Sanitize a session key for safe use as a filesystem directory name.
 * Colons are replaced with `--` for reversible filesystem-safe encoding.
```

How can I resolve this? If you propose a fix, please make it concise.

@steipete steipete force-pushed the feat/directory-session-store branch from 6cf4dc4 to bf742ad Compare February 18, 2026 01:24
@openclaw-barnacle openclaw-barnacle Bot added gateway Gateway runtime agents Agent runtime and tooling labels Feb 18, 2026
…neck

Add an alternative directory-based session store layout where each session
gets its own directory with a meta.json file, replacing the monolithic
sessions.json that can grow to 80MB+ with 650+ sessions.

Directory layout:
  sessions.d/
    <sanitized-session-key>/
      meta.json    # SessionEntry (~200 bytes)

Key changes:
- Add sanitizeSessionKey/desanitizeSessionKey for filesystem-safe names
  (colons → double-dash, reversible)
- Add resolveSessionStoreDir() to derive sessions.d/ path from storePath
- loadSessionStore() auto-detects directory vs JSON mode
- saveSessionStoreUnlocked() uses diff-based writes in directory mode
  (only writes changed entries, deletes removed ones)
- readSessionUpdatedAt() reads single entry without loading full store
  in directory mode (O(1) instead of O(n))
- rotateSessionFile() is a no-op in directory mode (individual files
  are tiny, no rotation needed)
- Add migrateSessionStoreToDirectory() for explicit migration from
  legacy JSON to directory layout (backs up original file)
- Add comprehensive tests for migration, CRUD, concurrent writes,
  key sanitization round-trips

Design decisions:
- storePath remains pointing at sessions.json for backward compatibility
  (many callers use path.dirname(storePath) to locate transcript files)
- Directory mode only activates if sessions.d/ exists (after migration)
- Un-migrated stores continue using JSON unchanged
- Migration is explicit via migrateSessionStoreToDirectory(), not automatic
- All existing function signatures preserved — zero caller changes needed
- updateSessionStore takes a pre-mutation snapshot for diff-based writes

Performance improvements in directory mode:
- O(1) single-session reads instead of O(n) full-store parse
- O(changed) writes instead of O(n) full-store serialize
- No file rotation needed (no monolithic file to rotate)
- Per-entry files are ~200 bytes vs 80MB+ monolithic JSON
…c comment

- Fix type errors in exec approval handler tests by casting mock respond
  functions and adding missing client fields (version, platform, mode)
- Fix isWebchatConnect noop to return boolean instead of void
- Cast health handler test params through unknown for partial test objects
- Update sanitizeSessionKey JSDoc to accurately describe implementation
  (only replaces colons, does not percent-encode other chars)
@binary64 binary64 force-pushed the feat/directory-session-store branch from 8f92b46 to 0c1bc32 Compare February 19, 2026 02:32
@openclaw-barnacle openclaw-barnacle Bot removed gateway Gateway runtime agents Agent runtime and tooling labels Feb 19, 2026
@openclaw-barnacle
Copy link
Copy Markdown

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@openclaw-barnacle
Copy link
Copy Markdown

Closing due to inactivity.
If you believe this PR should be revived, post in #pr-thunderdome-dangerzone on Discord to talk to a maintainer.
That channel is the escape hatch for high-quality PRs that get auto-closed.

@jalehman jalehman reopened this Mar 19, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Mar 19, 2026

Greptile Summary

This PR replaces the monolithic sessions.json file with a sessions.d/<sanitized-key>/meta.json directory layout, adding auto-detection, a migration utility, diff-based writes in updateSessionStore, and a readSessionUpdatedAt fast-path. The approach is well-structured and backward-compatible, but three correctness bugs need to be resolved before this is safe to enable on production data.

Issues found:

  • Cache invalidation is broken for cross-process directory updatesgetDirStoreMtimeMs watches the sessions.d/ directory's own mtime, which only changes when direct children are added or removed. Updating an existing sessions.d/<key>/meta.json does not change sessions.d/'s mtime, so a second process's cached snapshot of that session is never invalidated and remains stale for the full TTL (45 s by default).

  • Key collision in sanitizeSessionKey / desanitizeSessionKey-- already present in a session key is indistinguishable from -- produced by replacing :. Two different keys (e.g. a:b:c and a--b:c) map to the same directory name, causing silent data overwrite and incorrect round-tripping when the store is loaded.

  • Partial migration leaves the store permanently inconsistentmigrateJsonToDirectory creates sessions.d/ before all entries are written. If a write fails mid-way, sessions.d/ exists with partial data, isDirectoryStore returns true from then on, and sessions.json (which still contains the full data) is never read again.

  • updateSessionStoreEntry and updateLastRoute skip the diff optimization — both call saveSessionStoreUnlocked without previousStore, causing a full O(n) write on every call in directory mode.

Confidence Score: 2/5

  • Not safe to merge — three correctness bugs (stale cache, key collision, partial-migration data loss) can cause silent data loss or serve stale session data in production.
  • The overall design is sound and the backward-compatibility story is good, but the cache invalidation strategy is fundamentally wrong for multi-process writes in directory mode, the sanitization encoding is not injective (silent key collision), and a failed mid-migration permanently hides the original JSON data. Any of the first three bugs can cause data loss or incorrect behavior for production sessions.
  • src/config/sessions/store.ts — cache invalidation logic (lines 395-399), sanitize/desanitize pair (lines 45-55), and migrateJsonToDirectory (lines 180-229) all require fixes before this can be safely enabled.

Comments Outside Diff (1)

  1. src/config/sessions/store.ts, line 1163 (link)

    P2 Two callers bypass diff-based writes

    updateSessionStoreEntry (line 1163) and updateLastRoute (line 1274) call saveSessionStoreUnlocked without a previousStore argument. This means they always execute the full-write path for directory stores — readdir every entry, write all, delete stale dirs — which is O(n) and defeats the O(1) goal of the new layout for these callers.

    The updateSessionStore helper snapshots previousStore = structuredClone(store) before passing the store to the mutator, then forwards it to saveSessionStoreUnlocked. The same pattern should be applied to updateSessionStoreEntry and updateLastRoute.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: src/config/sessions/store.ts
    Line: 1163
    
    Comment:
    **Two callers bypass diff-based writes**
    
    `updateSessionStoreEntry` (line 1163) and `updateLastRoute` (line 1274) call `saveSessionStoreUnlocked` without a `previousStore` argument. This means they always execute the full-write path for directory stores — `readdir` every entry, write all, delete stale dirs — which is O(n) and defeats the O(1) goal of the new layout for these callers.
    
    The `updateSessionStore` helper snapshots `previousStore = structuredClone(store)` before passing the store to the mutator, then forwards it to `saveSessionStoreUnlocked`. The same pattern should be applied to `updateSessionStoreEntry` and `updateLastRoute`.
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/config/sessions/store.ts
Line: 395-399

Comment:
**Directory-mtime cache invalidation misses in-place updates**

`getDirStoreMtimeMs` checks the mtime of the `sessions.d/` directory itself. On Linux/macOS, a directory's `mtime` is updated only when its *direct* children are added or removed — **not** when a file inside a subdirectory is modified. So when process A calls `updateSessionStore` and rewrites `sessions.d/agent--main--test/meta.json`, the `sessions.d/` directory mtime is unchanged; process B's cache check (`currentDirMtime === cached.dirMtimeMs`) will still return `true` and serve the stale entry for up to 45 seconds.

By contrast, the legacy JSON path is correct: every write changes `sessions.json`'s mtime, so cross-process invalidation works.

For new session directories being created the mtime does change (a new child is added to `sessions.d/`), but existing-session updates are silently missed.

A safer strategy would be to track the mtime of each session sub-directory (`sessions.d/<key>/`) rather than the root `sessions.d/` directory, or to skip the mtime check entirely for directory stores and rely solely on the TTL.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/config/sessions/store.ts
Line: 53-55

Comment:
**Key collision when session keys contain `--`**

The sanitization is not injective: any `--` already present in a session key is indistinguishable from one produced by replacing `:`. This causes two classes of problem:

1. **Silent collision** – keys `a:b:c` and `a--b:c` both map to the directory name `a--b--c`. The second write silently overwrites the first.
2. **Wrong round-trip**`desanitizeSessionKey(sanitizeSessionKey("a--b:c"))` returns `"a:b:c"` instead of `"a--b:c"`, so loading the store after migration produces the wrong key.

Example:
```
sanitizeSessionKey("a:b:c")    // "a--b--c"
sanitizeSessionKey("a--b:c")   // "a--b--c"  ← same!
desanitizeSessionKey("a--b--c") // "a:b:c"    ← not "a--b:c"
```

Consider a different separator that cannot appear in real session keys (e.g. a single `%3A` percent-encoding, or a character like `+` that colons are never turned into), or alternatively assert/reject keys that contain `--` during `sanitizeSessionKey`.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/config/sessions/store.ts
Line: 180-229

Comment:
**Partial migration leaves store in a permanently inconsistent state**

`migrateJsonToDirectory` creates `sessions.d/` before writing all entries. If any `writeSessionEntryToDir` call throws (disk full, permission error, etc.), the function exits early — `sessions.d/` now exists with only some entries, and the rename of `sessions.json` to the backup path never runs.

On the next load:
- `isDirectoryStore` returns `true` (because `sessions.d/` was created)
- The directory is read — only the successfully-migrated entries are visible
- `sessions.json` is still present but **never read** in directory mode
- All sessions that weren't written yet are silently lost

The public `migrateSessionStoreToDirectory` wrapper's idempotency check (`isDirectoryStore → return false`) means re-running migration won't recover the missing entries either.

A safer approach is to write all entries first, verify success, and only then create/rename the directory (or use an in-progress marker that is removed only on clean completion).

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/config/sessions/store.ts
Line: 1163

Comment:
**Two callers bypass diff-based writes**

`updateSessionStoreEntry` (line 1163) and `updateLastRoute` (line 1274) call `saveSessionStoreUnlocked` without a `previousStore` argument. This means they always execute the full-write path for directory stores — `readdir` every entry, write all, delete stale dirs — which is O(n) and defeats the O(1) goal of the new layout for these callers.

The `updateSessionStore` helper snapshots `previousStore = structuredClone(store)` before passing the store to the mutator, then forwards it to `saveSessionStoreUnlocked`. The same pattern should be applied to `updateSessionStoreEntry` and `updateLastRoute`.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: "fix: resolve TypeScr..."

Comment on lines +395 to +399
if (useDirectoryStore) {
const currentDirMtime = getDirStoreMtimeMs(resolveSessionStoreDir(storePath));
if (currentDirMtime === cached.dirMtimeMs) {
return structuredClone(cached.store);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Directory-mtime cache invalidation misses in-place updates

getDirStoreMtimeMs checks the mtime of the sessions.d/ directory itself. On Linux/macOS, a directory's mtime is updated only when its direct children are added or removed — not when a file inside a subdirectory is modified. So when process A calls updateSessionStore and rewrites sessions.d/agent--main--test/meta.json, the sessions.d/ directory mtime is unchanged; process B's cache check (currentDirMtime === cached.dirMtimeMs) will still return true and serve the stale entry for up to 45 seconds.

By contrast, the legacy JSON path is correct: every write changes sessions.json's mtime, so cross-process invalidation works.

For new session directories being created the mtime does change (a new child is added to sessions.d/), but existing-session updates are silently missed.

A safer strategy would be to track the mtime of each session sub-directory (sessions.d/<key>/) rather than the root sessions.d/ directory, or to skip the mtime check entirely for directory stores and rely solely on the TTL.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/config/sessions/store.ts
Line: 395-399

Comment:
**Directory-mtime cache invalidation misses in-place updates**

`getDirStoreMtimeMs` checks the mtime of the `sessions.d/` directory itself. On Linux/macOS, a directory's `mtime` is updated only when its *direct* children are added or removed — **not** when a file inside a subdirectory is modified. So when process A calls `updateSessionStore` and rewrites `sessions.d/agent--main--test/meta.json`, the `sessions.d/` directory mtime is unchanged; process B's cache check (`currentDirMtime === cached.dirMtimeMs`) will still return `true` and serve the stale entry for up to 45 seconds.

By contrast, the legacy JSON path is correct: every write changes `sessions.json`'s mtime, so cross-process invalidation works.

For new session directories being created the mtime does change (a new child is added to `sessions.d/`), but existing-session updates are silently missed.

A safer strategy would be to track the mtime of each session sub-directory (`sessions.d/<key>/`) rather than the root `sessions.d/` directory, or to skip the mtime check entirely for directory stores and rely solely on the TTL.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +53 to +55
export function desanitizeSessionKey(dirName: string): string {
return dirName.replace(/--/g, ":");
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Key collision when session keys contain --

The sanitization is not injective: any -- already present in a session key is indistinguishable from one produced by replacing :. This causes two classes of problem:

  1. Silent collision – keys a:b:c and a--b:c both map to the directory name a--b--c. The second write silently overwrites the first.
  2. Wrong round-tripdesanitizeSessionKey(sanitizeSessionKey("a--b:c")) returns "a:b:c" instead of "a--b:c", so loading the store after migration produces the wrong key.

Example:

sanitizeSessionKey("a:b:c")    // "a--b--c"
sanitizeSessionKey("a--b:c")   // "a--b--c"  ← same!
desanitizeSessionKey("a--b--c") // "a:b:c"    ← not "a--b:c"

Consider a different separator that cannot appear in real session keys (e.g. a single %3A percent-encoding, or a character like + that colons are never turned into), or alternatively assert/reject keys that contain -- during sanitizeSessionKey.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/config/sessions/store.ts
Line: 53-55

Comment:
**Key collision when session keys contain `--`**

The sanitization is not injective: any `--` already present in a session key is indistinguishable from one produced by replacing `:`. This causes two classes of problem:

1. **Silent collision** – keys `a:b:c` and `a--b:c` both map to the directory name `a--b--c`. The second write silently overwrites the first.
2. **Wrong round-trip**`desanitizeSessionKey(sanitizeSessionKey("a--b:c"))` returns `"a:b:c"` instead of `"a--b:c"`, so loading the store after migration produces the wrong key.

Example:
```
sanitizeSessionKey("a:b:c")    // "a--b--c"
sanitizeSessionKey("a--b:c")   // "a--b--c"  ← same!
desanitizeSessionKey("a--b--c") // "a:b:c"    ← not "a--b:c"
```

Consider a different separator that cannot appear in real session keys (e.g. a single `%3A` percent-encoding, or a character like `+` that colons are never turned into), or alternatively assert/reject keys that contain `--` during `sanitizeSessionKey`.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +180 to +229
async function migrateJsonToDirectory(storePath: string): Promise<void> {
const storeDir = resolveSessionStoreDir(storePath);
let store: Record<string, SessionEntry> = {};

try {
const raw = fs.readFileSync(storePath, "utf-8");
if (raw.length === 0) {
return;
}
const parsed = JSON.parse(raw);
if (parsed && typeof parsed === "object" && !Array.isArray(parsed)) {
store = parsed as Record<string, SessionEntry>;
}
} catch {
return; // Can't read/parse the file, nothing to migrate
}

const keys = Object.keys(store);
if (keys.length === 0) {
return;
}

log.info("migrating session store from JSON to directory layout", {
entries: keys.length,
storePath,
storeDir,
});

// Create the directory store
await fs.promises.mkdir(storeDir, { recursive: true });

// Write each entry
for (const [key, entry] of Object.entries(store)) {
if (!entry) {
continue;
}
await writeSessionEntryToDir(storeDir, key, entry);
}

// Backup and remove the old JSON file
const backupPath = `${storePath}.pre-directory-migration.${Date.now()}`;
try {
await fs.promises.rename(storePath, backupPath);
log.info("backed up legacy sessions.json", {
backupPath: path.basename(backupPath),
});
} catch {
// If rename fails, just leave the file. Directory store takes precedence.
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Partial migration leaves store in a permanently inconsistent state

migrateJsonToDirectory creates sessions.d/ before writing all entries. If any writeSessionEntryToDir call throws (disk full, permission error, etc.), the function exits early — sessions.d/ now exists with only some entries, and the rename of sessions.json to the backup path never runs.

On the next load:

  • isDirectoryStore returns true (because sessions.d/ was created)
  • The directory is read — only the successfully-migrated entries are visible
  • sessions.json is still present but never read in directory mode
  • All sessions that weren't written yet are silently lost

The public migrateSessionStoreToDirectory wrapper's idempotency check (isDirectoryStore → return false) means re-running migration won't recover the missing entries either.

A safer approach is to write all entries first, verify success, and only then create/rename the directory (or use an in-progress marker that is removed only on clean completion).

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/config/sessions/store.ts
Line: 180-229

Comment:
**Partial migration leaves store in a permanently inconsistent state**

`migrateJsonToDirectory` creates `sessions.d/` before writing all entries. If any `writeSessionEntryToDir` call throws (disk full, permission error, etc.), the function exits early — `sessions.d/` now exists with only some entries, and the rename of `sessions.json` to the backup path never runs.

On the next load:
- `isDirectoryStore` returns `true` (because `sessions.d/` was created)
- The directory is read — only the successfully-migrated entries are visible
- `sessions.json` is still present but **never read** in directory mode
- All sessions that weren't written yet are silently lost

The public `migrateSessionStoreToDirectory` wrapper's idempotency check (`isDirectoryStore → return false`) means re-running migration won't recover the missing entries either.

A safer approach is to write all entries first, verify success, and only then create/rename the directory (or use an in-progress marker that is removed only on clean completion).

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size: L stale Marked as stale due to inactivity

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants