feat: directory-per-session store to eliminate monolithic JSON bottleneck#19528
feat: directory-per-session store to eliminate monolithic JSON bottleneck#19528binary64 wants to merge 3 commits intoopenclaw:mainfrom
Conversation
| * Sanitize a session key for safe use as a filesystem directory name. | ||
| * Colons are replaced with `--`, and other unsafe chars are percent-encoded. |
There was a problem hiding this comment.
comment says "other unsafe chars are percent-encoded" but implementation only replaces colons
| * Sanitize a session key for safe use as a filesystem directory name. | |
| * Colons are replaced with `--`, and other unsafe chars are percent-encoded. | |
| * Sanitize a session key for safe use as a filesystem directory name. | |
| * Colons are replaced with `--` for reversible filesystem-safe encoding. |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/config/sessions/store.ts
Line: 42:43
Comment:
comment says "other unsafe chars are percent-encoded" but implementation only replaces colons
```suggestion
* Sanitize a session key for safe use as a filesystem directory name.
* Colons are replaced with `--` for reversible filesystem-safe encoding.
```
How can I resolve this? If you propose a fix, please make it concise.6cf4dc4 to
bf742ad
Compare
…neck
Add an alternative directory-based session store layout where each session
gets its own directory with a meta.json file, replacing the monolithic
sessions.json that can grow to 80MB+ with 650+ sessions.
Directory layout:
sessions.d/
<sanitized-session-key>/
meta.json # SessionEntry (~200 bytes)
Key changes:
- Add sanitizeSessionKey/desanitizeSessionKey for filesystem-safe names
(colons → double-dash, reversible)
- Add resolveSessionStoreDir() to derive sessions.d/ path from storePath
- loadSessionStore() auto-detects directory vs JSON mode
- saveSessionStoreUnlocked() uses diff-based writes in directory mode
(only writes changed entries, deletes removed ones)
- readSessionUpdatedAt() reads single entry without loading full store
in directory mode (O(1) instead of O(n))
- rotateSessionFile() is a no-op in directory mode (individual files
are tiny, no rotation needed)
- Add migrateSessionStoreToDirectory() for explicit migration from
legacy JSON to directory layout (backs up original file)
- Add comprehensive tests for migration, CRUD, concurrent writes,
key sanitization round-trips
Design decisions:
- storePath remains pointing at sessions.json for backward compatibility
(many callers use path.dirname(storePath) to locate transcript files)
- Directory mode only activates if sessions.d/ exists (after migration)
- Un-migrated stores continue using JSON unchanged
- Migration is explicit via migrateSessionStoreToDirectory(), not automatic
- All existing function signatures preserved — zero caller changes needed
- updateSessionStore takes a pre-mutation snapshot for diff-based writes
Performance improvements in directory mode:
- O(1) single-session reads instead of O(n) full-store parse
- O(changed) writes instead of O(n) full-store serialize
- No file rotation needed (no monolithic file to rotate)
- Per-entry files are ~200 bytes vs 80MB+ monolithic JSON
…c comment - Fix type errors in exec approval handler tests by casting mock respond functions and adding missing client fields (version, platform, mode) - Fix isWebchatConnect noop to return boolean instead of void - Cast health handler test params through unknown for partial test objects - Update sanitizeSessionKey JSDoc to accurately describe implementation (only replaces colons, does not percent-encode other chars)
8f92b46 to
0c1bc32
Compare
|
This pull request has been automatically marked as stale due to inactivity. |
|
Closing due to inactivity. |
Greptile SummaryThis PR replaces the monolithic Issues found:
Confidence Score: 2/5
|
| if (useDirectoryStore) { | ||
| const currentDirMtime = getDirStoreMtimeMs(resolveSessionStoreDir(storePath)); | ||
| if (currentDirMtime === cached.dirMtimeMs) { | ||
| return structuredClone(cached.store); | ||
| } |
There was a problem hiding this comment.
Directory-mtime cache invalidation misses in-place updates
getDirStoreMtimeMs checks the mtime of the sessions.d/ directory itself. On Linux/macOS, a directory's mtime is updated only when its direct children are added or removed — not when a file inside a subdirectory is modified. So when process A calls updateSessionStore and rewrites sessions.d/agent--main--test/meta.json, the sessions.d/ directory mtime is unchanged; process B's cache check (currentDirMtime === cached.dirMtimeMs) will still return true and serve the stale entry for up to 45 seconds.
By contrast, the legacy JSON path is correct: every write changes sessions.json's mtime, so cross-process invalidation works.
For new session directories being created the mtime does change (a new child is added to sessions.d/), but existing-session updates are silently missed.
A safer strategy would be to track the mtime of each session sub-directory (sessions.d/<key>/) rather than the root sessions.d/ directory, or to skip the mtime check entirely for directory stores and rely solely on the TTL.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/config/sessions/store.ts
Line: 395-399
Comment:
**Directory-mtime cache invalidation misses in-place updates**
`getDirStoreMtimeMs` checks the mtime of the `sessions.d/` directory itself. On Linux/macOS, a directory's `mtime` is updated only when its *direct* children are added or removed — **not** when a file inside a subdirectory is modified. So when process A calls `updateSessionStore` and rewrites `sessions.d/agent--main--test/meta.json`, the `sessions.d/` directory mtime is unchanged; process B's cache check (`currentDirMtime === cached.dirMtimeMs`) will still return `true` and serve the stale entry for up to 45 seconds.
By contrast, the legacy JSON path is correct: every write changes `sessions.json`'s mtime, so cross-process invalidation works.
For new session directories being created the mtime does change (a new child is added to `sessions.d/`), but existing-session updates are silently missed.
A safer strategy would be to track the mtime of each session sub-directory (`sessions.d/<key>/`) rather than the root `sessions.d/` directory, or to skip the mtime check entirely for directory stores and rely solely on the TTL.
How can I resolve this? If you propose a fix, please make it concise.| export function desanitizeSessionKey(dirName: string): string { | ||
| return dirName.replace(/--/g, ":"); | ||
| } |
There was a problem hiding this comment.
Key collision when session keys contain
--
The sanitization is not injective: any -- already present in a session key is indistinguishable from one produced by replacing :. This causes two classes of problem:
- Silent collision – keys
a:b:canda--b:cboth map to the directory namea--b--c. The second write silently overwrites the first. - Wrong round-trip –
desanitizeSessionKey(sanitizeSessionKey("a--b:c"))returns"a:b:c"instead of"a--b:c", so loading the store after migration produces the wrong key.
Example:
sanitizeSessionKey("a:b:c") // "a--b--c"
sanitizeSessionKey("a--b:c") // "a--b--c" ← same!
desanitizeSessionKey("a--b--c") // "a:b:c" ← not "a--b:c"
Consider a different separator that cannot appear in real session keys (e.g. a single %3A percent-encoding, or a character like + that colons are never turned into), or alternatively assert/reject keys that contain -- during sanitizeSessionKey.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/config/sessions/store.ts
Line: 53-55
Comment:
**Key collision when session keys contain `--`**
The sanitization is not injective: any `--` already present in a session key is indistinguishable from one produced by replacing `:`. This causes two classes of problem:
1. **Silent collision** – keys `a:b:c` and `a--b:c` both map to the directory name `a--b--c`. The second write silently overwrites the first.
2. **Wrong round-trip** – `desanitizeSessionKey(sanitizeSessionKey("a--b:c"))` returns `"a:b:c"` instead of `"a--b:c"`, so loading the store after migration produces the wrong key.
Example:
```
sanitizeSessionKey("a:b:c") // "a--b--c"
sanitizeSessionKey("a--b:c") // "a--b--c" ← same!
desanitizeSessionKey("a--b--c") // "a:b:c" ← not "a--b:c"
```
Consider a different separator that cannot appear in real session keys (e.g. a single `%3A` percent-encoding, or a character like `+` that colons are never turned into), or alternatively assert/reject keys that contain `--` during `sanitizeSessionKey`.
How can I resolve this? If you propose a fix, please make it concise.| async function migrateJsonToDirectory(storePath: string): Promise<void> { | ||
| const storeDir = resolveSessionStoreDir(storePath); | ||
| let store: Record<string, SessionEntry> = {}; | ||
|
|
||
| try { | ||
| const raw = fs.readFileSync(storePath, "utf-8"); | ||
| if (raw.length === 0) { | ||
| return; | ||
| } | ||
| const parsed = JSON.parse(raw); | ||
| if (parsed && typeof parsed === "object" && !Array.isArray(parsed)) { | ||
| store = parsed as Record<string, SessionEntry>; | ||
| } | ||
| } catch { | ||
| return; // Can't read/parse the file, nothing to migrate | ||
| } | ||
|
|
||
| const keys = Object.keys(store); | ||
| if (keys.length === 0) { | ||
| return; | ||
| } | ||
|
|
||
| log.info("migrating session store from JSON to directory layout", { | ||
| entries: keys.length, | ||
| storePath, | ||
| storeDir, | ||
| }); | ||
|
|
||
| // Create the directory store | ||
| await fs.promises.mkdir(storeDir, { recursive: true }); | ||
|
|
||
| // Write each entry | ||
| for (const [key, entry] of Object.entries(store)) { | ||
| if (!entry) { | ||
| continue; | ||
| } | ||
| await writeSessionEntryToDir(storeDir, key, entry); | ||
| } | ||
|
|
||
| // Backup and remove the old JSON file | ||
| const backupPath = `${storePath}.pre-directory-migration.${Date.now()}`; | ||
| try { | ||
| await fs.promises.rename(storePath, backupPath); | ||
| log.info("backed up legacy sessions.json", { | ||
| backupPath: path.basename(backupPath), | ||
| }); | ||
| } catch { | ||
| // If rename fails, just leave the file. Directory store takes precedence. | ||
| } | ||
| } |
There was a problem hiding this comment.
Partial migration leaves store in a permanently inconsistent state
migrateJsonToDirectory creates sessions.d/ before writing all entries. If any writeSessionEntryToDir call throws (disk full, permission error, etc.), the function exits early — sessions.d/ now exists with only some entries, and the rename of sessions.json to the backup path never runs.
On the next load:
isDirectoryStorereturnstrue(becausesessions.d/was created)- The directory is read — only the successfully-migrated entries are visible
sessions.jsonis still present but never read in directory mode- All sessions that weren't written yet are silently lost
The public migrateSessionStoreToDirectory wrapper's idempotency check (isDirectoryStore → return false) means re-running migration won't recover the missing entries either.
A safer approach is to write all entries first, verify success, and only then create/rename the directory (or use an in-progress marker that is removed only on clean completion).
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/config/sessions/store.ts
Line: 180-229
Comment:
**Partial migration leaves store in a permanently inconsistent state**
`migrateJsonToDirectory` creates `sessions.d/` before writing all entries. If any `writeSessionEntryToDir` call throws (disk full, permission error, etc.), the function exits early — `sessions.d/` now exists with only some entries, and the rename of `sessions.json` to the backup path never runs.
On the next load:
- `isDirectoryStore` returns `true` (because `sessions.d/` was created)
- The directory is read — only the successfully-migrated entries are visible
- `sessions.json` is still present but **never read** in directory mode
- All sessions that weren't written yet are silently lost
The public `migrateSessionStoreToDirectory` wrapper's idempotency check (`isDirectoryStore → return false`) means re-running migration won't recover the missing entries either.
A safer approach is to write all entries first, verify success, and only then create/rename the directory (or use an in-progress marker that is removed only on clean completion).
How can I resolve this? If you propose a fix, please make it concise.
Problem
The session store is a single monolithic JSON file (
sessions.json) that grows to 80MB+ with 650+ sessions. Every read/write operation:JSON.parsealone).tmpfilesThis is O(n) for what should be O(1) operations.
Solution: Directory-per-session layout
Key changes in
src/config/sessions/store.ts:sessions.d/directory exists, uses directory mode. Otherwise falls back to legacy JSON (backward compatible).updateSessionStore()snapshots the store before mutation, then only writes changed entries and deletes removed ones — no more full rewrites.readSessionUpdatedAt()reads a singlemeta.jsoninstead of parsing the entire store.meta.jsonuses temp-file + rename, same as the existing pattern.migrateSessionStoreToDirectory(storePath)splits an existing JSON file into the directory layout, backs up the original file, and is safe to call multiple times.Performance characteristics:
Migration path:
sessions.jsoncontinues working as-ismigrateSessionStoreToDirectory(storePath)to opt insessions.json.pre-directory-migration.<timestamp>New exports:
migrateSessionStoreToDirectory(storePath)— trigger migrationresolveSessionStoreDir(storePath)— get directory pathsanitizeSessionKey(key)/desanitizeSessionKey(dirName)— key ↔ dirname conversionTests
Added
store.directory.test.tswith 395 lines covering:Greptile Summary
Migrates session storage from monolithic JSON to directory-per-session layout, addressing O(n) bottleneck with 80MB+ files. Implementation adds auto-detection, diff-based writes, per-entry I/O, atomic operations, and backward-compatible migration.
Key changes:
sessions.d/) with per-sessionmeta.jsonfiles (~200 bytes each)updateSessionStore()- only changed/removed entries are writtenreadSessionUpdatedAt()now reads single file instead of parsing entire storemigrateSessionStoreToDirectory()with automatic backupTechnical review:
:→--) works for current session key patterns but could collide if keys naturally contained--(not present in codebase)Confidence Score: 4/5
Last reviewed commit: bc69f6b
(2/5) Greptile learns from your feedback when you react with thumbs up/down!