feat(sessions): add directory-backed session store #1205
Open
BingqingLyu wants to merge 10 commits into main
Address PR openclaw#51946 follow-up review feedback in the directory-backed session store. Clamp filesystem exposure from oversized session keys by switching oversized keys to fixed-length hashed filenames, store the original key inside each entry payload, reject unsafe legacy migration inputs, and harden load/write paths against blocked prototype keys plus writable directory layouts. Also remove the unused directory load versionToken surface, document the lock invariant on direct directory writes, preserve empty legacy-store saves instead of dropping them, and route directory fast-path writes through full maintenance enforcement when session maintenance is set to enforce. Regeneration-Prompt: | Follow up on OpenClaw PR openclaw#51946 review feedback for the new directory-backed session store. Fix the security issues from Aisle by making long session keys use fixed-length hashed filenames while preserving the original key inside the entry file, reject unsafe legacy migration inputs like symlinks or oversized files, and harden directory loading against __proto__/prototype style keys and writable directory layouts. Also clean up the misleading unused versionToken return, document that direct directory write helpers require the session-store lock, preserve empty legacy-store writes instead of returning early, and keep maintenance enforcement active on directory fast paths. Validate with focused session-store tests plus pnpm check and pnpm build.
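The filename clamping and prototype-key hardening described in this commit can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `sessionKeyToFilename`, `isBlockedKey`, `collectEntries`, the `{ key, entry }` payload shape, the `@` prefix, and the 128-character threshold are all hypothetical names and values chosen for the example.

```typescript
import { createHash } from "node:crypto";

// Hypothetical threshold: the PR does not state the real limit.
const MAX_ENCODED_KEY_LENGTH = 128;

// Keys that must never be used as property names on a plain object.
const BLOCKED_KEYS = new Set(["__proto__", "prototype", "constructor"]);

function isBlockedKey(key: string): boolean {
  return BLOCKED_KEYS.has(key);
}

// Map a session key to a filename whose length is bounded regardless of key size.
function sessionKeyToFilename(key: string): string {
  let encoded: string | null = null;
  try {
    encoded = encodeURIComponent(key); // escapes "/" and ":" among others
  } catch {
    encoded = null; // malformed UTF-16 (lone surrogate): fall through to hashing
  }
  if (encoded !== null && encoded.length > 0 && encoded.length <= MAX_ENCODED_KEY_LENGTH) {
    return `${encoded}.json`;
  }
  // "@" is always escaped by encodeURIComponent, so hashed names cannot
  // collide with encoded ones.
  const digest = createHash("sha256").update(key, "utf8").digest("hex");
  return `@${digest}.json`;
}

// The payload keeps the original key, since a hashed filename cannot be reversed.
type EntryPayload<T> = { key: string; entry: T };

function collectEntries<T>(payloads: EntryPayload<T>[]): Map<string, T> {
  const store = new Map<string, T>(); // Map keys never touch the prototype chain
  for (const payload of payloads) {
    if (isBlockedKey(payload.key)) continue; // drop rather than assign
    store.set(payload.key, payload.entry);
  }
  return store;
}
```

Storing the key inside the payload is what lets a loader rebuild the logical store from hashed filenames; the filename itself only needs to be stable and bounded.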
Address the two remaining PR openclaw#51946 regressions after rebasing onto current origin/main. Keep resolveAllAgentSessionStoreTargets* returning stores after sessions.json is promoted to sessions.d, and make subagent depth checks read migrated directory-backed stores while preserving legacy JSON5 support. Add focused regressions for both cases. Regeneration-Prompt: | Update the rebased PR openclaw#51946 session-store branch so directory-backed migration does not break existing readers. Keep combined-store target discovery working after sessions.json is promoted away by treating an active sibling sessions.d store as a valid authoritative store, and make subagent-depth read migrated directory-backed stores instead of only the legacy JSON file while still supporting JSON5 in unmigrated stores. Add focused tests covering post-migration target discovery and preserved subagent depth, then verify with the targeted test files plus pnpm build and pnpm check.
Prevent unchanged legacy session-store updates from migrating into directory mode, so the existing no-write contract still holds for no-op saves. Also isolate the Discord exec-approval test store from stale directory-backed siblings on CI runners. Regeneration-Prompt: | Investigate CI failures introduced after adding directory-backed session-store discovery and subagent-depth support. Preserve the existing behavior that an unchanged legacy sessions.json update does not write anything, even if migration support exists. Trace the failing Discord exec-approvals test to shared temp-path state: the test rewrites sessions.json directly, so stale sessions.d state from another run or earlier migration must not remain authoritative. Keep the product fix minimal in the session-store save path, and harden the test fixture so it uses isolated temp storage and removes any directory-backed sibling before each write.
Make the session-store summary helper read through the directory-aware session reader so WhatsApp heartbeat recipient inference still sees migrated sessions.d data. Add a regression test that exercises a migrated store summary directly. Regeneration-Prompt: | A new Codex review comment flagged that WhatsApp heartbeat recipient inference still depended on a helper that only parsed legacy sessions.json. After this PR migrates session stores into sessions.d and renames the JSON file to a backup, existing users would lose session-derived heartbeat recipients unless they passed --to or allowFrom. Fix the summary helper itself rather than the caller so all shallow session-summary readers stay directory-aware, and add a focused regression test that migrates a legacy store and verifies the summary still exposes lastChannel, lastTo, and updatedAt from the directory backend.
Add an explicit rollback path for directory-backed session stores so local testing can restore legacy sessions.json snapshots without changing the one-way startup migration behavior in the PR. Keep the rollback logic in the session-store layer, add a focused regression test, and provide a local script that targets the same store set as startup migration. Regeneration-Prompt: | Keep the PR's automatic migration one-way on gateway startup, but add a manual way for a maintainer to roll a local instance back from sessions.d to legacy sessions.json. The rollback should read the authoritative directory-backed store, write a canonical sessions.json snapshot under the existing session-store lock, and then rename the sessions.d directory to a timestamped backup so the downgraded code can see the legacy file again. Add a targeted regression test for that rollback, and provide a local script that can scan the same session-store targets as startup migration or accept explicit --store paths for dry-run and restore workflows.
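The rollback sequence described above (read the directory store, write a legacy snapshot, then rename the directory to a timestamped backup) might look roughly like the sketch below. It assumes one JSON file per entry with a hypothetical `{ key, entry }` shape and a millisecond timestamp suffix, and it omits the session-store lock that the real rollback holds; `rollbackSessionStore` is an illustrative name, not the function in `scripts/rollback-session-stores.ts`.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Assumed on-disk shape: one JSON file per entry holding { key, entry }.
type EntryFile = { key: string; entry: unknown };

// Returns the backup directory path, or null when there is nothing to roll back.
// The caller is assumed to hold the session-store lock already.
function rollbackSessionStore(storeDir: string, now: Date = new Date()): string | null {
  const dirStore = path.join(storeDir, "sessions.d");
  if (!fs.existsSync(dirStore)) return null;

  // Null-prototype object so unusual keys cannot reach Object.prototype.
  const snapshot: Record<string, unknown> = Object.create(null);
  for (const name of fs.readdirSync(dirStore).sort()) {
    if (!name.endsWith(".json")) continue;
    const raw = fs.readFileSync(path.join(dirStore, name), "utf8");
    const parsed = JSON.parse(raw) as Partial<EntryFile> | null;
    if (!parsed || typeof parsed.key !== "string") continue; // not an entry file
    snapshot[parsed.key] = parsed.entry;
  }

  // Write the legacy file via a temp file plus rename so readers never see a partial file.
  const legacyPath = path.join(storeDir, "sessions.json");
  const tmpPath = `${legacyPath}.tmp`;
  fs.writeFileSync(tmpPath, JSON.stringify(snapshot, null, 2));
  fs.renameSync(tmpPath, legacyPath);

  // Move the directory store aside last, so it stays authoritative until the snapshot exists.
  const backupPath = `${dirStore}.bak.${now.getTime()}`;
  fs.renameSync(dirStore, backupPath);
  return backupPath;
}
```

Renaming the directory only after the snapshot is in place means an interruption midway leaves the directory store intact rather than leaving no store at all.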
Use session-store loaders in tests that now exercise directory-backed stores, and harden the Discord /think autocomplete fixture to clear sibling sessions.d state before rewriting sessions.json. Regeneration-Prompt: | The rebased PR started failing CI in three test-only ways after the session-store migration work. Two tests were still reading sessions.json directly after code paths that now write through the session-store layer and may promote to sessions.d, so update those assertions to verify the logical store via loadSessionStore(...). The Discord native /think autocomplete test also needed the same temp-store hardening pattern used in exec-approvals: isolate the temp directory and remove any sibling sessions.d state before writing the legacy fixture, because CI can otherwise observe stale directory-backed state and fall back to the default model context.
Clarify that session transcripts remain under the agent sessions directory while the metadata store may now be either legacy sessions.json or migrated sessions.d. Also note the backup file pattern left behind after migration. Regeneration-Prompt: | Update the session-logs skill after the session store migration work. Preserve the existing guidance for finding JSONL transcripts under ~/.openclaw/agents/<agentId>/sessions/, but stop implying that sessions.json is always authoritative. Document that migrated agents may use sessions.d instead, and mention the sessions.json.bak.<timestamp> backup artifact so operators know what they are seeing on disk.
Add a read-only migration inspection step for legacy session stores, switch directory migration to return structured outcomes, and move gateway startup onto an aggregated summary log. Empty or missing legacy stores now stay info-level, while invalid stores and execution failures are surfaced in the summary with their paths. Include focused tests for detection and startup reporting. Regeneration-Prompt: | Follow up on the sessions.json to sessions.d migration in PR openclaw#51946 without adding a feature flag. Keep the one-way migration behavior, but harden operator visibility based on patterns used in older state migrations. Add a read-only detection layer for legacy stores, return structured migration outcomes instead of a bare boolean, and have gateway startup log one compact summary with counts for migrated, already-directory, skipped-empty, skipped-invalid, missing, and failed stores. Empty or unused legacy stores should no longer emit scary warning noise by themselves. Preserve the existing crash-safe migration behavior and validate the new detection/result contract with focused tests plus build and check.
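As a rough illustration of "structured outcomes plus one aggregated summary", the sketch below models the six categories named in this commit as a status union and folds a list of outcomes into a single log line. The type and function names, the message format, and the choice of `warn` for invalid or failed stores are assumptions for the example, not the PR's actual contract.

```typescript
// Status names follow the categories listed in the commit message.
type MigrationStatus =
  | "migrated"
  | "already-directory"
  | "skipped-empty"
  | "skipped-invalid"
  | "missing"
  | "failed";

// Hypothetical outcome shape: one record per inspected store.
type MigrationOutcome = { storePath: string; status: MigrationStatus };

type MigrationSummary = { level: "info" | "warn"; message: string };

function summarizeMigrations(outcomes: MigrationOutcome[]): MigrationSummary {
  const counts: Record<MigrationStatus, number> = {
    "migrated": 0,
    "already-directory": 0,
    "skipped-empty": 0,
    "skipped-invalid": 0,
    "missing": 0,
    "failed": 0,
  };
  const problems: string[] = [];
  for (const outcome of outcomes) {
    counts[outcome.status] += 1;
    // Only invalid stores and execution failures are surfaced with their paths.
    if (outcome.status === "skipped-invalid" || outcome.status === "failed") {
      problems.push(`${outcome.status}: ${outcome.storePath}`);
    }
  }
  const parts = (Object.keys(counts) as MigrationStatus[]).map(
    (status) => `${status}=${counts[status]}`,
  );
  const suffix = problems.length > 0 ? ` [${problems.join(", ")}]` : "";
  return {
    // Empty or missing stores alone never raise the level above info.
    level: problems.length > 0 ? "warn" : "info",
    message: `session store migration: ${parts.join(" ")}${suffix}`,
  };
}
```

Returning a status per store instead of a bare boolean is what makes the single startup line possible: the caller can count categories without re-inspecting the filesystem.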
Remove the accidental session migration detector re-export, update the stale plugin SDK baseline metadata, and rename the duplicate Anthropic config heading so the deterministic CI failures on this branch clear. Regeneration-Prompt: | The branch picked up new CI failures after the migration-hardening work. Investigate which failures are actually caused by this branch versus inherited base instability. Preserve the migration hardening behavior and tests. Remove any accidental public API surface expansion from the session-store changes instead of broadening exports. If the plugin SDK baseline is stale relative to the current source tree, refresh only the minimal generated baseline entries needed to match the existing public surface. Also fix deterministic docs lint on this branch without pulling in unrelated docs churn.
Summary
- sessions.json store forces whole-store reads/writes under a shared lock, which causes contention, scaling pain, and memory pressure.
- sessions.d state, direct per-entry hot paths, explicit version-stamp cache coherence, crash-safe migration, a local rollback script, and aggregated startup migration reporting.
- updateSessionStore mutators still serialize under the existing store lock; this does not claim to eliminate every lock/contention source.
Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
User-visible / Behavior Changes
- sessions.json into directory-backed storage automatically.
- scripts/rollback-session-stores.ts.
Security Impact (required)
- No)
- No)
- No)
- No)
- No)
- Yes, explain risk + mitigation:
Repro + Verification
Environment
Steps
- pnpm build.
- pnpm check.
Expected
Actual
- pnpm build passed.
- pnpm check passed.
Evidence
Attach at least one:
Human Verification (required)
What you personally verified (not just CI), and how:
- readSessionUpdatedAt fast path
- sessions.d back to legacy sessions.json
Review Conversations
Compatibility / Migration
- Yes, with an explicit local rollback path)
- No)
- Yes)
- node --import tsx scripts/rollback-session-stores.ts.
Failure Recovery (if this breaks)
- node --import tsx scripts/rollback-session-stores.ts if the instance has already migrated to sessions.d, then restart the gateway.
- sessions.json and backs up the directory store as sessions.d.bak.<timestamp>.
- readSessionUpdatedAt regressions
Risks and Mitigations
- state.json replaces TTL-only invalidation.
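To illustrate the difference between version-stamp coherence and TTL-only invalidation, here is a minimal sketch of a cache that reloads only when a stamp changes. The `StampedCache` class and its callbacks are hypothetical; the PR does not show the real contents of state.json, so the stamp is modeled here as an opaque string supplied by the caller.

```typescript
// Hypothetical cache: reloads only when the version stamp changes,
// instead of expiring entries after a fixed time window.
class StampedCache<T> {
  private cached: { stamp: string; value: T } | null = null;
  private readonly readStamp: () => string;
  private readonly loadValue: () => T;

  constructor(readStamp: () => string, loadValue: () => T) {
    this.readStamp = readStamp;
    this.loadValue = loadValue;
  }

  get(): T {
    // Read the stamp BEFORE loading: if a write lands in between, the cached
    // value is paired with an older stamp and the next get() simply reloads.
    const stamp = this.readStamp();
    if (this.cached !== null && this.cached.stamp === stamp) {
      return this.cached.value;
    }
    const value = this.loadValue();
    this.cached = { stamp, value };
    return value;
  }
}
```

With a TTL alone, a reader can serve stale data until the timer expires and can also reload data that never changed; comparing a stamp that writers bump on every change avoids both cases, at the cost of one small read per lookup.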