Environment
- Date observed: 2026-02-13
- OpenClaw: 2026.2.12 (running 2026.2.9 locally; config written by 2026.2.12)
- Node.js: 22.22.0
- OS: macOS 26.3 (arm64)
- Channel: Telegram
- State dir: default (~/.openclaw)
Summary
When session history is large and cron writes happen concurrently, /new or /reset can appear to "do nothing". Logs show session-store lock timeout and related write lock contention.
Precondition
- ~/.openclaw/agents/main/sessions/sessions.json grows large
- ~/.openclaw/agents/main/sessions/ contains many transcript files
- There is active session traffic plus cron jobs writing state
Actual Environment Output
$ openclaw --version
2026.2.9
Config was last written by a newer OpenClaw (2026.2.12); current version is 2026.2.9.
$ du -sh ~/.openclaw/agents/main/sessions ~/.openclaw/agents/main/sessions/sessions.json
242M ~/.openclaw/agents/main/sessions
540K ~/.openclaw/agents/main/sessions/sessions.json
$ ls -la ~/.openclaw/agents/main/sessions/ | wc -l
35 files (including dirs)
Key Error Signals (from logs)
2026-02-13T14:39:14.838Z [diagnostic] lane task error: lane=cron durationMs=600046 error="FailoverError: LLM request timed out."
2026-02-13T14:39:14.842Z [diagnostic] lane task error: lane=session:agent:main:cron:48ab4446-9dca-465a-ae04-56b12b7c797d durationMs=600052 error="FailoverError: LLM request timed out."
2026-02-13T14:49:42.759Z [diagnostic] lane task error: lane=cron durationMs=600111 error="FailoverError: LLM request timed out."
2026-02-13T14:56:41.082Z [diagnostic] lane wait exceeded: lane=session:agent:main:telegram:direct:7688058064 waitedMs=24273 queueAhead=0
Suspected Root Cause
- Session store is a single shared JSON file with a global lock.
- Lock acquisition timeout is fixed at 10s (timeoutMs = 10000), which queued writers can exceed under heavy concurrent writes.
- Long sessions + cron writes increase lock hold time and contention probability.
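To make the contention argument concrete, here is a rough back-of-the-envelope model, not the actual store logic. It assumes writers queue in FIFO order on the single lock and each holds it for the same duration. The function names and the example numbers are hypothetical; only the 10000 ms timeout comes from the report above.

```typescript
// Timeout value taken from the report (timeoutMs = 10000).
const LOCK_TIMEOUT_MS = 10000;

// Hypothetical model: with N writers queued on one lock and each holding it
// for holdMs, the last writer waits for the N - 1 writers ahead of it.
function worstCaseWaitMs(queuedWriters: number, holdMs: number): number {
  return Math.max(0, queuedWriters - 1) * holdMs;
}

// True when the last queued writer would exceed the fixed lock timeout.
function wouldTimeOut(queuedWriters: number, holdMs: number): boolean {
  return worstCaseWaitMs(queuedWriters, holdMs) > LOCK_TIMEOUT_MS;
}
```

Under this model, five writers that each hold the lock for 3 s leave the last one waiting 12 s, which is past the timeout, while three such writers (6 s) would still get through. Real hold times depend on store size and disk speed and have not been measured here.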
Relevant Code References
- src/config/sessions/paths.ts:33 (default session store path = sessions.json)
- src/config/sessions/store.ts:208 (whole-store JSON stringify/write)
- src/config/sessions/store.ts:302 (lock file path)
- src/config/sessions/store.ts:339 (timeout acquiring lock error)
Impact
- User-facing reliability issue in production-like usage.
- Reset/new commands become non-deterministic under load.
- Perceived "bot freeze" in Telegram/interactive channels.
Temporary Workaround
- Stop gateway.
- Archive old session transcripts out of hot path.
- Reset sessions.json to a small store.
- Restart gateway.
- Reduce cron frequency / concurrent writes.
Proposed Fix Direction
- Move from single global session index write to sharded/per-session metadata storage or append-only journal.
- Add lock retry/backoff and better degradation for reset path.
- Make /new and /reset resilient when lock acquisition fails (queue, retry, or surface a user-visible reason).
- Add load/concurrency regression tests for session-store contention.
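As one possible shape for the retry/backoff item above, here is a minimal sketch. It does not use the existing store API: `withLockRetry`, `backoffDelayMs`, the `isRetriable` predicate, and the default attempt and delay values are all hypothetical, and how OpenClaw actually signals a lock timeout would need to be checked against store.ts.

```typescript
// Exponential backoff capped at capMs: base, 2*base, 4*base, ...
function backoffDelayMs(attempt: number, baseMs: number, capMs: number): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs op, retrying only errors that isRetriable accepts (for example a
// lock-acquisition timeout). The last error is rethrown so the caller can
// report a user-visible reason instead of failing silently.
async function withLockRetry<T>(
  op: () => Promise<T>,
  isRetriable: (err: unknown) => boolean,
  maxAttempts: number = 4,
  baseMs: number = 250,
  capMs: number = 4000,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await op();
    } catch (err) {
      if (!isRetriable(err) || attempt + 1 >= maxAttempts) throw err;
      // Jitter (50% to 100% of the delay) spreads out writers that failed
      // at the same moment so they do not collide again in lockstep.
      const delay = backoffDelayMs(attempt, baseMs, capMs);
      await sleep(delay / 2 + Math.random() * (delay / 2));
    }
  }
}
```

A /new or /reset handler could wrap its store update in `withLockRetry` and, if the final attempt still fails, reply with the reason rather than appearing to do nothing. Retrying only reduces the visible symptom; it does not remove the underlying single-file contention.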