
fix(memory): readonly sync recovery #25799

Merged
Takhoffman merged 3 commits into openclaw:main from rodrigouroz:fix/memory-readonly-sync-recovery
Feb 27, 2026
@rodrigouroz commented Feb 24, 2026

Summary

Describe the problem and fix in 2–5 bullets:

  • Problem: memory sync could get stuck on SQLite's "attempt to write a readonly database" error and keep failing until the gateway was restarted.
  • Why it matters: session-delta/search memory indexing degrades in long-running gateway sessions.
  • What changed: added in-flight dedup for concurrent MemoryIndexManager.get() creation and added one-shot readonly recovery in sync() (reopen sqlite handle + retry once).
  • What did NOT change (scope boundary): no schema changes, no config/env changes, no provider behavior/API contract changes.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #
  • Related #

User-visible / Behavior Changes

  • Gateway can self-recover memory sync when a stale SQLite handle starts failing with readonly errors after an atomic reindex swap.
  • Concurrent first-use manager requests for the same agent/config now share one initialization path.

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: Node + pnpm
  • Model/provider: mocked embeddings in unit tests
  • Integration/channel (if any): N/A
  • Relevant config (redacted): default sqlite memory index path

Steps

  1. Trigger concurrent manager creation calls for the same cache key.
  2. Force one sync failure with readonly sqlite error.
  3. Run sync again.

Expected

  • Concurrent creation returns a single manager instance.
  • Readonly sync error triggers reopen and a single retry, then succeeds.
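
The expected retry behavior can be sketched roughly as follows. This is a minimal illustration, not the code in src/memory/manager.ts: the `SyncTarget` interface, `looksReadonly`, and `syncWithReadonlyRecovery` are hypothetical names, and matching on the error message alone is an assumption made for brevity.

```typescript
// Hypothetical shapes for illustration; not the actual MemoryIndexManager API.
interface SyncTarget {
  runSync(): Promise<void>; // performs one sync pass
  reopenDatabase(): void; // drops the stale handle and opens a fresh one
}

// Assumption: a readonly failure is recognizable from its message text.
function looksReadonly(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return /readonly database/i.test(message);
}

async function syncWithReadonlyRecovery(target: SyncTarget): Promise<void> {
  try {
    await target.runSync();
  } catch (err) {
    // Anything other than a readonly-class error is surfaced unchanged.
    if (!looksReadonly(err)) throw err;
    target.reopenDatabase();
    // Single retry: if this attempt also fails, the error propagates.
    await target.runSync();
  }
}
```

Because the retry sits inside the `catch` block rather than a loop, a second failure is never retried, which matches the "only once" expectation above.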

Actual

  • Verified with new tests below.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Passing commands:

  • pnpm vitest run src/memory/manager.get-concurrency.test.ts src/memory/manager.readonly-recovery.test.ts
  • pnpm vitest run src/memory/manager.atomic-reindex.test.ts src/memory/manager.sync-errors-do-not-crash.test.ts
  • pnpm check

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: singleton dedup under concurrent get(), readonly self-heal retry path, non-readonly errors do not retry.
  • Edge cases checked: retry happens only on readonly-class errors and only once.
  • What you did not verify: full live channel E2E run on this branch.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps:

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: revert the commit "Memory: dedupe manager creation and recover readonly sync".
  • Files/config to restore: src/memory/manager.ts.
  • Known bad symptoms reviewers should watch for: unexpected retry behavior on non-readonly sync failures.

Risks and Mitigations

  • Risk: readonly recovery could hide persistent underlying filesystem issues.
    • Mitigation: retry is one-shot and only for readonly-class errors; subsequent failures still surface.
  • Risk: in-flight dedup map could leak pending entries on errors.
    • Mitigation: pending key is always cleaned in finally.
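
For readers unfamiliar with the pattern, the in-flight dedup and its `finally` cleanup might look something like the sketch below. `InFlightCache`, `ready`, `pending`, and `pendingCount` are hypothetical names chosen for illustration; the real logic lives in `MemoryIndexManager.get()` and may be structured differently.

```typescript
// Hypothetical helper for illustration; not the code in src/memory/manager.ts.
class InFlightCache<T> {
  private readonly ready = new Map<string, T>();
  private readonly pending = new Map<string, Promise<T>>();

  async get(key: string, create: () => Promise<T>): Promise<T> {
    const cached = this.ready.get(key);
    if (cached !== undefined) return cached;

    // Join an initialization that is already running for this key.
    const inFlight = this.pending.get(key);
    if (inFlight) return inFlight;

    const promise = (async () => {
      try {
        // Yield once so the pending entry below is registered before
        // create() runs, even if create() throws synchronously.
        await Promise.resolve();
        const value = await create();
        this.ready.set(key, value);
        return value;
      } finally {
        // Runs on success and on failure, so a rejected initialization
        // never leaves a stale pending entry behind.
        this.pending.delete(key);
      }
    })();
    this.pending.set(key, promise);
    return promise;
  }

  // Exposed only so the cleanup can be observed in tests.
  pendingCount(): number {
    return this.pending.size;
  }
}
```

Two concurrent `get("k", create)` calls on the same instance invoke `create` once and resolve to the same object; after a failed initialization the next call starts fresh instead of re-awaiting the rejected promise.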

Greptile Summary

Adds two targeted fixes to memory sync reliability: (1) deduplicates concurrent MemoryIndexManager.get() calls to prevent multiple initialization paths racing to create the same manager, and (2) adds one-shot readonly recovery in sync() that reopens stale SQLite handles and retries once after atomic reindex swaps.

Both changes are well-scoped, backward-compatible, and address real production failure modes (concurrent gateway init and readonly handle errors after atomic reindex). The implementation correctly:

  • Uses a pending promise map that cleans up in finally blocks to prevent leaks
  • Limits retry to readonly-class errors only and retries exactly once
  • Resets all vector/FTS state during recovery to ensure clean reinitialization
  • Adds comprehensive test coverage for both dedup and recovery paths

The changelog entry accurately describes the user-visible impact and matches the commit history.

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The changes are narrowly scoped to fix specific failure modes with clear mitigations: (1) concurrent manager creation dedup uses standard async pattern with proper cleanup, (2) readonly recovery is limited to one retry on readonly-class errors only. Both fixes have comprehensive test coverage, maintain backward compatibility, and follow established patterns in the codebase. No schema changes, no config changes, no external API changes.
  • No files require special attention

Last reviewed commit: 2d2d1b9

@rodrigouroz force-pushed the fix/memory-readonly-sync-recovery branch 2 times, most recently from 2d2d1b9 to 9c3e5f8 on February 26, 2026 17:28
@Takhoffman force-pushed the fix/memory-readonly-sync-recovery branch from 00ffcbf to 934b24e on February 27, 2026 18:26
@Takhoffman merged commit 1867611 into openclaw:main on Feb 27, 2026
9 checks passed
@Takhoffman

PR #25799 - fix(memory): readonly sync recovery (#25799)

Merged via squash.

  • Merge commit: 1867611
  • Verified:
    • pnpm build
    • pnpm check
    • pnpm test:macmini (fails in this environment at src/daemon/launchd.integration.test.ts beforeAll hook timeout; merged with explicit Tak override)
  • Changes made:
    • CHANGELOG.md
    • src/memory/manager.ts
    • src/memory/manager.readonly-recovery.test.ts
    • src/memory/manager.get-concurrency.test.ts
  • Why these changes were made:
    • Preserve the original memory readonly-recovery fix while adding mitigation hardening before landing: broaden readonly error-shape detection and add readonly-recovery telemetry (attempts/successes/failures/lastError), with targeted tests for code-based readonly errors.
  • Changelog: CHANGELOG.md updated=true required=true opt_out=false
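
As a rough illustration of what broadened error-shape detection plus recovery telemetry could look like, consider the sketch below. Everything here is an assumption: `isReadonlyError`, `recoverOnce`, and `ReadonlyRecoveryStats` are hypothetical names, the counters merely mirror the fields listed above (attempts/successes/failures/lastError), and whether the SQLite driver in use exposes a string `code` such as `SQLITE_READONLY` is not confirmed by this thread.

```typescript
// Hypothetical names; the real detection rules and telemetry shape
// are in src/memory/manager.ts and may differ.
interface ReadonlyRecoveryStats {
  attempts: number;
  successes: number;
  failures: number;
  lastError?: string;
}

function isReadonlyError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) {
    return typeof err === "string" && /readonly database/i.test(err);
  }
  const { code, message } = err as { code?: unknown; message?: unknown };
  // Code-based shape (assumed): "SQLITE_READONLY" or an extended code
  // such as "SQLITE_READONLY_DBMOVED".
  if (typeof code === "string" && code.startsWith("SQLITE_READONLY")) return true;
  // Message-based shape: SQLite's "attempt to write a readonly database".
  return typeof message === "string" && /readonly database/i.test(message);
}

async function recoverOnce(
  stats: ReadonlyRecoveryStats,
  reopen: () => void,
  retry: () => Promise<void>,
): Promise<void> {
  stats.attempts += 1;
  try {
    reopen();
    await retry();
    stats.successes += 1;
  } catch (err) {
    stats.failures += 1;
    stats.lastError = err instanceof Error ? err.message : String(err);
    throw err; // the failure still surfaces to the caller
  }
}
```

Checking both the code and the message keeps detection working when an error arrives wrapped or stripped of one of the two fields, which is presumably the motivation for the "code-based readonly errors" tests mentioned above.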

Thanks @rodrigouroz!

execute008 pushed a commit to execute008/openclaw that referenced this pull request Feb 27, 2026


Verified:
- pnpm build
- pnpm check
- pnpm test:macmini (fails in this environment at src/daemon/launchd.integration.test.ts beforeAll hook timeout; merged with Tak override)

Co-authored-by: rodrigouroz <384037+rodrigouroz@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
velvet-shark pushed a commit to lailoo/openclaw that referenced this pull request Feb 27, 2026


r4jiv007 pushed a commit to r4jiv007/openclaw that referenced this pull request Feb 28, 2026


xiexikang pushed a commit to cclawd007/cclawd that referenced this pull request Feb 28, 2026


mylukin pushed a commit to mylukin/openclaw that referenced this pull request Feb 28, 2026


wanjizheng pushed a commit to wanjizheng/openclaw that referenced this pull request Feb 28, 2026


wanjizheng pushed a commit to wanjizheng/openclaw that referenced this pull request Feb 28, 2026


(cherry picked from commit f446d90)
wanjizheng pushed a commit to wanjizheng/openclaw that referenced this pull request Feb 28, 2026


(cherry picked from commit f446d90)
wanjizheng pushed a commit to wanjizheng/openclaw that referenced this pull request Feb 28, 2026


(cherry picked from commit f446d90)
vincentkoc pushed a commit to Sid-Qin/openclaw that referenced this pull request Feb 28, 2026


vincentkoc pushed a commit to rylena/rylen-openclaw that referenced this pull request Feb 28, 2026


newtontech pushed a commit to newtontech/openclaw-fork that referenced this pull request Feb 28, 2026


wanjizheng pushed a commit to wanjizheng/openclaw that referenced this pull request Mar 1, 2026


wanjizheng pushed a commit to wanjizheng/openclaw that referenced this pull request Mar 1, 2026


steipete pushed a commit to Sid-Qin/openclaw that referenced this pull request Mar 2, 2026


safzanpirani pushed a commit to safzanpirani/clawdbot that referenced this pull request Mar 2, 2026


steipete pushed a commit to Sid-Qin/openclaw that referenced this pull request Mar 2, 2026


venjiang pushed a commit to venjiang/openclaw that referenced this pull request Mar 2, 2026


robertchang-ga pushed a commit to robertchang-ga/openclaw that referenced this pull request Mar 2, 2026


dorgonman pushed a commit to kanohorizonia/openclaw that referenced this pull request Mar 3, 2026


sachinkundu pushed a commit to sachinkundu/openclaw that referenced this pull request Mar 6, 2026


zooqueen pushed a commit to hanzoai/bot that referenced this pull request Mar 6, 2026

