refactor(memory-wiki): store import runs in sqlite by steipete · Pull Request #91108 · openclaw/openclaw

steipete · 2026-06-07T07:28:47Z

Summary

Move Memory Wiki ChatGPT import-run metadata from .openclaw-wiki/import-runs/*.json into SQLite plugin state.
Keep rollback snapshot markdown files under .openclaw-wiki/import-runs/<runId>/snapshots/ as explicit vault artifacts.
Add doctor migration for legacy import-run JSON records, archiving migrated JSON files while preserving snapshot directories.
Split run metadata and page paths into bounded plugin-state rows, with pre-mutation capacity checks for live imports.
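As a rough illustration of the last point, splitting one import-run record into a meta row plus bounded path rows might look like the sketch below. The type names, key layout, and the `PATHS_PER_ROW` bound are all hypothetical and are not taken from import-runs-state.ts.

```typescript
// Hypothetical shapes; the real record and row types may differ.
type ImportRunRecord = {
  runId: string;
  createdPaths: string[];
  updatedPaths: string[];
};

type StateRow = { key: string; value: string };

// Assumed bound on how many paths a single row may hold (illustrative only).
const PATHS_PER_ROW = 2;

// Split a list into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

function splitImportRunRecord(record: ImportRunRecord): StateRow[] {
  const createdChunks = chunk(record.createdPaths, PATHS_PER_ROW);
  const updatedChunks = chunk(record.updatedPaths, PATHS_PER_ROW);
  // The meta row stores how many path rows to expect, so a reader
  // could tell a partial write apart from a complete one.
  const meta: StateRow = {
    key: `${record.runId}:meta`,
    value: JSON.stringify({
      runId: record.runId,
      createdRows: createdChunks.length,
      updatedRows: updatedChunks.length,
    }),
  };
  const createdRows = createdChunks.map((paths, i) => ({
    key: `${record.runId}:created:${i}`,
    value: JSON.stringify(paths),
  }));
  const updatedRows = updatedChunks.map((paths, i) => ({
    key: `${record.runId}:updated:${i}`,
    value: JSON.stringify(paths),
  }));
  return [meta, ...createdRows, ...updatedRows];
}

const rows = splitImportRunRecord({
  runId: "run-1",
  createdPaths: ["a.md", "b.md", "c.md"],
  updatedPaths: ["d.md"],
});
console.log(rows.map((row) => row.key));
// prints [ 'run-1:meta', 'run-1:created:0', 'run-1:created:1', 'run-1:updated:0' ]
```

Keeping each row bounded means the number of rows a run will need can be projected before anything is written, which is what makes a pre-mutation capacity check possible.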

Verification

git diff --check
node scripts/run-oxlint.mjs --tsconfig config/tsconfig/oxlint.extensions.json extensions/memory-wiki/src/import-runs-state.ts extensions/memory-wiki/src/chatgpt-import.ts extensions/memory-wiki/src/import-runs.ts extensions/memory-wiki/index.ts extensions/memory-wiki/doctor-contract-api.ts extensions/memory-wiki/doctor-contract-api.test.ts extensions/memory-wiki/src/cli.test.ts extensions/memory-wiki/src/import-runs.test.ts
pnpm tsgo:extensions
node scripts/run-vitest.mjs extensions/memory-wiki/doctor-contract-api.test.ts extensions/memory-wiki/src/cli.test.ts extensions/memory-wiki/src/import-runs.test.ts extensions/memory-wiki/src/gateway.test.ts extensions/memory-wiki/index.test.ts
.agents/skills/autoreview/scripts/autoreview --mode local => clean, no accepted/actionable findings.

clawsweeper · 2026-06-07T07:30:38Z

Codex review: needs real behavior proof before merge. Reviewed June 7, 2026, 3:46 AM ET / 07:46 UTC.

Summary
This PR moves Memory Wiki ChatGPT import-run metadata from vault JSON files into SQLite plugin state, adds a doctor migration for legacy JSON records, and keeps rollback snapshots in the vault.

PR surface: Source +581, Tests +161. Total +742 across 8 files.

Reproducibility: yes, from source inspection: create a legacy JSON record, leave an incomplete SQLite meta row for the same run id, then rerun doctor migration. The migration filters the legacy record out by run id and archives the JSON, so the complete path metadata is lost from the canonical state.

Review metrics: 1 noteworthy metric.

State Surface Migration: 1 runtime store moved, 1 doctor migration added. Import-run rollback metadata moves from vault JSON files to a bounded SQLite plugin-state namespace, so retry and upgrade behavior matter before merge.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🧂 unranked krab
Patch quality: 🦐 gold shrimp
Result: blocked until real behavior proof is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

[P1] Fix the doctor migration retry path so incomplete existing state is completed from legacy JSON before archiving.
[P1] Add redacted real behavior proof for fresh import, doctor migration of a legacy vault, and rollback on the PR build.

Proof guidance:

[P1] Needs real behavior proof before merge: The PR body lists tests and lint only; it needs redacted terminal/live output or an artifact showing fresh import, doctor migration of legacy JSON, and rollback after this patch. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Risk before merge

[P1] Existing vaults with legacy .openclaw-wiki/import-runs/*.json records rely on the new doctor migration before SQLite-only listing and rollback can see old runs.
[P1] If doctor migration is interrupted after a meta row is written, the current retry path can archive the complete legacy JSON without restoring missing path rows.
[P1] Large ChatGPT imports now fail before mutation when projected import-run rows exceed the 20,000-row SQLite namespace cap; that is safer than partial mutation but still compatibility-visible.
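The fail-before-mutation behavior in the last risk can be sketched as follows. Only the 20,000-row cap comes from the review; the function names and the way rows are counted are assumptions made for illustration.

```typescript
// Cap stated in the review; everything else here is a hypothetical sketch.
const NAMESPACE_ROW_CAP = 20_000;

// Throws when the projected rows would not fit in the namespace.
function assertImportRunCapacity(
  existingRows: number,
  projectedRows: number,
  cap: number = NAMESPACE_ROW_CAP,
): void {
  if (existingRows + projectedRows > cap) {
    throw new Error(
      `import-run state would exceed ${cap} rows ` +
        `(${existingRows} existing + ${projectedRows} projected)`,
    );
  }
}

// Run the capacity check first, then the page mutation.
function runGuardedImport(
  existingRows: number,
  projectedRows: number,
  mutate: () => void,
): void {
  assertImportRunCapacity(existingRows, projectedRows);
  mutate();
}

let mutated = false;
try {
  // 19,990 existing + 25 projected = 20,015, which is over the cap.
  runGuardedImport(19_990, 25, () => {
    mutated = true;
  });
} catch (error) {
  console.log((error as Error).message);
}
console.log(mutated); // prints false: no page was touched
```

Because the check runs before `mutate`, an oversized import surfaces as an error with the vault unchanged rather than as a partially applied import.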

Maintainer options:

Repair Retry Safety And Prove Upgrade (recommended)
Fix the doctor migration so incomplete existing SQLite records are replaced or completed from legacy JSON before archiving, then add redacted real output for migration, import, and rollback.
Accept The Storage Migration Risk
A maintainer can intentionally merge on the SQLite direction and tests while owning the retry-safety and real-proof gaps for upgraded vaults.

Next step before merge

[P1] The PR has a protected maintainer label, missing contributor real behavior proof, and an upgrade-sensitive migration defect, so it needs human/contributor follow-up rather than a ClawSweeper repair marker.

Security
Cleared: No concrete security or supply-chain regression was found; the diff uses existing plugin-state APIs and does not change dependencies, workflows, permissions, or secret handling.

Review findings

[P2] Re-import legacy runs when state is incomplete — extensions/memory-wiki/doctor-contract-api.ts:225

Review details

Best possible solution:

Merge only after the migration retry path repairs incomplete SQLite rows from legacy JSON and maintainers accept the upgrade/cap behavior with real import, migration, and rollback proof.

Do we have a high-confidence way to reproduce the issue?

Yes, from source inspection: create a legacy JSON record, leave an incomplete SQLite meta row for the same run id, then rerun doctor migration. The migration filters the legacy record out by run id and archives the JSON, so the complete path metadata is lost from the canonical state.
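The reproduction above can be modeled in memory. This is a simplified stand-in, not the plugin-state API: the `Map`-based store, the key layout, and the helper names are hypothetical, and only the meta-before-paths write order and the filter by run id follow the review's description.

```typescript
type LegacyRecord = {
  runId: string;
  createdPaths: string[];
  updatedPaths: string[];
};

// In-memory stand-in for plugin state: row key -> JSON value.
const store = new Map<string, string>();

// Writes the meta row first, then the path row, as the review describes.
function writeRecord(record: LegacyRecord, failAfterMeta = false): void {
  store.set(`${record.runId}:meta`, JSON.stringify({ runId: record.runId }));
  if (failAfterMeta) {
    throw new Error("simulated interruption");
  }
  store.set(
    `${record.runId}:paths`,
    JSON.stringify({ created: record.createdPaths, updated: record.updatedPaths }),
  );
}

// A run counts as "existing" as soon as its meta row is present.
function listRunIds(): Set<string> {
  const ids = new Set<string>();
  for (const key of store.keys()) {
    if (key.endsWith(":meta")) {
      ids.add(key.slice(0, -":meta".length));
    }
  }
  return ids;
}

const legacy: LegacyRecord = {
  runId: "run-1",
  createdPaths: ["a.md"],
  updatedPaths: ["b.md"],
};

// First doctor run is interrupted after the meta row is written.
try {
  writeRecord(legacy, true);
} catch {
  // The legacy JSON would still be on disk at this point.
}

// Retry: filtering by run id alone treats the run as already migrated.
const existingRunIds = listRunIds();
const toImport = [legacy].filter((record) => !existingRunIds.has(record.runId));

console.log(toImport.length); // prints 0: the legacy record is skipped
console.log(store.has("run-1:paths")); // prints false: path rows never arrive
```

If the retry then archives the JSON, the only complete copy of the path metadata leaves the location the migration reads from.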

Is this the best way to solve the issue?

No, not yet: SQLite plugin state plus doctor migration is the right direction, but the retry path must treat incomplete existing rows as repairable from legacy JSON before archiving. Real upgraded-vault proof is also still required before merge.

Full review comments:

[P2] Re-import legacy runs when state is incomplete — extensions/memory-wiki/doctor-contract-api.ts:225
If openclaw doctor --fix is interrupted or a store write fails after the meta row is registered but before all path rows are written, this existingRunIds check treats the run as migrated. The retry then archives the legacy JSON without restoring the created/updated path metadata needed for accurate listing and rollback; repair or overwrite incomplete state from the legacy record before archiving it.
Confidence: 0.87

Overall correctness: patch is incorrect
Overall confidence: 0.87

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against ebabf5022ffb.

Label changes

Label justifications:

P2: This is a normal-priority bundled-plugin storage migration with a concrete upgrade retry defect, but no evidence of an urgent live regression.
merge-risk: 🚨 compatibility: Merging changes existing Memory Wiki vault import-run metadata from JSON files to SQLite state and relies on doctor migration for old listing and rollback behavior.
rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🦐 gold shrimp.
status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR body lists tests and lint only; it needs redacted terminal/live output or an artifact showing fresh import, doctor migration of legacy JSON, and rollback after this patch. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Evidence reviewed

PR surface:

Source +581, Tests +161. Total +742 across 8 files.

Area	Files	Added	Removed	Net
Source	5	683	102	+581
Tests	3	161	0	+161
Docs	0	0	0	0
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	8	844	102	+742

What I checked:

Repository policy applied: Root policy was read fully and requires OpenClaw-owned runtime state/cache to live in SQLite, with legacy state normalized through doctor/migration code before runtime uses the canonical store. (AGENTS.md:90, ebabf5022ffb)
Scoped extension policy applied: The scoped extensions guide was read and keeps bundled plugin production code inside plugin/SDK boundaries; this PR stays within memory-wiki and plugin-state SDK surfaces. (extensions/AGENTS.md:1, ebabf5022ffb)
Current main still uses JSON import-run records: Current main writes ChatGPT import-run records to .openclaw-wiki/import-runs/<runId>.json and reads rollback metadata from that JSON path, so the PR is not obsolete on main. (extensions/memory-wiki/src/chatgpt-import.ts:682, ebabf5022ffb)
Current main listing reads legacy JSON files: Current main lists Memory Wiki import runs by reading .json files from .openclaw-wiki/import-runs, confirming the central behavior is changed by the PR. (extensions/memory-wiki/src/import-runs.ts:110, ebabf5022ffb)
PR adds SQLite import-run store: The PR adds import-runs-state.ts with meta/path rows, read/write/list helpers, and a 20,000-row namespace cap for Memory Wiki import-run state. (extensions/memory-wiki/src/import-runs-state.ts:341, 2a1e1a1a8428)
PR preflights live import capacity: The PR counts existing import-run state rows and throws before page mutation when a live ChatGPT import would exceed the SQLite namespace cap. (extensions/memory-wiki/src/chatgpt-import.ts:763, 2a1e1a1a8428)

Likely related people:

steipete: Recent GitHub history shows @steipete authored the adjacent Memory Wiki source-sync SQLite refactor and several recent Memory Wiki maintenance commits, and this PR follows that storage pattern. (role: recent area contributor; confidence: high; commits: a4236bd6faf6, 96e581242605, 5b895f259251; files: extensions/memory-wiki/src/source-sync-state.ts, extensions/memory-wiki/doctor-contract-api.ts, extensions/memory-wiki/src/chatgpt-import.ts)
vincentkoc: GitHub history shows @vincentkoc restored the Memory Wiki stack and authored source-sync/import-adjacent Memory Wiki work that this storage migration builds on. (role: feature-history owner; confidence: medium; commits: 5716d83336fd, a78c4de737c5, 94256ea1a076; files: extensions/memory-wiki/src/source-sync-state.ts, extensions/memory-wiki/src/chatgpt-import.ts, extensions/memory-wiki/index.ts)
mbelinky: GitHub history shows @mbelinky authored the earlier Memory Wiki imports/palace surfacing work that touched the current ChatGPT import path. (role: adjacent feature contributor; confidence: medium; commits: 64693d2e96ab; files: extensions/memory-wiki/src/chatgpt-import.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2a1e1a1a84

chatgpt-codex-connector · 2026-06-07T07:43:12Z

+        }
+        const existingRecords = await listMemoryWikiImportRunRecords(vaultRoot, store);
+        const existingRunIds = new Set(existingRecords.map((record) => record.runId));
+        const importedRecords = records.filter((record) => !existingRunIds.has(record.runId));


Re-import legacy runs when state is incomplete

If openclaw doctor --fix is interrupted or a store write fails after the meta row is registered but before all path rows are written, the legacy JSON remains on disk but this retry path treats the run id as already migrated. Because writeMemoryWikiImportRunRecord writes the meta row before the path rows, a later retry will skip the JSON here and then archive it, permanently losing the created/updated path metadata needed for accurate listing and rollback. Prefer replacing/repairing from the legacy record unless the existing state is proven complete.
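One way to follow that suggestion is to select legacy records for import unless the stored run provably matches them. The sketch below assumes stored runs can be read back with their path lists; the types and helper names are hypothetical and do not reflect the actual doctor-contract-api.ts code.

```typescript
type RunPaths = {
  runId: string;
  createdPaths: string[];
  updatedPaths: string[];
};

// True when both lists contain the same set of paths, ignoring order.
function samePaths(a: string[], b: string[]): boolean {
  const left = new Set(a);
  const right = new Set(b);
  if (left.size !== right.size) {
    return false;
  }
  for (const path of left) {
    if (!right.has(path)) {
      return false;
    }
  }
  return true;
}

// A legacy record is re-imported unless the stored state matches it fully.
function needsReimport(legacy: RunPaths, stored: RunPaths | undefined): boolean {
  if (!stored) {
    return true;
  }
  return !(
    samePaths(legacy.createdPaths, stored.createdPaths) &&
    samePaths(legacy.updatedPaths, stored.updatedPaths)
  );
}

function selectRecordsToImport(
  legacyRecords: RunPaths[],
  storedByRunId: Map<string, RunPaths>,
): RunPaths[] {
  return legacyRecords.filter((legacy) =>
    needsReimport(legacy, storedByRunId.get(legacy.runId)),
  );
}

const legacyRecords: RunPaths[] = [
  { runId: "run-1", createdPaths: ["a.md"], updatedPaths: [] },
  { runId: "run-2", createdPaths: ["b.md"], updatedPaths: ["c.md"] },
  { runId: "run-3", createdPaths: ["d.md"], updatedPaths: [] },
];

const storedByRunId = new Map<string, RunPaths>([
  // Fully migrated.
  ["run-1", { runId: "run-1", createdPaths: ["a.md"], updatedPaths: [] }],
  // Meta row only: path rows were never written.
  ["run-2", { runId: "run-2", createdPaths: [], updatedPaths: [] }],
]);

const selected = selectRecordsToImport(legacyRecords, storedByRunId);
console.log(selected.map((record) => record.runId)); // prints [ 'run-2', 'run-3' ]
```

Under this approach a legacy JSON file would be archived only after its run has been written or verified as complete, so an interrupted run stays repairable on the next retry.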

steipete · 2026-06-07T08:04:37Z

Land-ready proof for 1ce630714a0.

Summary:

Stores Memory Wiki ChatGPT import-run metadata in plugin-state SQLite rows instead of JSON sidecars.
Adds doctor migration for legacy .openclaw-wiki/import-runs/*.json, archiving migrated sidecars while preserving rollback snapshots.
Keeps rollback snapshot files as user-facing artifacts and avoids writing no-op import-run records.

Local proof:

git diff --check origin/main...HEAD
node scripts/run-oxlint.mjs --tsconfig config/tsconfig/oxlint.extensions.json extensions/memory-wiki/src/import-runs-state.ts extensions/memory-wiki/src/chatgpt-import.ts extensions/memory-wiki/src/import-runs.ts extensions/memory-wiki/index.ts extensions/memory-wiki/doctor-contract-api.ts extensions/memory-wiki/doctor-contract-api.test.ts extensions/memory-wiki/src/cli.test.ts extensions/memory-wiki/src/import-runs.test.ts
pnpm tsgo:extensions
node scripts/run-vitest.mjs extensions/memory-wiki/doctor-contract-api.test.ts extensions/memory-wiki/src/cli.test.ts extensions/memory-wiki/src/import-runs.test.ts extensions/memory-wiki/src/gateway.test.ts extensions/memory-wiki/index.test.ts
.agents/skills/autoreview/scripts/autoreview --mode local => clean, no accepted/actionable findings.

CI:

GitHub checks green on latest head, including check-lint, check-prod-types, check-test-types, build-artifacts, and the changed-path-selected shards.

openclaw-barnacle Bot added the extensions: memory-wiki, size: L, and maintainer (Maintainer-authored PR) labels Jun 7, 2026

steipete force-pushed the refactor/memory-wiki-import-runs-sqlite branch from c485d50 to 2a1e1a1 June 7, 2026 07:40

chatgpt-codex-connector Bot reviewed Jun 7, 2026

openclaw-barnacle Bot added the scripts (Repository scripts) label Jun 7, 2026

refactor(memory-wiki): store import runs in sqlite

1ce6307

steipete force-pushed the refactor/memory-wiki-import-runs-sqlite branch from c47ea4f to 1ce6307 June 7, 2026 08:00

openclaw-barnacle Bot removed the scripts (Repository scripts) label Jun 7, 2026

steipete merged commit 7a3d24e into main Jun 7, 2026
155 of 157 checks passed

steipete deleted the refactor/memory-wiki-import-runs-sqlite branch June 7, 2026 08:04

github-actions Bot mentioned this pull request Jun 7, 2026

📡 Upstream Digest — 2026-06-07 10:22 UTC curtismercier/openclaw-mods#1033

github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request Jun 8, 2026

refactor(memory-wiki): store import runs in sqlite (openclaw#91108)

5172c00

wangmiao0668000666 pushed a commit to wangmiao0668000666/openclaw that referenced this pull request Jun 9, 2026

refactor(memory-wiki): store import runs in sqlite (openclaw#91108)

547e92b
