fix: harden malformed FTS migration recovery#334
Conversation
There was a problem hiding this comment.
Pull request overview
This PR hardens lossless-claw SQLite migration recovery around FTS5 virtual tables to prevent startup/migration failures from partially-present (malformed) FTS backing state and from runtimes that lack the trigram tokenizer.
Changes:
- Add validation + rebuild logic for standalone FTS tables by verifying expected columns and presence of FTS5 shadow tables, and recreating/ reseeding when malformed.
- Add runtime feature probing for
trigramtokenizer support and gate creation ofsummaries_fts_cjkon that probe. - Add a regression test that simulates a missing
summaries_ftsshadow table and asserts migrations rebuild correctly.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| test/migration.test.ts | Adds regression coverage for rebuilding summaries_fts when a shadow table is missing. |
| src/db/migration.ts | Introduces generic standalone-FTS validation/rebuild helpers; applies them to messages_fts, summaries_fts, and conditionally to summaries_fts_cjk. |
| src/db/features.ts | Extends DB feature detection to probe for trigram tokenizer availability. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Restore the broadened migration tests after rebasing onto main and add a changeset for the startup recovery fixes. Regeneration-Prompt: | Rebase the malformed-FTS recovery branch onto origin/main after PR Martian-Engineering#331 landed. Preserve the broader standalone FTS hardening from the branch, but fix the review findings before pushing: startup migrations must not reuse a cached fts5Available value from the engine constructor, and stale summaries_fts_cjk tables on runtimes without trigram support must be dropped before any other FTS schema introspection runs. Keep the existing malformed summaries_fts probe regression from Martian-Engineering#331, update it for the quoted PRAGMA helper, retain the added shadow-table recovery coverage, add a new regression for the stale-trigram ordering issue, and include a patch changeset.
e216e94 to
3143781
Compare
Summary
This PR hardens SQLite migration recovery when Lossless Claw encounters malformed standalone FTS tables or a runtime that does not support the
trigramtokenizer.Why this exists
A restored real-world
lcm.dbfailed during plugin registration with:The original failure pattern was not just a bad
PRAGMA table_info(...)probe. In practice we found two adjacent failure modes:messages_fts/summaries_ftscan exist while required shadow tables are missing.summaries_fts_cjkcan be left behind on a runtime that no longer supportstokenize='trigram'.Either case can make migration or later runtime code trip over a table that exists in name but is not actually safe to use.
Related:
What this PR changes
trigrambefore attempting to createsummaries_fts_cjkmessages_fts/summaries_ftstables when the top-level virtual table exists but backing state is incompletesummaries_fts_cjkwhen the runtime cannot support trigram tokenizationsummaries_fts_cjkcleanup pathDesign decisions and trade-offs
If an FTS table is malformed, the safest behavior is to rebuild it from source rows rather than keep a half-valid structure alive.
Leaving
summaries_fts_cjkaround on a non-trigram runtime is actively risky because later code may key off table existence.Migration should not fail just because stale cleanup also hits an edge-case SQLite error.
Validation
pnpm exec vitest run test/migration.test.ts -t "drops stale summaries_fts_cjk when trigram tokenizer support is unavailable"lcm.dbthat had failed at plugin registration time