fix: harden malformed FTS migration recovery #334

Merged: jalehman merged 5 commits into Martian-Engineering:main from electricsheephq:codex/fix-fts-schema-recovery-2 on Apr 9, 2026

@100yenadmin 100yenadmin commented Apr 9, 2026

Collaborator

Summary

This PR hardens SQLite migration recovery when Lossless Claw encounters malformed standalone FTS tables or a runtime that does not support the trigram tokenizer.

Why this exists

A restored real-world lcm.db failed during plugin registration with:

Error: malformed database schema (1)

The original failure pattern was not just a bad PRAGMA table_info(...) probe. In practice we found two adjacent failure modes:

  1. messages_fts / summaries_fts can exist while required shadow tables are missing.
  2. summaries_fts_cjk can be left behind on a runtime that no longer supports tokenize='trigram'.

Either case can make migration or later runtime code trip over a table that exists in name but is not actually safe to use.
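
To illustrate the first failure mode, here is a minimal sketch of how missing shadow tables could be detected. It assumes a hypothetical `ShadowProbeDb` interface and a hypothetical helper name `missingShadowTables`; neither is taken from this PR's diff. The suffix list is the standard set SQLite creates for an ordinary FTS5 table that stores its own content.

```typescript
// Hypothetical minimal database interface; the project's real driver is not shown here.
interface ShadowProbeDb {
  prepare(sql: string): { all(...params: unknown[]): unknown[] };
}

// Shadow tables SQLite creates for an ordinary FTS5 table that stores its own content.
const FTS5_SHADOW_SUFFIXES = ["data", "idx", "content", "docsize", "config"] as const;

// Returns the names of expected shadow tables that are absent from the schema.
function missingShadowTables(db: ShadowProbeDb, ftsTable: string): string[] {
  const rows = db
    .prepare("SELECT name FROM sqlite_master WHERE type = 'table'")
    .all() as Array<{ name: string }>;
  const present = new Set(rows.map((row) => row.name));
  return FTS5_SHADOW_SUFFIXES
    .map((suffix) => `${ftsTable}_${suffix}`)
    .filter((name) => !present.has(name));
}
```

A non-empty result for `messages_fts` or `summaries_fts` would mean the virtual table exists in name only and is a candidate for a rebuild.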

Related:

What this PR changes

  • probes whether the runtime actually supports trigram before attempting to create summaries_fts_cjk
  • validates standalone FTS tables by checking expected columns and required shadow tables
  • rebuilds malformed messages_fts / summaries_fts tables when the top-level virtual table exists but backing state is incomplete
  • best-effort drops stale summaries_fts_cjk when the runtime cannot support trigram tokenization
  • adds a regression test for the stale summaries_fts_cjk cleanup path
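
The trigram probe and the best-effort cleanup from the list above can be sketched roughly as follows. This is an illustration under assumptions, not the code in src/db/features.ts or src/db/migration.ts: the `ExecDb` interface, the function names, and the probe table name `__trigram_probe` are all hypothetical.

```typescript
// Hypothetical minimal database interface; the project's real driver is not shown here.
interface ExecDb {
  exec(sql: string): void;
}

// Probe by actually creating a throwaway trigram-tokenized FTS5 table in the temp schema.
// A failure here can also mean FTS5 itself is unavailable; either way trigram is unusable.
function supportsTrigramTokenizer(db: ExecDb): boolean {
  try {
    db.exec("CREATE VIRTUAL TABLE temp.__trigram_probe USING fts5(x, tokenize='trigram')");
  } catch {
    return false;
  }
  try {
    db.exec("DROP TABLE IF EXISTS temp.__trigram_probe");
  } catch {
    // Ignore cleanup failures; the probe table lives in the temp schema.
  }
  return true;
}

// Best-effort: report failure instead of throwing so migration can continue.
function dropStaleCjkTable(db: ExecDb): boolean {
  try {
    db.exec("DROP TABLE IF EXISTS summaries_fts_cjk");
    return true;
  } catch {
    return false;
  }
}

function cleanUpCjkIfUnsupported(db: ExecDb): "supported" | "dropped" | "drop-failed" {
  if (supportsTrigramTokenizer(db)) return "supported";
  return dropStaleCjkTable(db) ? "dropped" : "drop-failed";
}
```

Returning a status rather than throwing keeps the cleanup observable (for logging) without letting it abort the migration.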

Design decisions and trade-offs

  • Prefer recovery over partial tolerance.
    If an FTS table is malformed, the safest behavior is to rebuild it from source rows rather than keep a half-valid structure alive.
  • Treat unsupported trigram as an environment capability issue, not a soft warning.
    Leaving summaries_fts_cjk around on a non-trigram runtime is actively risky because later code may key off table existence.
  • Keep the stale CJK-table cleanup best-effort.
    Migration should not fail just because stale cleanup also hits an edge-case SQLite error.
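
As a rough sketch of the "rebuild from source rows" decision, the snippet below drops, recreates, and reseeds an FTS table inside one transaction. The `RebuildDb` interface, the `FtsRebuildSpec` shape, and the column names used in the usage note are assumptions for illustration; the actual schema and helpers in src/db/migration.ts are not shown in this conversation.

```typescript
// Hypothetical minimal database interface; the project's real driver is not shown here.
interface RebuildDb {
  exec(sql: string): void;
}

// Hypothetical description of one standalone FTS table and the rows it is seeded from.
interface FtsRebuildSpec {
  ftsTable: string;
  ftsColumns: string[];
  sourceTable: string;
  sourceRowId: string;
  sourceColumns: string[];
}

// Only plain identifiers are accepted, so they can be interpolated into SQL safely.
function simpleIdent(name: string): string {
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(name)) {
    throw new Error(`unsupported identifier: ${name}`);
  }
  return name;
}

function rebuildStatements(spec: FtsRebuildSpec): string[] {
  const fts = simpleIdent(spec.ftsTable);
  const ftsCols = spec.ftsColumns.map(simpleIdent).join(", ");
  const srcCols = spec.sourceColumns.map(simpleIdent).join(", ");
  return [
    `DROP TABLE IF EXISTS ${fts}`,
    `CREATE VIRTUAL TABLE ${fts} USING fts5(${ftsCols})`,
    `INSERT INTO ${fts}(rowid, ${ftsCols}) ` +
      `SELECT ${simpleIdent(spec.sourceRowId)}, ${srcCols} FROM ${simpleIdent(spec.sourceTable)}`,
  ];
}

// Run the rebuild atomically: either the fresh table is fully seeded or nothing changes.
function rebuildFtsTable(db: RebuildDb, spec: FtsRebuildSpec): void {
  const statements = rebuildStatements(spec);
  db.exec("BEGIN");
  try {
    for (const sql of statements) db.exec(sql);
    db.exec("COMMIT");
  } catch (error) {
    db.exec("ROLLBACK");
    throw error;
  }
}
```

For example, `rebuildFtsTable(db, { ftsTable: "messages_fts", ftsColumns: ["content"], sourceTable: "messages", sourceRowId: "message_id", sourceColumns: ["content"] })` would reseed the index, where `message_id` and `content` are placeholder column names.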

Validation

  • pnpm exec vitest run test/migration.test.ts -t "drops stale summaries_fts_cjk when trigram tokenizer support is unavailable"
  • previously validated recovery against a real restored lcm.db that had failed at plugin registration time

@100yenadmin 100yenadmin marked this pull request as ready for review April 9, 2026 06:32
Copilot AI review requested due to automatic review settings April 9, 2026 06:32

Copilot AI left a comment

Pull request overview

This PR hardens lossless-claw SQLite migration recovery around FTS5 virtual tables to prevent startup/migration failures from partially-present (malformed) FTS backing state and from runtimes that lack the trigram tokenizer.

Changes:

  • Add validation and rebuild logic for standalone FTS tables: verify the expected columns and the presence of FTS5 shadow tables, and recreate and reseed the table when it is malformed.
  • Add runtime feature probing for trigram tokenizer support and gate creation of summaries_fts_cjk on that probe.
  • Add a regression test that simulates a missing summaries_fts shadow table and asserts migrations rebuild correctly.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File                    Description
test/migration.test.ts  Adds regression coverage for rebuilding summaries_fts when a shadow table is missing.
src/db/migration.ts     Introduces generic standalone-FTS validation/rebuild helpers; applies them to messages_fts, summaries_fts, and conditionally to summaries_fts_cjk.
src/db/features.ts      Extends DB feature detection to probe for trigram tokenizer availability.

Comment thread src/db/migration.ts
@100yenadmin 100yenadmin changed the title from "[codex] Harden migration recovery for malformed FTS tables" to "fix: harden malformed FTS migration recovery" Apr 9, 2026
@100yenadmin 100yenadmin requested a review from Copilot April 9, 2026 08:08

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.


Comment thread src/db/migration.ts Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.


Comment thread src/db/migration.ts Outdated
Comment thread src/db/migration.ts Outdated
Comment thread test/migration.test.ts

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.


100yenadmin and others added 5 commits April 9, 2026 11:49
Restore the broadened migration tests after rebasing onto main and add a
changeset for the startup recovery fixes.

Regeneration-Prompt: |
  Rebase the malformed-FTS recovery branch onto origin/main after PR Martian-Engineering#331
  landed. Preserve the broader standalone FTS hardening from the branch, but
  fix the review findings before pushing: startup migrations must not reuse a
  cached fts5Available value from the engine constructor, and stale
  summaries_fts_cjk tables on runtimes without trigram support must be dropped
  before any other FTS schema introspection runs. Keep the existing malformed
  summaries_fts probe regression from Martian-Engineering#331, update it for the quoted PRAGMA
  helper, retain the added shadow-table recovery coverage, add a new regression
  for the stale-trigram ordering issue, and include a patch changeset.
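
The ordering constraints in the prompt above (probe capabilities fresh at migration time, and drop a stale summaries_fts_cjk before any other FTS introspection) can be sketched as below. The `FtsMigrationSteps` callbacks and `runFtsMigration` are hypothetical names used only to make the order explicit; they are not the helpers in src/db/migration.ts.

```typescript
// Hypothetical capability flags; the real feature-detection shape is not shown here.
interface DbFeatures {
  fts5Available: boolean;
  trigramAvailable: boolean;
}

// Hypothetical hooks standing in for the real migration helpers.
interface FtsMigrationSteps {
  probeFeatures(): DbFeatures;
  dropStaleCjkTable(): void;
  validateOrRebuild(ftsTable: string): void;
  ensureCjkTable(): void;
}

function runFtsMigration(steps: FtsMigrationSteps): void {
  // Probe now rather than trusting a value cached earlier (e.g. by a constructor).
  const features = steps.probeFeatures();

  // Stale CJK cleanup comes first, before any other FTS schema introspection.
  if (!features.trigramAvailable) {
    try {
      steps.dropStaleCjkTable();
    } catch {
      // Best-effort: a failed cleanup must not abort the migration.
    }
  }

  if (!features.fts5Available) return;

  for (const table of ["messages_fts", "summaries_fts"]) {
    steps.validateOrRebuild(table);
  }
  if (features.trigramAvailable) steps.ensureCjkTable();
}
```

Passing the steps in as callbacks keeps the ordering testable without a real database, which is roughly what a regression for the stale-trigram ordering issue needs to assert.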
@jalehman jalehman force-pushed the codex/fix-fts-schema-recovery-2 branch from e216e94 to 3143781 Compare April 9, 2026 18:52
@jalehman jalehman merged commit 71d6d9c into Martian-Engineering:main Apr 9, 2026
2 checks passed
@github-actions github-actions Bot mentioned this pull request Apr 9, 2026
Development

Successfully merging this pull request may close these issues.

Migration can fail on malformed standalone FTS tables

3 participants