state.db FTS corruption goes undetected — no integrity check, no repair path #33865

@tuancookiez-hub

Summary

The messages_fts and messages_fts_trigram FTS5 indexes in state.db can become corrupt ("database disk image is malformed"), silently breaking session_search, /resume, /history, and any feature backed by FTS. There is currently:

  1. No integrity check on startup: _init_schema() creates/reconciles tables but never runs PRAGMA integrity_check
  2. No FTS health validation: hermes doctor only checks SELECT COUNT(*) FROM sessions; it doesn't validate that the FTS indexes match the messages table
  3. No repair command: hermes sessions has list, prune, stats, rename, export, delete, browse, but no repair
  4. No auto-recovery: when FTS is corrupt, _init_schema() catches sqlite3.OperationalError (table missing) but not sqlite3.DatabaseError (table corrupt/malformed)
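
As a rough illustration of what points 1, 2 and 4 could look like, here is a minimal sketch of a startup health check. The helper name check_state_db and its signature are hypothetical, and it assumes the FTS tables are FTS5 tables (possibly external-content tables backed by messages); the actual schema is not shown in this issue. It relies only on SQLite's PRAGMA quick_check and the FTS5 'integrity-check' command.

```python
import sqlite3

# Table names are taken from this issue; everything else here is illustrative.
DEFAULT_FTS_TABLES = ("messages_fts", "messages_fts_trigram")


def check_state_db(path, fts_tables=DEFAULT_FTS_TABLES):
    """Return a list of problem descriptions; an empty list means no damage found."""
    problems = []
    # Autocommit mode, so the special FTS5 INSERT does not leave a transaction open.
    conn = sqlite3.connect(path, isolation_level=None)
    try:
        try:
            # quick_check yields the single row ("ok",) when nothing is wrong.
            rows = conn.execute("PRAGMA quick_check").fetchall()
            if rows != [("ok",)]:
                problems.extend(f"quick_check: {row[0]}" for row in rows)
        except sqlite3.DatabaseError as exc:
            problems.append(f"quick_check failed: {exc}")

        for table in fts_tables:  # trusted constants, not user input
            try:
                # rank=1 asks FTS5 to also compare the index with an external
                # content table, when the FTS table has one.
                conn.execute(
                    f"INSERT INTO {table}({table}, rank) "
                    "VALUES('integrity-check', 1)"
                )
            except sqlite3.DatabaseError as exc:
                # OperationalError (e.g. "no such table") is a subclass of
                # DatabaseError, so this one clause covers both the
                # missing-table and the malformed-index case.
                problems.append(f"{table}: {exc}")
    finally:
        conn.close()
    return problems
```

A caller such as hermes doctor or _init_schema() could treat a non-empty result as a signal to warn the user or trigger a repair, rather than failing later inside session_search.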

Root Cause

FTS5 virtual tables and their triggers insert into the FTS index as part of the message INSERT transaction. If that transaction is interrupted mid-commit (force-kill, WAL checkpoint failure, power loss), the FTS index and the messages table can desync. _try_wal_checkpoint() runs every 50 writes but is best-effort, with a bare except Exception: pass, so a corrupt FTS index encountered during a checkpoint is silently swallowed.
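
For comparison, a best-effort checkpoint does not have to be silent. The sketch below is not the existing _try_wal_checkpoint() (its body is not shown in this issue); the function name, logger name and return convention are assumptions. It keeps the checkpoint non-fatal but logs the failure so corruption has a chance of being noticed.

```python
import logging
import sqlite3

log = logging.getLogger("state_db")  # hypothetical logger name


def try_wal_checkpoint(conn: sqlite3.Connection) -> bool:
    """Run a PASSIVE checkpoint; return True only if it completed cleanly."""
    try:
        # The pragma returns one row: (busy, wal_frames, checkpointed_frames).
        busy, wal_frames, done = conn.execute(
            "PRAGMA wal_checkpoint(PASSIVE)"
        ).fetchone()
    except sqlite3.Error as exc:
        # Still best-effort: the error is not re-raised, but it is recorded
        # instead of being dropped by a bare `except Exception: pass`.
        log.warning("WAL checkpoint failed: %s", exc)
        return False
    if busy:
        log.debug("WAL checkpoint blocked: %s of %s frames written", done, wal_frames)
    return not busy
```

A False return could also be counted, so that repeated failures prompt an integrity check instead of going unnoticed for the lifetime of the process.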

Reproduction

  1. Run Hermes with heavy session activity (gateway + CLI + worktree agents sharing state.db)
  2. Force-kill the process (taskkill /F /IM hermes.exe on Windows, or SIGKILL on Linux)
  3. Restart — session_search returns "database disk image is malformed"

Impact

  • session_search (the only way for Hermes to recall cross-session context) is completely broken
  • /resume, /title, /history, /branch all fail
  • The only recovery path is manual: stop gateway, export JSON, rebuild from scratch
  • Users lose session history if they don't know the manual recovery procedure
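
A repair subcommand could avoid the export-and-rebuild procedure in many cases by asking FTS5 to regenerate its index from the stored content. The following is a minimal sketch under stated assumptions: repair_fts is a hypothetical helper, and the FTS tables are assumed to be regular or external-content FTS5 tables, since the FTS5 'rebuild' command does not apply to contentless tables. If the shadow tables are physically damaged, 'rebuild' itself may fail, in which case dropping and recreating the FTS tables would be needed instead.

```python
import sqlite3


def repair_fts(path, fts_tables=("messages_fts", "messages_fts_trigram")):
    """Rebuild each FTS5 index from its content; return the tables rebuilt."""
    conn = sqlite3.connect(path)
    rebuilt = []
    try:
        for table in fts_tables:  # trusted constants, not user input
            # `with conn` commits on success and rolls back on error, so a
            # failed rebuild of one table leaves no half-written index.
            with conn:
                conn.execute(f"INSERT INTO {table}({table}) VALUES('rebuild')")
            rebuilt.append(table)
    finally:
        conn.close()
    return rebuilt
```

This should run only while no other Hermes process is writing to state.db, for example after stopping the gateway.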

Metadata

    Labels

    • P2: Medium (degraded but workaround exists)
    • comp/agent: Core agent loop, run_agent.py, prompt builder
    • comp/cli: CLI entry point, hermes_cli/, setup wizard
    • type/feature: New feature or request
