Kanban DB corruption risk from multi-gateway concurrent SQLite access #30445

@richarddxmmqfans-oss

Description

Every Hermes profile gateway opens a SQLite connection to the shared kanban.db on startup, regardless of whether the profile has kanban.dispatch_in_gateway enabled. On a multi-profile setup with 7+ active gateways, this creates 7+ concurrent SQLite connections to the same WAL-mode database.

Root Cause

When one process runs hermes kanban init (which deletes and recreates the file), older gateway processes keep their file handles on the old inode. New processes write to the new inode while old processes may still write to the stale one, which produces "database disk image is malformed" and "disk I/O error" failures.
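
The stale-inode condition described above can be detected from inside a process by comparing the inode behind an open descriptor with the inode currently at the path. The following is a minimal sketch, not Hermes code: handle_is_stale and demo are hypothetical names, and the demo uses a temporary file in place of the real kanban.db.

```python
import os
import tempfile


def handle_is_stale(fd: int, path: str) -> bool:
    """Return True if fd no longer refers to the file currently at path."""
    held = os.fstat(fd)
    try:
        current = os.stat(path)
    except FileNotFoundError:
        # The path was deleted and has not been recreated yet.
        return True
    return (held.st_dev, held.st_ino) != (current.st_dev, current.st_ino)


def demo():
    """Simulate delete-and-recreate while an old descriptor stays open."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "kanban.db")  # stand-in for the shared DB
        with open(path, "wb") as f:
            f.write(b"old")
        fd = os.open(path, os.O_RDONLY)  # plays the long-running gateway
        try:
            before = handle_is_stale(fd, path)
            os.unlink(path)  # what a delete-and-recreate init would do
            with open(path, "wb") as f:
                f.write(b"new")
            after = handle_is_stale(fd, path)
        finally:
            os.close(fd)
    return before, after
```

Because the old descriptor keeps the old inode alive, the recreated file necessarily gets a different inode, so a gateway could run a check like this before each write and reopen its connection when the check reports a stale handle.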

lsof output from a typical setup:

7 python processes × ~40 file descriptors each = 280+ open handles on kanban.db
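
A per-process handle count like the one above can also be collected without lsof by reading /proc on Linux. This is a rough sketch under assumptions: count_open_handles is a hypothetical helper, it also counts the -wal and -shm companion files that SQLite creates in WAL mode, and it only sees processes the current user is allowed to inspect.

```python
import os
from collections import Counter


def count_open_handles(target: str) -> Counter:
    """Map PID -> number of descriptors open on target or its WAL/SHM files."""
    target = os.path.realpath(target)
    names = {target, target + "-wal", target + "-shm"}
    counts = Counter()
    for pid in filter(str.isdigit, os.listdir("/proc")):
        fd_dir = f"/proc/{pid}/fd"
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or belongs to another user
        for fd in fds:
            try:
                link = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed while we were scanning
            # Handles on an unlinked file appear as "<path> (deleted)".
            if link.removesuffix(" (deleted)") in names:
                counts[int(pid)] += 1
    return counts
```

Descriptors whose link target ends in " (deleted)" are the ones still pointing at a stale inode, so a variant of this scan could report those separately.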

Suggested Fixes (any would help)

  1. Gateways with dispatch_in_gateway: false should skip kanban DB initialization entirely
  2. Add a per-profile config flag like kanban.enabled: false to prevent DB connection on non-participating profiles
  3. Single-writer proxy: only one process (the architect gateway) writes to SQLite; other gateways communicate via IPC
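
Fixes 1 and 2 amount to a guard in front of the connection code. The sketch below is illustrative only: maybe_open_kanban is a hypothetical function, the config is assumed to be a nested dict, kanban.enabled is the flag proposed in fix 2 (it does not exist today), and the defaults chosen here are assumptions rather than Hermes behavior.

```python
import sqlite3
from typing import Optional


def maybe_open_kanban(profile_config: dict, db_path: str) -> Optional[sqlite3.Connection]:
    """Open the kanban DB only for profiles that actually take part in dispatch."""
    kanban = profile_config.get("kanban", {})
    # Proposed opt-out flag (fix 2); assumed to default to True.
    if not kanban.get("enabled", True):
        return None
    # Fix 1: non-dispatching gateways never touch the file (default assumed False).
    if not kanban.get("dispatch_in_gateway", False):
        return None
    conn = sqlite3.connect(db_path, timeout=5.0)
    conn.execute("PRAGMA journal_mode=WAL")
    return conn
```

With a guard like this, the five non-dispatching profiles listed under Environment would hold no handles on kanban.db, which leaves only the two dispatching gateways sharing the file.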

Workaround Deployed

We wrote a cron-based health checker (kanban_health.py) that runs PRAGMA integrity_check every 6 hours, kills zombie processes holding stale handles, runs hermes kanban init, and restarts all gateways. This is a band-aid, not a proper fix.
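
For reference, the integrity-check step of such a health checker could look roughly like this. It is a sketch and not the actual kanban_health.py: integrity_ok is a hypothetical name, and the process-killing, re-init, and restart steps are deliberately left out.

```python
import os
import sqlite3
from typing import List, Tuple


def integrity_ok(db_path: str) -> Tuple[bool, List[str]]:
    """Run PRAGMA integrity_check and return (healthy, messages)."""
    if not os.path.isfile(db_path):
        return False, ["database file is missing"]
    conn = None
    try:
        conn = sqlite3.connect(db_path, timeout=10.0)
        rows = [row[0] for row in conn.execute("PRAGMA integrity_check")]
    except sqlite3.DatabaseError as exc:
        # Covers errors such as "database disk image is malformed".
        return False, [str(exc)]
    finally:
        if conn is not None:
            conn.close()
    # A healthy database yields exactly one row containing "ok".
    return rows == ["ok"], rows
```

A cron job could call integrity_ok on the shared database and trigger the recovery steps only when the first element of the result is False.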

Environment

  • Profiles: architect (dispatch=true), wikid (dispatch=true), a-dev, a-creative, a-eval, a-view, intel-pilot (dispatch=false but still holding open handles)
  • All gateways running on same Linux host, same user

Metadata

Labels

  • P3 (Low: cosmetic, nice to have)
  • comp/gateway (Gateway runner, session dispatch, delivery)
  • type/bug (Something isn't working)