feat(kanban): gate notifier watcher on dispatch_in_gateway #31964
steveonjava wants to merge 2 commits
When dispatch_in_gateway is false, the Kanban notifier watcher now exits immediately after the initial status sync. This prevents multi-gateway deployments from running duplicate watchers when only one gateway is acting as dispatcher. No behavior change for single-gateway setups, where dispatch_in_gateway keeps its default of true.
6f37aa0 to 573e50e
|
Rebased onto current |
|
Heads up on a CI failure in this PR's checks that doesn't look related to the change here:
Looks like a known-fragile area. The sibling test Locally the test passes 60-for-60 in <1s. On I don't want to bolt a
Let me know which route makes sense. |
Non-dispatch gateways no longer open per-board kanban DBs for notifier polling. Mirrors the existing dispatcher gate (config kanban.dispatch_in_gateway, default True; env override HERMES_KANBAN_DISPATCH_IN_GATEWAY) so multi-gateway setups collapse to a single process holding kanban.db file descriptors. Salvaged from PR #31964 by @steveonjava; tests and docs trimmed during salvage.
|
Salvaged onto current main via #37174 with your authorship preserved (rebase-merge). Trimmed during salvage — removed a duplicate test and the redundant ASCII diagram, 188 to 137 LOC. Thanks @steveonjava! |
What changes
When you run multiple gateway processes (e.g. one per messaging platform, or a separate dispatch-only process), only the gateway that owns kanban dispatch needs to poll kanban databases for task-completion notifications. This PR adds that gate.
Non-dispatch gateways no longer open kanban DBs for notifier polling. No behavior change for single-gateway setups: the gate is open by default unless kanban.dispatch_in_gateway: false is set in config (or the HERMES_KANBAN_DISPATCH_IN_GATEWAY=false env var is set).
Why you'd notice this
Before this change, running N gateway processes on the same host caused all N processes to open the kanban DB files concurrently. For each open, SQLite's WAL mode writes to the -shm shared-memory file. Under moderate task throughput those concurrent opens compound into reader-count exhaustion inside the -shm file, which surfaces as database corruption errors or I/O errors in the non-dispatch gateways. The corruption only affects non-dispatch processes (they only polled, never wrote), but the error messages look alarming.
After this change the non-dispatch gateways exit the notifier path early, reducing concurrent DB opens to 1.
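The early exit described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: the helper names `dispatch_in_gateway_enabled` and `run_notifier_watcher`, the dict-shaped config, and the set of accepted truthy/falsy strings are all assumptions. Only the config key `kanban.dispatch_in_gateway`, its default of True, and the `HERMES_KANBAN_DISPATCH_IN_GATEWAY` env override come from the description.

```python
import os

# Accepted spellings for the env override (assumption; the real parser may differ).
_TRUTHY = {"1", "true", "yes", "on"}
_FALSY = {"0", "false", "no", "off"}


def dispatch_in_gateway_enabled(config, environ=None):
    """Hypothetical helper: resolve the gate from env first, then config, then default True."""
    environ = os.environ if environ is None else environ
    raw = environ.get("HERMES_KANBAN_DISPATCH_IN_GATEWAY")
    if raw is not None:
        value = raw.strip().lower()
        if value in _FALSY:
            return False
        if value in _TRUTHY:
            return True
        # Unrecognised env values fall through to the config value (assumption).
    kanban = config.get("kanban") or {}
    return bool(kanban.get("dispatch_in_gateway", True))


def run_notifier_watcher(config, poll_boards, environ=None):
    """Hypothetical watcher entry point: skip polling entirely on non-dispatch gateways.

    `poll_boards` stands in for whatever opens the per-board kanban DBs.
    Returns True if polling ran, False if the gate closed the path.
    """
    if not dispatch_in_gateway_enabled(config, environ):
        return False  # no kanban DB is opened on this gateway
    poll_boards()
    return True
```

With this shape, a non-dispatch gateway returns before touching any DB file, so only the dispatching process holds kanban.db file descriptors.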
When to use
Set kanban.dispatch_in_gateway: false in each non-dispatch gateway's config, or set HERMES_KANBAN_DISPATCH_IN_GATEWAY=false. This eliminates the concurrent-open problem entirely.
How to Test
All 5 notifier-gate tests pass. See docs/kanban/multi-gateway.md for deployment patterns.
Checklist
docs/kanban/multi-gateway.md)