fix: handle transient kanban SQLite disk I/O errors #31973
Read through this: the three changes hang together well and the dispatcher-side handling is correct. One behavior-change tradeoff is worth calling out explicitly for the maintainer, since it's the crux of the PR: Removing […]

That's a defensible call (surface over mask: a transient I/O error genuinely isn't a WAL-capability signal, and masking it as 'WAL unsupported' hides a real FS problem), and you've backstopped it well: […]

The only thing I'd want confirmed: for a user on a genuinely WAL-incapable mount that also surfaces as 'disk i/o error' (some FUSE/network setups conflate these), do they still end up in a working DELETE-mode session, or do they now hit the propagated raise on every init? If the latter, a brief note in the PR on the intended migration for those users would help. Logic itself looks sound.
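For readers following the tradeoff, here is a minimal sketch of the surface-over-mask idea, not the PR's actual code. It assumes a hypothetical helper `set_journal_mode` that retries a `disk I/O error` a few times and then re-raises it, and falls back to DELETE mode only for other `sqlite3.OperationalError` failures. The retry count, the delay, and the substring match are illustrative assumptions.

```python
import sqlite3
import time


def is_transient_io_error(exc: sqlite3.OperationalError) -> bool:
    # Assumption: match on SQLite's "disk I/O error" message text.
    return "disk i/o error" in str(exc).lower()


def set_journal_mode(conn, retries: int = 3, delay: float = 0.05) -> str:
    """Hypothetical helper: try WAL, surface transient I/O errors.

    A disk I/O error is retried and then re-raised rather than being
    treated as proof that WAL is unsupported. Any other OperationalError
    is assumed (for this sketch only) to mean WAL is unavailable, so the
    connection falls back to DELETE mode.
    """
    last_exc = None
    for attempt in range(max(1, retries)):
        try:
            row = conn.execute("PRAGMA journal_mode=WAL").fetchone()
            return str(row[0]).lower()
        except sqlite3.OperationalError as exc:
            if not is_transient_io_error(exc):
                row = conn.execute("PRAGMA journal_mode=DELETE").fetchone()
                return str(row[0]).lower()
            last_exc = exc
            time.sleep(delay * (attempt + 1))  # simple linear backoff
    raise last_exc
```

Under this shape, the mount described in the question above (WAL-incapable but reporting `disk I/O error`) would hit the re-raise on every init, which is exactly the case the question asks the author to confirm.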
Kanban t_c0d6fa7b reopened this after the 2026-05-26 recurrence. Update:
Verification:
Closing per project governance: this work must not be merged into NousResearch/hermes-agent or any upstream/third-party branch from our automation. Keep the branch only in nuch1011/hermes-agent for local/fork reference unless Christian explicitly requests an upstream contribution.
Summary
- […] `disk I/O error` as proof that WAL is unsupported.

Test Plan
- `git diff --check`
- `/usr/local/lib/hermes-agent/venv/bin/python -m pytest tests/test_hermes_state_wal_fallback.py tests/hermes_cli/test_kanban_core_functionality.py -q -o 'addopts='`
- Added-line no-secrets scan over `git diff`

Kanban: `t_1719c3ef`