fix(cli): pin HERMES_KANBAN_BOARD at chat boot to stop subprocess board drift (salvages #20094) #20186

Merged: teknium1 merged 2 commits into main from hermes/hermes-9ddf5187 on May 5, 2026

@teknium1 teknium1 commented May 5, 2026

Chat sessions now pin HERMES_KANBAN_BOARD into the process env at boot, so in-process kanban_* tools and shelled-out hermes kanban … subprocesses always resolve to the same board even if a concurrent session runs hermes kanban boards switch mid-turn.

Salvaged from #20094 (@0xDevNinja).

Root cause: kanban_db.connect() resolves the active board via a two-source chain: the HERMES_KANBAN_BOARD env var first, then the global <root>/kanban/current file. Agent tool calls run in the chat process, where the env var may be set; shell-outs spawn fresh subprocesses that re-run the chain and, when the variable is not present in their environment, fall through to the current file. The current file is global mutable state (another session's boards switch flips it), so mid-turn the same chat can route tool calls to board A while its shell calls hit board B. Symptom: kanban_create returns a task id, immediately followed by hermes kanban show <id> reporting "no such task."
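The drift described above can be reproduced in miniature. The following is a minimal sketch, not the actual kanban_db code: `resolve_board`, `demo_drift`, and the `DEFAULT_BOARD` fallback are hypothetical names introduced for illustration, and the environment is passed in as a plain dict so the example does not touch the real process env.

```python
import tempfile
from pathlib import Path

DEFAULT_BOARD = "default"  # hypothetical fallback, not taken from the PR


def resolve_board(root: Path, env: dict) -> str:
    """Two-source chain: env var first, then <root>/kanban/current."""
    pinned = env.get("HERMES_KANBAN_BOARD")
    if pinned:
        return pinned
    try:
        return (root / "kanban" / "current").read_text().strip() or DEFAULT_BOARD
    except OSError:
        return DEFAULT_BOARD


def demo_drift() -> tuple:
    """Show an unpinned caller drifting while a pinned caller stays put."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "kanban").mkdir()
        current = root / "kanban" / "current"
        current.write_text("board-a\n")

        unpinned_env = {}
        at_boot = resolve_board(root, unpinned_env)

        # Pin the boot-time board, as the fix does for the chat process.
        pinned_env = {"HERMES_KANBAN_BOARD": at_boot}

        # A concurrent session flips the global pointer mid-turn.
        current.write_text("board-b\n")

        after_switch_unpinned = resolve_board(root, unpinned_env)
        after_switch_pinned = resolve_board(root, pinned_env)
        return at_boot, after_switch_unpinned, after_switch_pinned
```

Under these assumptions, the unpinned caller follows the pointer to the new board after the switch, while the pinned caller keeps resolving to the board seen at boot.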

Mirrors the pin the dispatcher already does for spawned workers (kanban_db.py:2622-2623), so every code path resolves the same DB consistently.

Changes

  • hermes_cli/main.py: new _pin_kanban_board_env() helper, called from cmd_chat after the existing env-flag block, before forking to TUI vs CLI. Idempotent — no-op if HERMES_KANBAN_BOARD is already set; swallows get_current_board() failures so chat boot never crashes because the kanban dir is unreadable.
  • tests/hermes_cli/test_pin_kanban_board_env.py: three tests covering pin-when-unset, no-op-when-set, and swallow-on-failure. Autouse fixture snapshots and restores HERMES_KANBAN_BOARD around each test (the helper writes to os.environ directly, bypassing monkeypatch tracking — without the fixture the mutation leaked into TestSharedBoardPaths in the same suite).

Validation

| Scenario | Before | After |
|---|---|---|
| Concurrent boards switch mid-turn | kanban_create and hermes kanban show <id> hit different boards → "no such task" | both resolve to the chat's boot-time board |
| HERMES_KANBAN_BOARD already set | unchanged | unchanged (idempotent no-op) |
| Kanban dir missing/unreadable | could crash chat boot | swallowed, chat proceeds |
| Targeted tests | | 3/3 in new test file, 339/339 across full kanban suite |

Includes a small follow-up commit fixing env-var leakage in the test file — the original test used monkeypatch.delenv but the helper writes os.environ[...] directly, so the mutation survived teardown and broke 9 unrelated tests in the same suite. Added an autouse snapshot/restore fixture.
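The snapshot/restore idea behind that fixture can be sketched as a plain context manager. This is an assumption-laden illustration rather than the fixture from the PR: `preserve_env_var` is a hypothetical name, and the real test file wraps the same logic in a pytest autouse fixture.

```python
import os
from contextlib import contextmanager

ENV_VAR = "HERMES_KANBAN_BOARD"


@contextmanager
def preserve_env_var(name: str = ENV_VAR):
    """Snapshot one env var on entry; restore (or pop) it on exit."""
    saved = os.environ.get(name)
    try:
        yield
    finally:
        if saved is None:
            # The var was unset before: remove whatever the code under test wrote.
            os.environ.pop(name, None)
        else:
            # The var was set before: put the original value back.
            os.environ[name] = saved
```

In a pytest suite, an autouse fixture could simply `yield` inside `with preserve_env_var():`, which restores the variable regardless of whether the code under test wrote to `os.environ` directly.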

Closes #20074

Co-authored-by: 0xDevNinja <manmit0x@gmail.com>

0xDevNinja and others added 2 commits May 5, 2026 04:35
…rd drift

Without an explicit pin, in-process kanban tools and shelled-out
`hermes kanban …` subprocesses resolve the active board on different
paths: the env var when set, otherwise the global `<root>/kanban/current`
file. When a concurrent session toggles the current-board pointer
mid-turn, the same chat ends up routing tool calls to board A while its
shell calls hit board B, surfacing as phantom "no such task" errors.

Pin the resolved board into env once at `cmd_chat` boot when
HERMES_KANBAN_BOARD isn't already set. Mirrors what the dispatcher does
for spawned workers (kanban_db.py:2622-2623). Idempotent and a no-op
when the env is already pinned by the caller.

Closes #20074

The helper under test writes to os.environ directly, bypassing
monkeypatch tracking. Without an explicit snapshot/restore fixture,
the mutation leaks into subsequent tests and breaks TestSharedBoardPaths
(kanban path resolution reads HERMES_KANBAN_BOARD and routes through
boards/<leaked-slug>/ instead of the test's own HERMES_HOME).

Add an autouse fixture that snapshots the env var before the test and
restores (or pops) it after, regardless of what the helper did.
@teknium1 teknium1 merged commit f8a6db6 into main May 5, 2026
9 of 10 checks passed
@teknium1 teknium1 deleted the hermes/hermes-9ddf5187 branch May 5, 2026 11:37
@alt-glitch alt-glitch added type/bug Something isn't working P3 Low — cosmetic, nice to have comp/cli CLI entry point, hermes_cli/, setup wizard comp/plugins Plugin system and bundled plugins labels May 5, 2026

Development

Successfully merging this pull request may close these issues.

[Bug]: harness kanban CLI invoked from agent session ignores active-board pin, races current file with concurrent boards switch
