kanban: dispatcher auto-promotes blocked task → respawn worker → protocol_violation loop #28712

@vyductan

Summary

The kanban dispatcher auto-promotes a task that its worker correctly blocked (outcome=blocked, reason=review-required) and spawns a fresh worker that has no actionable instructions left. The fresh worker reads the task body, sees the existing review-required handoff comment, finds nothing to do (the work is already applied to disk), and exits cleanly without calling kanban_complete or kanban_block. The dispatcher records this as protocol_violation, and the cycle repeats indefinitely.

Hermes version

Hermes Agent v0.14.0 (2026.5.16)
Up to date

Reproduce evidence (real task t_9d1f36e2)

Worker successfully blocks for human review:

[2026-05-19 17:23] [run 7] claimed
[2026-05-19 17:23] [run 7] spawned {pid: 78840}
[2026-05-19 17:32] commented {author: default, len: 1981}    # review-required handoff posted
[2026-05-19 17:32] [run 7] blocked {reason: "review-required: ..."}

Dispatcher then promotes the blocked task back to ready and respawns:

[2026-05-19 17:43] promoted
[2026-05-19 17:43] [run 11] claimed
[2026-05-19 17:43] [run 11] spawned {pid: 83656}
[2026-05-19 17:45] [run 11] protocol_violation {exit_code: 0}
[2026-05-19 17:45] gave_up
[2026-05-19 17:45] promoted                  # <- loops again
[2026-05-19 17:45] [run 12] claimed
[2026-05-19 17:45] [run 12] spawned
[2026-05-19 17:56] [run 12] protocol_violation {exit_code: 0}
[2026-05-19 17:56] gave_up
[2026-05-19 17:56] promoted                  # <- and again
[2026-05-19 17:56] [run 13] claimed

The loop only stopped after I manually ran hermes kanban reclaim and then hermes kanban block again.
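For anyone triaging similar histories, a rough sketch of how to spot this pattern in a pasted event log follows. It assumes the line format shown above and assumes an operator unblock would appear as an event named unblocked; that event name is a guess, not something confirmed from the Hermes source.

```python
import re

# Matches lines such as "[2026-05-19 17:43] [run 11] claimed"
# or "[2026-05-19 17:43] promoted". The format is taken from the log above.
EVENT_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?:\[run (?P<run>\d+)\]\s+)?(?P<kind>\w+)"
)


def find_unrequested_promotions(lines):
    """Return timestamps of 'promoted' events that happen while the task is
    blocked and no unblock has been seen since the block."""
    blocked = False
    hits = []
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if not match:
            continue  # skip blank lines and anything that is not an event
        kind = match.group("kind")
        if kind == "blocked":
            blocked = True
        elif kind == "unblocked":  # ASSUMPTION: hypothetical event name
            blocked = False
        elif kind == "promoted" and blocked:
            hits.append(match.group("ts"))
    return hits
```

Run against the history of t_9d1f36e2 above, this flags all three promoted events (17:43, 17:45, 17:56), since none of them is preceded by an unblock.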

Expected behavior

A task whose latest run ended with outcome=blocked should NOT be auto-promoted by the dispatcher. Promotion to ready should require explicit operator action (hermes kanban unblock <id>), exactly like the documented human-in-the-loop pattern in the kanban-orchestrator skill:

Any task can kanban_block() to wait for input. Dispatcher respawns after /unblock.

The dispatcher is respawning even without /unblock.

Actual behavior

Dispatcher promotes the blocked task back to ready after some interval, even though no unblock was issued. The fresh worker has no instructions to act on (work already applied, review handoff already posted) so it exits cleanly without calling kanban_complete/kanban_block. Dispatcher records protocol_violation, gives up that run, and... promotes again.

Impact

  • Burns API calls in a tight loop (each respawn = full agent boot, context load, possibly tool calls before the worker realizes there's nothing to do).
  • Pollutes task history with phantom crashed runs that aren't actually crashes.
  • hermes kanban diag flags it as repeated_crashes which is misleading — the original work succeeded.
  • On rate-limited providers (we hit 429s on Kiro/Anthropic during this), the loop amplifies the rate-limit pressure.

Suggested fix

In the dispatcher promotion logic, check the task's most recent run's outcome:

  • If outcome IN ('blocked', 'completed') — don't promote unless an unblock event has been recorded since.
  • Or: track a requires_human_unblock flag on the task that gets set when a worker calls kanban_block and only cleared by unblock.

Either way, a worker-issued kanban_block should be sticky until the operator unblocks it.
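A minimal sketch of the first option, to make the intended rule concrete. This is not the dispatcher's actual code: the Event type, the may_auto_promote name, and the unblocked event kind are all hypothetical and stand in for whatever the real event store exposes.

```python
from dataclasses import dataclass


@dataclass
class Event:
    # ASSUMPTION: hypothetical stand-in for a task event row.
    kind: str  # e.g. "blocked", "completed", "unblocked", "promoted"


# Outcomes that should hold the task until an operator unblocks it.
STICKY_OUTCOMES = {"blocked", "completed"}


def may_auto_promote(events):
    """Walk the history newest-first. An unblock found before any sticky
    outcome allows promotion; a sticky outcome found first denies it."""
    for event in reversed(events):
        if event.kind == "unblocked":  # ASSUMPTION: hypothetical event name
            return True
        if event.kind in STICKY_OUTCOMES:
            return False
    return True  # no sticky outcome in the history at all
```

With this rule, a history ending in blocked, promoted, protocol_violation, gave_up still returns False, so a later protocol_violation cannot mask the earlier worker-issued block.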

Tangentially related

While reproducing this I also found that hermes kanban init failed with no such column: session_id on a kanban.db that pre-dated the session_id migration. Workaround was ALTER TABLE tasks ADD COLUMN session_id TEXT; CREATE INDEX IF NOT EXISTS idx_tasks_session_id ON tasks(session_id);. The migration code in hermes_cli/kanban_db.py:1168-1179 exists but didn't fire on my DB — possibly because init errored out before reaching the schema-upgrade pass. Probably a separate issue but flagging in case it's part of the same code path.
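For reference, the manual workaround can be written as an idempotent step using Python's sqlite3 module. This is only a sketch of the workaround, not the migration in hermes_cli/kanban_db.py; the function name is made up, and it assumes the tasks table already exists.

```python
import sqlite3


def ensure_session_id_column(conn):
    """Add tasks.session_id and its index if they are missing.
    Safe to call repeatedly."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(tasks)")}
    if "session_id" not in columns:
        conn.execute("ALTER TABLE tasks ADD COLUMN session_id TEXT")
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_tasks_session_id ON tasks(session_id)"
    )
    conn.commit()
```

Running a check like this before any query that touches session_id would avoid the ordering problem described above, where init fails before the schema-upgrade pass is reached.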

Metadata

Labels

P3 (Low: cosmetic, nice to have), comp/cli (CLI entry point, hermes_cli/, setup wizard), type/bug (Something isn't working)
