RFC: review the Kanban — multi-profile collaboration board (PR #16100)

> **UPDATE — Apr 28, 2026** — This RFC is now tracking the implemented PR. Current status, review history, and all bug/fix summaries live on PR #16100: https://github.com/NousResearch/hermes-agent/pull/16100
>
> **Since this RFC was filed:**
> - Moved from a cron-driven dispatcher to a real long-lived daemon (`hermes kanban daemon`) with a systemd unit. Cron was burning LLM tokens per tick.
> - **Runs as first-class** — one row per attempt; preserves full retry history with structured `summary` + `metadata` handoff. (Ported from @vulcan-artivus's review.)
> - Dashboard plugin with drag-drop board, Run History, Worker Log panel, live WebSocket updates, and per-task event attribution.
> - **Four audit passes + external review by @erosika + full battle-test suite** — 15 bugs found and fixed across the cycle. `tests/stress/` exercises real multi-process concurrency, real subprocess E2E, property fuzzing (~40k randomized ops, 9 invariants), and scale benchmarks at 10k tasks.
> - **Tutorial + 10 dashboard screenshots** walking the four user stories: `website/docs/user-guide/features/kanban-tutorial.md`.
> - 182/182 main kanban test suite green.
>
> Comments on this issue remain open for v2 design input (workflow templates, structured comments as multi-peer session substrate, skill-aware routing). PR #16100 is the merge-tracking location.

---

# Request for review: Kanban — durable multi-profile collaboration board

**PR:** https://github.com/NousResearch/hermes-agent/pull/16100
**Design spec:** [`docs/hermes-kanban-v1-spec.pdf`](https://github.com/NousResearch/hermes-agent/blob/feat/kanban-standing/docs/hermes-kanban-v1-spec.pdf) (committed in the PR, 14 sections + diagrams + bibliography)
**Design discussion:** Nous Discord thread, April 25–26 2026 (contributors credited at bottom)

Kanban is a new durable, SQLite-backed task board shared across all Hermes profiles on a host. Tasks carry an assignee (a profile name), optional dependency links, a workspace kind (`scratch` / `worktree` / `dir:<path>`), and an optional tenant namespace. A cron-driven dispatcher atomically claims ready tasks and spawns the assigned profile as its own OS process, with no in-process subagent swarms. The `/kanban` slash command works in the CLI and on all gateway platforms (same `COMMAND_REGISTRY` pipe).

Before we merge, we'd like eyes on it from anyone who runs multiple profiles, has opinions on agent coordination primitives, or plans to use this for non-coding workloads (research, ops, digital twins, fleet work). The PR is substantial (~2900 LOC including tests, spec, skills, and docs) and introduces a new top-level concept users will need to reason about alongside `delegate_task`.

## The shape at a glance

- **Board:** `~/.hermes/kanban.db`, WAL-mode SQLite, profile-agnostic. Four tables (`tasks`, `task_links`, `task_comments`, `task_events`) + six indexes.
- **Status machine:** `todo → ready → running → done` (plus `blocked` and `archived` side branches). Each status transition belongs to exactly one role, which eliminates write contention.
- **Atomic claim:** compare-and-swap `UPDATE ... WHERE status='ready' AND claim_lock IS NULL` inside `BEGIN IMMEDIATE`. `BEGIN IMMEDIATE` takes SQLite's write lock up front, so competing claims serialize even in WAL mode; the test suite includes a concurrent-thread race where exactly one of 8 claimers wins.
- **Dispatcher:** `hermes kanban dispatch` — reclaims stale running tasks (15-min claim TTL), promotes `todo → ready` when all parents `done`, atomically claims, spawns `hermes -p <profile> chat -q "work kanban task <id>"` with `HERMES_KANBAN_TASK` / `HERMES_KANBAN_WORKSPACE` / `HERMES_TENANT` env vars set, redirects output to `~/.hermes/kanban/logs/<id>.log`.
- **Workspace kinds:**
  - `scratch` (default) — fresh tmp dir per task, GC'd on archive.
  - `worktree` — git worktree under `.worktrees/<id>/` for coding tasks.
  - `dir:<path>` — existing shared directory (Obsidian vault, mail ops dir, per-account folder).
- **Tenant column:** one nullable string; one specialist fleet can serve many business contexts (`--tenant business-a`) with data isolation by workspace path + memory key prefix.
- **Zero changes to `run_agent.py`.** No new core tools. No tool-schema bloat on any API call.
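The atomic-claim bullet above can be illustrated with a minimal Python sketch. It is not the PR's code: the one-table schema, the `SET` clause, and the helper names `connect`, `claim_task`, and `race_for_task` are assumptions made for illustration. Only the `WHERE status='ready' AND claim_lock IS NULL` guard and the use of `BEGIN IMMEDIATE` come from the description above.

```python
import os
import sqlite3
import tempfile
import threading


def connect(path):
    # isolation_level=None means autocommit, so transactions are
    # controlled explicitly with BEGIN IMMEDIATE / COMMIT below.
    return sqlite3.connect(path, timeout=10, isolation_level=None)


def claim_task(conn, task_id, claimer):
    """Return True if this claimer won the task, False otherwise."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
    try:
        cur = conn.execute(
            # Hypothetical SET clause; the guard in WHERE is the
            # compare-and-swap: it only matches an unclaimed ready task.
            "UPDATE tasks SET status='running', claim_lock=? "
            "WHERE id=? AND status='ready' AND claim_lock IS NULL",
            (claimer, task_id),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return cur.rowcount == 1


def race_for_task(n_claimers=8):
    """Start n_claimers threads that all try to claim task 1."""
    path = os.path.join(tempfile.mkdtemp(), "kanban.db")
    setup = connect(path)
    setup.execute("PRAGMA journal_mode=WAL")
    # Assumed minimal schema, far smaller than the real board.
    setup.execute(
        "CREATE TABLE tasks (id INTEGER PRIMARY KEY, "
        "status TEXT NOT NULL, claim_lock TEXT)"
    )
    setup.execute("INSERT INTO tasks (id, status) VALUES (1, 'ready')")
    setup.close()

    winners = []

    def worker(name):
        conn = connect(path)  # one connection per thread
        try:
            if claim_task(conn, 1, name):
                winners.append(name)
        finally:
            conn.close()

    threads = [
        threading.Thread(target=worker, args=(f"claimer-{i}",))
        for i in range(n_claimers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners


if __name__ == "__main__":
    print(race_for_task())
```

Whichever thread gets the write lock first flips the row; every later `UPDATE` matches zero rows, so `rowcount` tells each claimer whether it won.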

## CLI / gateway surface

Seventeen verbs, all available as both `hermes kanban <verb>` and `/kanban <verb>`:

```
init · create · list · show · assign · link · unlink · claim ·
comment · complete · block · unblock · archive · tail · dispatch · context · gc
```

The slash command bypasses the running-agent guard in the gateway — `/kanban unblock` can free a stuck worker while the main agent is mid-conversation. Board writes don't touch agent state.

## Skills shipped alongside

- `kanban-worker` — how a profile claims context, does work in its workspace, blocks on ambiguity, completes with a result, delegates follow-ups.
- `kanban-orchestrator` — "you are a dispatcher, not a worker" template with anti-temptation rules and a standard specialist roster (`researcher`, `writer`, `analyst`, `backend-eng`, `reviewer`, `ops`).

## Why not just `delegate_task`?

These look similar and they are not the same primitive. The one-sentence distinction: **`delegate_task` is a function call; Kanban is a durable work queue where every handoff is a row any profile (or human) can read and edit.** The full 12-dimension comparison table is in §6 of the spec.

They coexist. A kanban worker may call `delegate_task` internally for reasoning within its own run. The single test: *does this handoff need to outlive a single API loop and be visible to others?*

Use `delegate_task` for short, self-contained reasoning subtasks the parent agent wants an answer to before continuing — seconds-to-minutes, no human in the loop, result goes back into parent's context.

Use Kanban for work that crosses agent boundaries, needs to survive restarts, might need human input, might be picked up by a different role (engineer → reviewer → engineer), or needs to be discoverable after the fact.

## What we'd especially like feedback on

1. **The `delegate_task` / Kanban boundary.** Is the "does this handoff need to outlive a single API loop" test clear enough? Should the spec land a doc page explicitly titled "when to use which"? Are there workloads where you can't tell which side of the line they fall on?

2. **Eight collaboration patterns.** Spec §5 names P1 Fan-out, P2 Pipeline, P3 Voting/quorum, P4 Long-running journal, P5 Human-in-the-loop triage, P6 `@mention` delegation, P7 Thread-scoped workspace, P8 Fleet farming. P6 and P8 are the only patterns that require infra beyond the base primitives (P6 is a parser hook; P8 is a `dispatch-fleet` helper). Is the set right? Missing any obvious shapes?

3. **Workspace kinds.** Three: `scratch`, `worktree`, `dir:<path>`. Research / ops / digital-twin use cases all work with the default `scratch`; coding uses `worktree`; long-running journals and per-subject fleets use `dir:`. Is the kind vocabulary right, or should we flatten it (e.g., always `dir:`; scratch is just an auto-allocated path)?

4. **Tenant as one nullable column.** Design choice: tenants are *namespaces*, not entity types. One researcher profile serves multiple businesses via `--tenant business-a`. Is this enough for people actually running multi-business setups (cc @sudo_relax from the design thread), or does it need more — per-tenant access control, cross-tenant task linking, tenant-scoped profile definitions?

5. **Dispatcher cadence.** Runs via cron, default 60 seconds. Cheap "mini dispatch" (recompute ready) also runs on every `hermes kanban list` invocation to keep laptop-sleep-wake cases responsive. Too aggressive? Too conservative? Worth a dedicated long-lived ticker process instead?

6. **Claim TTL.** Default 15 minutes before a claim is considered stale and reclaimed. Workers that know they'll run longer should call `heartbeat_claim()` periodically. Is 15m the right default, or should it scale with profile (e.g., 60m for `backend-eng`, 5m for `researcher`)?

7. **`terminal`-based spawn, output to log file.** `dispatch_once` uses `subprocess.Popen` with `start_new_session=True` and redirects output to `~/.hermes/kanban/logs/<id>.log`. No stdin. Acceptable, or do we need more — PID recording on the task row? Structured log format? Live-streaming output back to the dispatcher's gateway session?

8. **Orchestrator profile design.** The `kanban-orchestrator` skill plus a recommendation to restrict toolsets to `[kanban, gateway, memory]` is the proposed fix for the "orchestrator does the work itself" failure mode raised by @sudo_relax. Is this enough, or do we need kernel-level enforcement (a "router-only" profile flag that the dispatcher honors)?

9. **Running-agent guard bypass.** `/kanban` is in the bypass list (same tier as `/background`). Mutations are allowed mid-run because the board is profile-agnostic and doesn't touch the running agent's state. Worth a stricter rule — mutations gated, reads allowed?

10. **What's left out.** Deliberately not in v1: per-tenant access control, cross-tenant links, tenant-scoped profile definitions, round-robin worker pools, auto-assignment ("any idle profile claims it"), smart routing, per-agent budgets, approval gates, fleet dashboards, org-chart types. All user-space (plugins or profile conventions). If any of these feel like they belong in the kernel, say so now.

11. **Bugs, edge cases, race conditions** — the usual. The concurrent-claim race is tested; the stale-claim recovery is tested; cycle detection in `link_tasks` is tested (caught a bug during implementation — direction of graph walk). What else should be in the test matrix?
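For reviewers thinking about the cycle-detection case in item 11, here is a small in-memory sketch of the check. It is an illustration under assumptions, not the PR's `link_tasks`: the adjacency-dict representation and the names `would_create_cycle`, `add_link`, and `CycleError` are hypothetical. The point it shows is the walk direction: before adding `parent -> child`, walk downstream from `child` and reject the link if `parent` is reachable.

```python
class CycleError(ValueError):
    """Raised when a new dependency link would close a loop."""


def would_create_cycle(children, parent, child):
    """True if adding the edge parent -> child would create a cycle.

    `children` maps a task id to the set of task ids that depend on it.
    """
    if parent == child:
        return True
    # Walk downstream from `child`. If we can reach `parent`, then
    # `parent` already depends (transitively) on `child`, so the new
    # edge would close a loop.
    stack = [child]
    seen = set()
    while stack:
        node = stack.pop()
        if node == parent:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, ()))
    return False


def add_link(children, parent, child):
    """Add parent -> child, refusing links that would form a cycle."""
    if would_create_cycle(children, parent, child):
        raise CycleError(f"{parent} -> {child} would create a cycle")
    children.setdefault(parent, set()).add(child)


if __name__ == "__main__":
    graph = {}
    add_link(graph, "a", "b")
    add_link(graph, "b", "c")
    add_link(graph, "a", "c")  # a diamond-style shortcut is fine
    try:
        add_link(graph, "c", "a")  # c already depends on a
    except CycleError as exc:
        print("rejected:", exc)
```

Walking upstream from `child` instead would miss the loop in the last call, which is the kind of direction mistake worth a dedicated test case.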

## What Kanban does NOT do (intentionally)

- Does not run workers in-process: every worker is a full OS process with its own `HERMES_HOME`, so there is no SDK-lifecycle fragility (the NanoClaw failure class).
- Does not auto-assign, auto-route, or auto-escalate. All those are user-space profile behaviors.
- Does not delete anything automatically. Archive only; `gc` removes scratch workspace dirs for archived tasks.
- Does not modify `run_agent.py`, `model_tools.py`, or any tool schema.
- Does not invalidate the main session's prompt cache. The board is external to any agent's context.
- Does not cross tenants with task links (v1 limitation; noted in spec §7).

## How to try it locally

```bash
gh pr checkout 16100
hermes kanban init
hermes kanban create "research AI funding" --assignee researcher
hermes kanban list
hermes kanban dispatch --dry-run
# Then for real:
hermes kanban dispatch
```

The two skills (`kanban-worker` + `kanban-orchestrator`) are in `skills/devops/` and load like any other skill.

For a full worked example, see spec §5 (research triage), §6 (the 8 patterns), and §9 (50-account fleet example).
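For anyone tracing what `hermes kanban dispatch` does once a task is claimed, the spawn step described earlier (detached process, kanban env vars, output redirected to a log file, no stdin) can be sketched roughly as below. The helper names `build_worker_command`, `build_worker_env`, and `spawn_worker` are hypothetical, and leaving `HERMES_TENANT` unset when a task has no tenant is an assumption. The command line and env var names are the ones quoted in this RFC.

```python
import os
import subprocess
from pathlib import Path


def build_worker_command(profile, task_id):
    # Command shape as quoted in the RFC text.
    return ["hermes", "-p", profile, "chat", "-q",
            f"work kanban task {task_id}"]


def build_worker_env(task_id, workspace, tenant=None, base_env=None):
    # Start from the dispatcher's environment unless one is supplied.
    env = dict(os.environ if base_env is None else base_env)
    env["HERMES_KANBAN_TASK"] = str(task_id)
    env["HERMES_KANBAN_WORKSPACE"] = str(workspace)
    if tenant is not None:  # assumption: unset when there is no tenant
        env["HERMES_TENANT"] = tenant
    return env


def spawn_worker(argv, env, log_path):
    """Start a detached worker whose output is appended to log_path."""
    log_path = Path(log_path)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with open(log_path, "ab") as log:
        # The child inherits the log file descriptor, so closing our
        # copy when the `with` block exits does not affect it.
        return subprocess.Popen(
            argv,
            env=env,
            stdin=subprocess.DEVNULL,   # no stdin for workers
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,     # detach from the dispatcher (POSIX)
        )


if __name__ == "__main__":
    task_id = "example-task"  # placeholder id for illustration
    log_file = Path.home() / ".hermes" / "kanban" / "logs" / f"{task_id}.log"
    print(build_worker_command("researcher", task_id))
    print(log_file)
```

Keeping command construction separate from the `Popen` call makes the spawn testable with a stand-in executable, without a real `hermes` binary on the path.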

## Related systems & design input

The design synthesizes three existing systems plus one April-2026 release:

- **Cline Kanban** — board + linked tasks + ephemeral worktrees shape. We adopted.
- **Paperclip** — atomic task checkout + persistent agent identity. Mapped onto Hermes profiles.
- **NanoClaw Agent Swarms** — the negative lesson: in-process SDK subagent swarms are fragile to upstream lifecycle semantics. We explicitly reject.
- **Google Gemini Enterprise Agent Designer + CLI Subagents** (April 2026) — portable subagent-as-file artifacts (we'll match in a follow-up `hermes profile export`), and `@name` delegation syntax (implemented as P6).

Community design input from the Nous Discord design thread, credited in the PR body: @Teknium, waxhy, A Real Icehole, Keimpe, LLM.STORE, caco, hunter_cat, djm, ionmanden, psbd, Aiz, Rikllo, sudo_relax, neo2k8.

## Timing

Happy to let this sit for review. The PR is standing: it will not be merged until the design is approved. If something needs to change before we ship, flag it on the PR directly. Higher-level design concerns (primitives, scope boundaries, naming) go here on the RFC.

Thanks in advance for the eyes.


