Problem or Use Case
Kanban has recently grown into a durable multi-process task queue, but some queue invariants are still only enforced indirectly through prompts, skill docs, or manual operator recovery.
This shows up as a cluster of related failure modes rather than one isolated bug:
- tasks can be created with invalid skills values that are actually toolset names
- ready tasks can remain dispatchable even when they are obviously undispatchable
- stale / reclaimed / superseded workers can still begin work unless ownership is checked at startup
- operators sometimes need diagnostics and narrow recovery commands for stuck task state
Related issues include:
The problem is not that Kanban needs a redesign. The problem is that a few important invariants are not yet enforced consistently across task creation, dispatch, and worker startup.
Proposed Solution
Do a narrow Kanban hardening pass across three layers:
- Task validity and diagnostics
  - reject known toolset names in task.skills at create time
  - surface historical bad rows through diagnostics
  - add minimal recovery support for invalid persisted skills
- Dispatch preflight for obviously undispatchable tasks
  - hard-skip ready tasks that are clearly invalid before spawn
  - initial hard-skip cases:
    - invalid persisted task skills
    - missing assignee profile
  - keep softer capability concerns advisory-only rather than blocking dispatch
- Worker startup ownership guard
  - verify task/run/claim ownership at worker startup before useful work begins
  - exit benignly when the worker turns out to be stale, reclaimed, or superseded
  - avoid counting those ownership/lifecycle exits as real worker failures
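To make the first two layers concrete, here is a minimal sketch of the create-time check and the dispatch preflight. It is illustrative only: `KNOWN_TOOLSETS`, `Task`, `create_task`, `preflight`, and the reason strings are hypothetical names, not Kanban's existing API, and the toolset names in the set are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

# Hypothetical registry; the real set of toolset names would come from the project.
KNOWN_TOOLSETS = frozenset({"terminal", "web", "file"})


@dataclass
class Task:
    # Hypothetical, simplified task row.
    id: str
    assignee: Optional[str] = None
    skills: List[str] = field(default_factory=list)
    status: str = "ready"


def invalid_skills(skills: List[str]) -> List[str]:
    """Return the skills entries that are actually toolset names."""
    return [s for s in skills if s in KNOWN_TOOLSETS]


def create_task(task_id: str, assignee: Optional[str], skills: List[str]) -> Task:
    """Create-time check: refuse to persist toolset names as skills."""
    bad = invalid_skills(skills)
    if bad:
        raise ValueError(f"task.skills contains toolset names: {bad}")
    return Task(id=task_id, assignee=assignee, skills=list(skills))


def preflight(task: Task, known_profiles: Set[str]) -> List[str]:
    """Dispatch preflight: return hard-skip reasons; an empty list means dispatchable."""
    reasons = []
    # Historical rows may predate the create-time check, so re-validate here.
    if invalid_skills(task.skills):
        reasons.append("invalid_skills")
    if task.assignee not in known_profiles:
        reasons.append("missing_assignee_profile")
    return reasons
```

The same `invalid_skills` helper could back the diagnostics path, so that create-time rejection, diagnostics, and preflight all agree on what counts as a bad row. Softer capability concerns are deliberately absent from `preflight`, since the proposal keeps them advisory-only.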
This should stay intentionally narrow:
- no required_toolsets manifest
- no default profile permission expansion
- no full capability model
- no broad dispatcher redesign
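Within that narrow scope, the worker startup ownership guard could be as small as one comparison between the claim a worker was spawned with and the claim currently on record. The sketch below assumes a claim is identified by task, run, and worker ids; `Claim`, `StartupDecision`, `check_ownership`, and `failure_count_after_exit` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class StartupDecision(Enum):
    PROCEED = "proceed"
    BENIGN_EXIT = "benign_exit"


@dataclass(frozen=True)
class Claim:
    # Hypothetical identity of a claim: which worker owns which run of which task.
    task_id: str
    run_id: str
    worker_id: str


def check_ownership(mine: Claim, current: Optional[Claim]) -> StartupDecision:
    """Compare the claim this worker was spawned with against the claim on record."""
    # No claim on record (reclaimed) or a different run/worker (stale, superseded):
    # stop before doing any useful work.
    if current is None or current != mine:
        return StartupDecision.BENIGN_EXIT
    return StartupDecision.PROCEED


def failure_count_after_exit(
    decision: StartupDecision, worker_errored: bool, failure_count: int
) -> int:
    """Ownership/lifecycle exits never count as real worker failures."""
    if decision is StartupDecision.BENIGN_EXIT:
        return failure_count
    return failure_count + 1 if worker_errored else failure_count
```

A worker would call `check_ownership` once at startup, after reading the current claim from the queue's store, and exit cleanly on `BENIGN_EXIT`. How the current claim is read is left out here because it depends on the existing storage layer.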
Alternatives Considered
A few broader approaches were considered, but deferred on purpose:
- Full capability modeling for profiles and tasks
  - likely useful later, but too large for the current bug cluster
- Expanding default profile toolsets
  - changes default permissions and invites a bigger policy debate
- A broader dispatcher redesign
  - unnecessary for the current failure chain
- Relying only on prompt/skill guidance
  - this is the current weak point; core ownership and validity checks should live in the system, not only in model behavior
The proposed approach is better because it closes the concrete failure chain with a small number of reviewable changes.
Feature Type
Reliability / correctness
Scope
Medium (few files, < 300 lines)
Contribution
Linked / stacked PRs