Skip to content

feat(kanban): worker visibility endpoints (workers/active, runs/{id}, inspect) (#23761)#28432

Merged
teknium1 merged 1 commit into
mainfrom
hermes/hermes-de55f5dd
May 19, 2026
Merged

feat(kanban): worker visibility endpoints (workers/active, runs/{id}, inspect) (#23761)#28432
teknium1 merged 1 commit into
mainfrom
hermes/hermes-de55f5dd

Conversation

@teknium1

Copy link
Copy Markdown
Contributor

Salvages #23761 by @Interstellar-code.

Adds worker visibility endpoints to dashboard plugin API (463 LOC), self-contained, plugin_api exists.

Cherry-picked onto current main with original authorship preserved via rebase merge.

… inspect)

Adds three read-only endpoints to the kanban dashboard plugin so the
SwitchUI workspace (and any other dashboard consumer) can track
workers across tasks without N+1 round-trips through /tasks/{task_id}.

- GET /workers/active
  Single SQL JOIN of task_runs + tasks where ended_at IS NULL,
  worker_pid IS NOT NULL, status='running'. Returns
  {workers: [...], count, checked_at}.

- GET /runs/{run_id}
  Direct lookup of any task_run row by id. Reuses existing
  kanban_db.get_run() helper and _run_dict() serialiser. 404 when
  not found. Mirrors GET /tasks/{task_id} 404 shape.

- GET /runs/{run_id}/inspect
  Live PID stats via psutil.Process.as_dict() — cpu_percent,
  memory_rss_bytes, memory_vms_bytes, num_threads, num_fds, status,
  create_time, cmdline. Short-circuits with alive:false when run
  has ended, has no worker_pid, the pid is gone, or psutil is
  unavailable. AccessDenied surfaces as alive:true with error
  rather than a 500.

11 new tests in tests/plugins/test_kanban_worker_runs.py cover the
empty-board case, running-task case, ended-run filtering,
missing-pid filtering, 404 paths, already-ended inspect, no-pid
inspect, dead-pid inspect, and live-pid inspect (psutil mocked).
All pass.

Companion termination endpoint (POST /runs/{run_id}/terminate) is
intentionally out of scope here — opening a separate issue first
since the RBAC and dispatcher-mediated soft-cancel design needs
maintainer input before code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@teknium1 teknium1 merged commit 02efad7 into main May 19, 2026
@teknium1 teknium1 deleted the hermes/hermes-de55f5dd branch May 19, 2026 04:01
@alt-glitch alt-glitch added type/feature New feature or request P3 Low — cosmetic, nice to have comp/plugins Plugin system and bundled plugins labels May 19, 2026
@github-actions

Copy link
Copy Markdown
Contributor

🔎 Lint report: hermes/hermes-de55f5dd vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 8819 on HEAD, 8815 on base (🆕 +4)

🆕 New issues (4):

Rule Count
unresolved-import 4
First entries
tests/plugins/test_kanban_worker_runs.py:18: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
plugins/kanban/dashboard/plugin_api.py:1059: [unresolved-import] unresolved-import: Cannot resolve imported module `psutil`
tests/plugins/test_kanban_worker_runs.py:20: [unresolved-import] unresolved-import: Cannot resolve imported module `fastapi.testclient`
tests/plugins/test_kanban_worker_runs.py:19: [unresolved-import] unresolved-import: Cannot resolve imported module `fastapi`

✅ Fixed issues: none

Unchanged: 4627 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

teknium1 pushed a commit that referenced this pull request May 29, 2026
Closes the termination-control gap left by PR #28432, which shipped the
read-only sibling endpoints (/workers/active, /runs/{run_id},
/runs/{run_id}/inspect) but no way to stop a misbehaving worker from
the dashboard without dropping to the CLI.

The new endpoint resolves run_id -> task_id and delegates to the
existing kanban_db.reclaim_task() flow, so the SIGTERM->SIGKILL
escalation, run-outcome bookkeeping, and event-log append all match
POST /tasks/{task_id}/reclaim exactly. No new termination semantics
introduced.

Responses:
  200 {ok, run_id, task_id} on success
  404 unknown run_id
  409 run already ended OR task no longer reclaimable

Refs: #23762
KKT-OPT pushed a commit to KKT-OPT/hermes-agent that referenced this pull request May 31, 2026
Closes the termination-control gap left by PR NousResearch#28432, which shipped the
read-only sibling endpoints (/workers/active, /runs/{run_id},
/runs/{run_id}/inspect) but no way to stop a misbehaving worker from
the dashboard without dropping to the CLI.

The new endpoint resolves run_id -> task_id and delegates to the
existing kanban_db.reclaim_task() flow, so the SIGTERM->SIGKILL
escalation, run-outcome bookkeeping, and event-log append all match
POST /tasks/{task_id}/reclaim exactly. No new termination semantics
introduced.

Responses:
  200 {ok, run_id, task_id} on success
  404 unknown run_id
  409 run already ended OR task no longer reclaimable

Refs: NousResearch#23762
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/plugins Plugin system and bundled plugins P3 Low — cosmetic, nice to have type/feature New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants