feat: add durable execution work orders #8403

Closed
MestreY0d4-Uninter wants to merge 1 commit into NousResearch:main from MestreY0d4-Uninter:h007-durable-work-orders

@MestreY0d4-Uninter


Summary

  • add durable execution work orders persisted in JSON + SQLite
  • add queue / run_due, retry, resume, cancel, and reclaim_stale operations
  • add cron-backed runner install/status/remove plus operator-facing workorders CLI/slash surfaces

H007 line overview

This is part 2 of the same H007 line as #8402.

Intended review shape:

The architectural point is narrowness, not breadth:

  • due work orders still execute through the official delegate_task direct path
  • receipts and runtime metadata remain intact
  • resumability here means durable replay/re-entry of an explicit one-shot work order, not process-level continuation of arbitrary in-flight jobs

Motivation

Once the narrow direct lane is auditable, the next H007 step is to make it durable.

This PR adds a small control-plane slice around that lane:

  • persisted work orders
  • queueing and due execution
  • retry scheduling
  • stale-run reclaim
  • operator-visible runner controls

It deliberately avoids inventing a second execution path.

What this PR changes

  • add tools/execution_work_orders.py
    • durable JSON + SQLite work-order ledger
    • explicit statuses: queued, running, retry_scheduled, completed, failed, cancelled
  • add tools/execution_work_orders_tool.py
    • list, query, enqueue, run_due, reclaim_stale, retry, resume, cancel, runner_status, install_runner, remove_runner
  • add operator-facing work-order surfaces
    • hermes workorders ...
    • /workorders ...
  • add cron-backed runner configuration for due work-order execution
  • keep execution explicitly scoped to direct_terminal_work_order
  • add focused tests for queue semantics, tool surface, and CLI/slash management
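
The six explicit statuses listed above can be read as a small state machine. The following is a minimal sketch of that idea, not the code in tools/execution_work_orders.py: the `Status` enum, the `ALLOWED` table, and the `transition` helper are hypothetical names, and the specific transitions are assumptions inferred from the operation names (retry, resume, cancel, reclaim_stale).

```python
from enum import Enum


class Status(str, Enum):
    """The six statuses named in the PR description."""
    QUEUED = "queued"
    RUNNING = "running"
    RETRY_SCHEDULED = "retry_scheduled"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


# Hypothetical transition table (an assumption, not taken from the PR).
ALLOWED = {
    # run_due picks the order up, or an operator cancels it
    Status.QUEUED: {Status.RUNNING, Status.CANCELLED},
    # normal outcomes, plus reclaim_stale moving a stuck run back to queued
    Status.RUNNING: {
        Status.COMPLETED,
        Status.FAILED,
        Status.RETRY_SCHEDULED,
        Status.QUEUED,
    },
    Status.RETRY_SCHEDULED: {Status.RUNNING, Status.CANCELLED},
    # an explicit retry/resume re-queues a failed order
    Status.FAILED: {Status.QUEUED},
    # terminal states
    Status.COMPLETED: set(),
    Status.CANCELLED: set(),
}


def transition(current: Status, target: Status) -> Status:
    """Return the target status, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Under these assumptions, `transition(Status.QUEUED, Status.RUNNING)` succeeds, while moving a completed order back to running raises `ValueError`. Keeping the table explicit makes it cheap to test every operation against one source of truth.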

Validation

Focused local validation on this branch:

source /home/ubuntu/hermes-agent-dev/extraordinary-prototypes-2026-04-11/.venv/bin/activate
python -m py_compile \
  cli.py \
  hermes_cli/commands.py \
  hermes_cli/main.py \
  hermes_cli/tools_config.py \
  hermes_cli/work_orders.py \
  model_tools.py \
  toolsets.py \
  tools/execution_work_orders.py \
  tools/execution_work_orders_tool.py \
  tests/hermes_cli/test_work_orders.py \
  tests/tools/test_execution_work_orders.py \
  tests/tools/test_execution_work_orders_tool.py

pytest \
  tests/run_agent/test_run_agent.py \
  tests/tools/test_delegate.py \
  tests/tools/test_docker_environment.py \
  tests/tools/test_tool_result_storage.py \
  tests/tools/test_execution_receipts.py \
  tests/tools/test_execution_receipts_tool.py \
  tests/hermes_cli/test_receipts.py \
  tests/tools/test_execution_work_orders.py \
  tests/tools/test_execution_work_orders_tool.py \
  tests/hermes_cli/test_work_orders.py \
  -q -o addopts=''

Result:

  • 434 passed, 1 warning

External real-run evidence for this queue layer on the supported slice:

  • classic warm: 4.37s duration / 4.56s wall
  • direct warm: 0.20s duration / 0.38s wall
  • queued warm: 0.22s duration / 0.44s wall
  • queued warm vs classic warm: about 94.97% lower duration
  • queued warm vs classic warm: about 90.35% lower wall time
  • 30/30 valid runs
  • 30/30 exact-match runs
  • one reused concrete Docker runtime ID across valid queued-warm runs
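
For reference, the two percentages above follow from the reported timings if they are read as relative reductions against the classic warm baseline. The helper name `percent_reduction` is only illustrative.

```python
def percent_reduction(baseline: float, candidate: float) -> float:
    """Relative reduction of candidate versus baseline, in percent."""
    return (baseline - candidate) / baseline * 100.0


# Timings in seconds, taken from the list above.
duration = percent_reduction(4.37, 0.22)  # classic warm vs queued warm, duration
wall = percent_reduction(4.56, 0.44)      # classic warm vs queued warm, wall

print(f"duration: {duration:.2f}%")  # duration: 94.97%
print(f"wall: {wall:.2f}%")          # wall: 90.35%
```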

Real scheduler/resumability validation also proved:

  • delayed work orders remain queued until due
  • retry-once completes on attempt 2
  • stale-running work orders are reclaimed and then complete on the next run_due
  • completed work orders stay linked to real execution receipts
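
The scheduler behaviours listed above can be illustrated with a small SQLite-backed queue. This is a sketch under stated assumptions, not the PR's implementation: the table layout, the function signatures, the injected `now` clock, and the `execute` callback (standing in for the delegate_task direct path) are all hypothetical.

```python
import sqlite3


def open_ledger() -> sqlite3.Connection:
    """Create an in-memory ledger; the schema is an assumption."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE work_orders ("
        " id INTEGER PRIMARY KEY, command TEXT NOT NULL, status TEXT NOT NULL,"
        " due_at REAL NOT NULL, attempts INTEGER NOT NULL DEFAULT 0,"
        " max_attempts INTEGER NOT NULL, started_at REAL)"
    )
    return conn


def enqueue(conn, command, due_at, max_attempts=2):
    """Persist a new work order in the queued state and return its id."""
    cur = conn.execute(
        "INSERT INTO work_orders (command, status, due_at, max_attempts)"
        " VALUES (?, 'queued', ?, ?)",
        (command, due_at, max_attempts),
    )
    conn.commit()
    return cur.lastrowid


def run_due(conn, now, execute, retry_delay=5.0):
    """Run every queued or retry_scheduled order whose due_at has passed."""
    rows = conn.execute(
        "SELECT id, command, attempts, max_attempts FROM work_orders"
        " WHERE status IN ('queued', 'retry_scheduled') AND due_at <= ?"
        " ORDER BY due_at, id",
        (now,),
    ).fetchall()
    for order_id, command, attempts, max_attempts in rows:
        attempts += 1
        # Mark the order as running before executing, so a crash leaves a
        # 'running' row that reclaim_stale can pick up later.
        conn.execute(
            "UPDATE work_orders SET status='running', attempts=?, started_at=?"
            " WHERE id=?",
            (attempts, now, order_id),
        )
        conn.commit()
        try:
            ok = bool(execute(command))
        except Exception:
            ok = False
        if ok:
            conn.execute(
                "UPDATE work_orders SET status='completed' WHERE id=?", (order_id,)
            )
        elif attempts < max_attempts:
            conn.execute(
                "UPDATE work_orders SET status='retry_scheduled', due_at=?"
                " WHERE id=?",
                (now + retry_delay, order_id),
            )
        else:
            conn.execute(
                "UPDATE work_orders SET status='failed' WHERE id=?", (order_id,)
            )
        conn.commit()
    return [row[0] for row in rows]


def reclaim_stale(conn, now, stale_after=60.0):
    """Re-queue orders stuck in 'running' for at least stale_after seconds."""
    cur = conn.execute(
        "UPDATE work_orders SET status='queued', due_at=?"
        " WHERE status='running' AND started_at <= ?",
        (now, now - stale_after),
    )
    conn.commit()
    return cur.rowcount


def status_of(conn, order_id):
    """Return the current status string of one work order."""
    row = conn.execute(
        "SELECT status FROM work_orders WHERE id=?", (order_id,)
    ).fetchone()
    return row[0]
```

With this sketch, an order enqueued with `due_at=100.0` is ignored by `run_due(conn, 50.0, ...)`, a callback that fails once leaves the order in retry_scheduled and completes it on attempt 2, and a row left in running is returned to queued by `reclaim_stale` and finished by the next `run_due`. Passing `now` explicitly keeps the behaviour deterministic in tests.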

Notes for reviewers

This PR is intentionally narrow and depends conceptually on #8402.

It does not claim:

  • process-level continuation of arbitrary running jobs
  • general orchestration is solved
  • every terminal workload belongs in this lane
  • host-bound git worktree or host-venv pytest workloads are already covered

The claim is narrower:

  • the already-auditable direct lane can now be wrapped in a durable work-order queue with retry and stale-run reclaim while preserving its supported-slice performance profile

@MestreY0d4-Uninter

MestreY0d4-Uninter commented Apr 12, 2026


Quick CI note on this draft follow-up:

  • I checked the red checks after opening it.
  • build-and-push is failing in the Dockerfile/npm layer with spawn git ENOENT; that same Docker workflow is already red on recent upstream main runs.
  • The repo-wide test workflow is also already red on current main; I reproduced the baseline locally from a clean origin/main worktree before blaming this branch.
  • The focused H007 validation relevant to this follow-up is green locally: 434 passed, 1 warning for the receipt/direct-lane + work-order queue slice listed in the PR body.

So the current red CI does not appear to be introduced by this H007 follow-up itself.

MestreY0d4-Uninter force-pushed the h007-durable-work-orders branch from 447625d to 85857d5 on April 14, 2026 19:26
@MestreY0d4-Uninter


Refresh — April 14 2026

Branch restructured and rebased onto current origin/main. Conflicts with upstream resolved.

What changed in this refresh

The original branch had two commits stacked: an older version of execution_receipts (pre-#9209) plus the durable_work_orders feature. This refresh:

  1. Rebuilt the branch on top of origin/main with the clean execution receipts from PR #9209 (execution receipts, b101301e)
  2. Cherry-picked only the durable_work_orders commit (feat: add durable execution work orders)
  3. Resolved 5 cherry-pick conflicts (hermes_cli/commands.py, hermes_cli/main.py, hermes_cli/tools_config.py, model_tools.py, toolsets.py) — all caused by the older receipts baseline
  4. Added attribution housekeeping commit (chore: add MestreY0d4-Uninter to AUTHOR_MAP and .mailmap)

Test results

| Suite                                          | Result               |
|------------------------------------------------|----------------------|
| tests/test_execution_receipts_impl.py          | 22 passed            |
| tests/tools/test_delegate.py                   | 67 passed            |
| tests/hermes_cli/test_work_orders.py           | 7 passed             |
| tests/tools/test_execution_work_orders.py      | 3 passed             |
| tests/tools/test_execution_work_orders_tool.py | 3 passed             |
| Total                                          | 102 passed, 0 failed |

Commit log

85857d55 feat: add durable execution work orders
9d9f8ed0 chore: add MestreY0d4-Uninter to AUTHOR_MAP and .mailmap
b101301e feat: execution receipts with auto-instrumentation

Dependency note

This PR builds on top of PR #9209 (execution receipts). The branch currently includes the #9209 commits inline; if #9209 merges first, this branch can be trimmed to just the durable_work_orders commit on top of the updated origin/main.

CI notes

  • No merge conflicts with current origin/main
  • check-attribution — fixed via housekeeping commit
  • test job — pending new CI run (previous failure was upstream acp module collection error, not caused by this PR)

MestreY0d4-Uninter marked this pull request as ready for review on April 14, 2026 19:28
@MestreY0d4-Uninter


Stacking note

This PR depends on #9209 (execution receipts). The current branch includes the #9209 commits inline so it is reviewable and testable in isolation, but the intended merge order is:

  1. Merge #9209 (execution receipts) first
  2. Rebase this branch onto the updated origin/main (drops the inline receipts commits, leaving only the feat: add durable execution work orders commit)
  3. Merge this PR

If you prefer to review both together before any merge, no action is needed — the branch is self-contained as-is.

MestreY0d4-Uninter force-pushed the h007-durable-work-orders branch from 85857d5 to 7d9f653 on April 19, 2026 16:08
@MestreY0d4-Uninter


Refresh completed on a clean branch created from origin/main. The cherry-pick was applied and force-pushed to h007-durable-work-orders. Validation: py_compile on the changed .py files and pytest -o addopts='' -q on the affected tests (13 passed).

@MestreY0d4-Uninter


⚠️ Audit follow-up (2026-04-19)

There is substantial real work here and the PR body includes strong local evidence:

  • durable JSON + SQLite work-order ledger
  • queue/run_due/retry/resume/cancel/reclaim_stale operations
  • CLI + slash-command surfaces
  • 434 passing tests reported locally

So this is not a “throwaway” PR. However, it is still not merge-ready as packaged today.

Main reasons:

  1. this is the largest PR in the backlog (very high review cost)
  2. it is explicitly part of the same H007 line as #8402 (execution receipts and direct terminal work-order lane), and that base PR is closed/unmerged, so the acceptance path for this control-plane layer is still socially/architecturally unresolved
  3. CI is currently red, including check-attribution
  4. even if the direction is good, this likely needs either a clearer accepted base or smaller reviewable slices

Recommendation: KEEP OPEN, but treat it as blocked on packaging + dependency/acceptance decisions.

Suggested next step:

So: likely still relevant, but currently too large and too dependent on unresolved base decisions to push for merge as-is.


Batch 4 — technical follow-up required

@MestreY0d4-Uninter


Closing for Architectural Re-evaluation

As part of a full backlog audit (23 open PRs audited), this PR was identified as dependent on upstream architectural clarity.

Context:
This PR implements durable work orders (JSON + SQLite ledger) as part of the H007 line, which in turn depends conceptually on PR #8402 (execution receipts + direct terminal work-order lane), which is closed without merge.

Reason for closing:

What happens now:

  • The code is available on the original branch:
  • If there is future interest in execution receipts + durable work orders, this PR can be reopened or recreated on top of a newly accepted architecture
  • The work is not lost, only archived until there is clarity about the desired direction

Acknowledgment:
Thank you for the review and feedback so far. I will reopen when there is consensus on the durable execution architecture in Hermes.


Closed as part of the 2026-04-19 backlog audit: 21 PRs refreshed, 4 closed (2 absorbed, 2 re-evaluated)

MestreY0d4-Uninter deleted the h007-durable-work-orders branch on April 27, 2026 01:39