[codex] Fix Kanban retry runtime clock #19828

Closed
misery-hl wants to merge 1 commit into NousResearch:main from misery-hl:codex/kanban-reset-started-at

@misery-hl

Contributor

Summary

  • Reset the task-level started_at timestamp on every Kanban claim.
  • Add regressions for retry-after-unblock and retry-after-timeout behavior.

Root Cause

claim_task preserved tasks.started_at with COALESCE(started_at, ?), while max-runtime enforcement reads the task-level started_at value. When a blocked or timed-out task was retried, the retry could inherit an old runtime clock and immediately time out again.
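The effect of the COALESCE can be reproduced in isolation. The following is a minimal sketch, not the project's actual code: the table layout, the `claim`, `runtime_exceeded`, and `simulate` helpers, and the integer timestamps are all assumptions made for illustration. It compares a claim that keeps the first `started_at` with one that overwrites it.

```python
import sqlite3


def claim(conn, task_id, now, reset):
    """Claim a task. `reset` selects between the two UPDATE variants (hypothetical helper)."""
    if reset:
        # Variant proposed in this PR: always overwrite the task-level clock.
        sql = "UPDATE tasks SET status = 'running', started_at = ? WHERE id = ?"
    else:
        # Variant described under Root Cause: keep the first non-NULL value.
        sql = (
            "UPDATE tasks SET status = 'running', "
            "started_at = COALESCE(started_at, ?) WHERE id = ?"
        )
    conn.execute(sql, (now, task_id))


def runtime_exceeded(conn, task_id, now, max_runtime):
    """Return True if the task-level clock says the task ran longer than max_runtime."""
    (started_at,) = conn.execute(
        "SELECT started_at FROM tasks WHERE id = ?", (task_id,)
    ).fetchone()
    return now - started_at > max_runtime


def simulate(reset):
    """Claim at t=1000, block, re-claim at t=5000, then check the limit at t=5001."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tasks (id INTEGER PRIMARY KEY, status TEXT, started_at INTEGER)"
    )
    conn.execute("INSERT INTO tasks (id, status) VALUES (1, 'ready')")
    claim(conn, 1, now=1000, reset=reset)  # first attempt
    conn.execute("UPDATE tasks SET status = 'blocked' WHERE id = 1")
    claim(conn, 1, now=5000, reset=reset)  # retry after unblock
    return runtime_exceeded(conn, 1, now=5001, max_runtime=600)
```

With `reset=False` the retry inherits the clock from t=1000 and is reported as over the limit one second after it was re-claimed; with `reset=True` it is not.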

Impact

Retries now receive a fresh task runtime window when they are claimed again. Attempt history is still preserved in task_runs, but the active task record reflects the current attempt's runtime clock.

Validation

  • /home/clockwork/.hermes/hermes-agent/venv/bin/python -m pytest tests/hermes_cli/test_kanban_core_functionality.py -k 'claim_resets_task_started_at_after_unblock or timeout_retry_uses_fresh_task_started_at or enforce_max_runtime_integrates_with_dispatch'
  • /home/clockwork/.hermes/hermes-agent/venv/bin/python -m pytest tests/hermes_cli/test_kanban_core_functionality.py

alt-glitch added the type/bug, P3, and comp/cli labels on May 4, 2026
@alt-glitch

Collaborator

Related to #19473 — same retry-runtime-clock root cause (started_at preserved across claims), different fix approach: this PR resets task-level started_at on claim instead of reading from task_runs.

@teknium1

teknium1 commented May 6, 2026

Contributor

Closed in favor of competing PR #19473 which solved the same retry-timeout-loop bug non-destructively (JOIN on task_runs.started_at via COALESCE, preserves tasks.started_at as a lifetime timestamp). Your diagnosis of the root cause was correct — the implementation difference was that #19473's approach doesn't destroy tasks.started_at history. Both fixes land the same test behavior. Thanks — you'd submitted first, and the analysis was solid.

Salvaged via PR #20448.
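For readers comparing the two approaches, here is a rough sketch of the non-destructive idea described above. It is not the code from #19473: the schema, the `EFFECTIVE_START` query, and the `effective_started_at` and `demo` helpers are assumptions made for illustration. The runtime check reads the latest run's start time and falls back to the task-level value, so `tasks.started_at` is never overwritten.

```python
import sqlite3

# Hypothetical minimal schema; the real tables have more columns.
SCHEMA = """
CREATE TABLE tasks (id INTEGER PRIMARY KEY, started_at INTEGER);
CREATE TABLE task_runs (id INTEGER PRIMARY KEY, task_id INTEGER, started_at INTEGER);
"""

# Prefer the most recent run's clock; fall back to the task-level timestamp
# when the task has no recorded runs.
EFFECTIVE_START = """
SELECT COALESCE(MAX(r.started_at), t.started_at)
FROM tasks AS t
LEFT JOIN task_runs AS r ON r.task_id = t.id
WHERE t.id = ?
GROUP BY t.id, t.started_at
"""


def effective_started_at(conn, task_id):
    """Return the clock a max-runtime check would use for this task, or None."""
    row = conn.execute(EFFECTIVE_START, (task_id,)).fetchone()
    return row[0] if row else None


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Task 1 was first claimed at t=1000 and retried at t=5000.
    conn.execute("INSERT INTO tasks (id, started_at) VALUES (1, 1000)")
    conn.execute("INSERT INTO task_runs (task_id, started_at) VALUES (1, 1000)")
    conn.execute("INSERT INTO task_runs (task_id, started_at) VALUES (1, 5000)")
    # Task 2 has a task-level timestamp but no recorded runs.
    conn.execute("INSERT INTO tasks (id, started_at) VALUES (2, 2000)")
    (lifetime,) = conn.execute(
        "SELECT started_at FROM tasks WHERE id = 1"
    ).fetchone()
    return effective_started_at(conn, 1), lifetime, effective_started_at(conn, 2)
```

In this sketch the retry is measured from t=5000 while the task record still shows t=1000 as its lifetime start, which is the trade-off the comment above refers to.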

@misery-hl

Contributor Author

wow big teknium commenting in my pull request... thank you for everything you and the team have put out in the world! love Hermes so much :)
