fix(janitor): require completion evidence before auto-closing on completed_at #2

Merged
Brecht-H merged 1 commit into main from orion-cc/janitor-completion-evidence
May 16, 2026

@Brecht-H Brecht-H commented May 16, 2026

Owner

Root cause

The kanban janitor is a second phantom-done path. explicit_completion_evidence() (the completed_at_present branch) returned a CLOSE decision purely because completed_at was set on a non-terminal task — with no requirement of real result/summary/comment evidence:

if task.completed_at is not None and task.status not in TERMINAL_STATUSES:
    evidence = task.result or comments[-1].body if comments else task.result
    return "completed_at_present", shorten(evidence or "task has completed_at but non-terminal status")

build_scan() collected it and apply_closes() raw-UPDATEd status='done', stamping a tautological placeholder ("task has completed_at but non-terminal status") as the task's only "evidence". A fresh ready/triage/in_progress task that merely had completed_at stamped by a brief worker claim got rubber-stamped to done.
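To make the failure mode concrete, here is a minimal reproduction sketch of the pre-fix branch quoted above. The Task and Comment dataclasses, the TERMINAL_STATUSES values, the shorten() body, and the task id are simplified stand-ins (assumptions), not the janitor's real definitions; only the branch body is taken from this PR.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed values; the real constant lives in scripts/kanban_janitor.py.
TERMINAL_STATUSES = frozenset({"done", "cancelled"})


@dataclass(frozen=True)
class Task:  # hypothetical, reduced to the fields this branch reads
    id: str
    status: str
    result: Optional[str] = None
    completed_at: Optional[int] = None


@dataclass(frozen=True)
class Comment:  # hypothetical, reduced to the body
    body: str


def shorten(text: str, limit: int = 200) -> str:
    """Stand-in for the janitor's shorten(): collapse whitespace, then truncate."""
    text = " ".join(text.split())
    return text if len(text) <= limit else text[: limit - 3] + "..."


def pre_fix_completed_at_branch(task: Task, comments: List[Comment]) -> Optional[Tuple[str, str]]:
    """Pre-fix behaviour: completed_at alone is enough to emit a close decision."""
    if task.completed_at is not None and task.status not in TERMINAL_STATUSES:
        # Parses as: (task.result or comments[-1].body) if comments else task.result
        evidence = task.result or comments[-1].body if comments else task.result
        return "completed_at_present", shorten(evidence or "task has completed_at but non-terminal status")
    return None


# A fresh in_progress task with completed_at stamped, no result, no comments.
phantom = Task(id="t_example", status="in_progress", result=None, completed_at=1_700_000_000)
decision = pre_fix_completed_at_branch(phantom, [])
print(decision)
# -> ('completed_at_present', 'task has completed_at but non-terminal status')
```

The placeholder string is the only "evidence" the decision carries, which matches the janitor_completed payload recorded for t_5dbfc384.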

Live proof — t_5dbfc384

Task t_5dbfc384 carries a janitor_completed event closing it ~13 min after creation:

janitor_completed | {"reason": "completed_at_present", "previous_status": "in_progress",
                     "evidence": "task has completed_at but non-terminal status",
                     "source": "kanban-janitor"}

The completed event right before it had result_len: 0 — no real evidence. It had to be manually reopened and is now correctly in_review with a substantive result.

Why this is the 2nd path

This is the same pathology PR #1 fixed for complete_task() (CompletionEvidenceError). But the janitor bypasses complete_task entirely with a raw UPDATE status='done', so the PR #1 guard never fired here. This PR is the complement: it closes the janitor path.
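The bypass can be illustrated with a small sketch: a guard that lives inside a Python function is never consulted by a raw SQL UPDATE. The table layout, the complete_task() signature, and the task id below are assumptions made for illustration; only the CompletionEvidenceError name and the raw UPDATE status='done' come from this PR.

```python
import sqlite3


class CompletionEvidenceError(ValueError):
    """Stand-in for the exception PR #1 introduced; the real class may differ."""


def complete_task(conn: sqlite3.Connection, task_id: str, result: str) -> None:
    """Hypothetical guarded completion path: refuses to close without a result."""
    if not (result or "").strip():
        raise CompletionEvidenceError(f"{task_id}: no completion evidence")
    conn.execute("UPDATE tasks SET status = 'done', result = ? WHERE id = ?", (result, task_id))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT, result TEXT)")
conn.execute("INSERT INTO tasks VALUES ('t_example', 'in_progress', NULL)")

# Path 1: the guarded function rejects an evidence-free completion.
try:
    complete_task(conn, "t_example", "")
    guard_fired = False
except CompletionEvidenceError:
    guard_fired = True

# Path 2: a raw UPDATE never reaches the guard, so the task is closed anyway.
conn.execute("UPDATE tasks SET status = 'done' WHERE id = 't_example'")
status, result = conn.execute("SELECT status, result FROM tasks WHERE id = 't_example'").fetchone()
print(guard_fired, status, result)
```

This is why the gate has to sit where the close decision is produced, not only in complete_task().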

Previously unversioned

The live janitor was a loose script at ~/.hermes/scripts/kanban_janitor.py, not in any git repo. This PR brings it under version control at scripts/kanban_janitor.py (verbatim copy of the live script as the starting point) and applies the fix to the repo copy.

The fix (scripts/kanban_janitor.py)

  • Add has_completion_evidence(task, comments) — True iff non-empty result (after strip) OR any comment with a non-empty stripped body. A bare completed_at timestamp is NOT evidence. Mirrors kanban_db._has_completion_evidence; the janitor uses its own Task/Comment dataclasses so it needs a local copy of the predicate.
  • explicit_completion_evidence() — the completed_at_present close decision is now gated on has_completion_evidence(...). Evidence-free completed_at tasks fall through (return None → not closed). Comment-pattern / trusted-operator checks still run normally.
  • build_scan() — new phantom_completed_at anomaly category (capped at 50): non-terminal, non-running tasks with completed_at set but no evidence.
  • render_markdown() — new "Phantom completed_at" triage section so operators see the anomaly instead of it being silently rubber-stamped.
  • apply_closes() unchanged — once explicit_completion_evidence stops emitting evidence-free closes, apply_closes naturally never closes them.
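The first two bullets can be sketched as follows. This is an illustrative approximation, not the merged code: the dataclasses are reduced to the fields the predicate reads, the TERMINAL_STATUSES values are assumed, and the comment-pattern / trusted-operator checks that still run in the real function are omitted.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

# Assumed values; the real constant lives in scripts/kanban_janitor.py.
TERMINAL_STATUSES = frozenset({"done", "cancelled"})


@dataclass(frozen=True)
class Task:  # hypothetical, reduced to the relevant fields
    id: str
    status: str
    result: Optional[str] = None
    completed_at: Optional[int] = None


@dataclass(frozen=True)
class Comment:  # hypothetical, reduced to the body
    body: Optional[str]


def has_completion_evidence(task: Task, comments: Sequence[Comment]) -> bool:
    """True iff the result or at least one comment body is non-empty after strip()."""
    if (task.result or "").strip():
        return True
    return any((c.body or "").strip() for c in comments)


def explicit_completion_evidence(task: Task, comments: Sequence[Comment]) -> Optional[Tuple[str, str]]:
    """Gated completed_at_present branch; a bare completed_at falls through."""
    if (
        task.completed_at is not None
        and task.status not in TERMINAL_STATUSES
        and has_completion_evidence(task, comments)
    ):
        result = (task.result or "").strip()
        # Fall back to the most recent non-empty comment when there is no result.
        evidence = result or next(
            c.body.strip() for c in reversed(comments) if (c.body or "").strip()
        )
        return "completed_at_present", evidence
    # Comment-pattern and trusted-operator checks would run here in the real janitor.
    return None


bare = Task(id="t_bare", status="in_progress", completed_at=1_700_000_000)
with_result = Task(id="t_real", status="in_progress", result="Shipped the fix", completed_at=1_700_000_000)

print(explicit_completion_evidence(bare, []))
print(explicit_completion_evidence(with_result, []))
print(explicit_completion_evidence(bare, [Comment(body="   ")]))
```

Because the gate returns None instead of a placeholder string, apply_closes() never receives an evidence-free decision in the first place.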

Tests (tests/scripts/test_kanban_janitor_completion_evidence.py)

10 tests, all passing:

  • has_completion_evidence: result-only True, comment-only True, whitespace-only False, nothing False.
  • explicit_completion_evidence: bare completed_at no-evidence → None; with result → ("completed_at_present", ...); with comment → ("completed_at_present", ...); whitespace-only → None.
  • build_scan-level: a fixture DB with a bare-completed_at no-evidence task → it appears in phantom_completed_at and NOT in close_decisions; a real-evidence task → in close_decisions, not flagged.

Verification

  • python3 -m py_compile scripts/kanban_janitor.py → OK
  • pytest tests/scripts/test_kanban_janitor_completion_evidence.py → 10 passed
  • Scan-only run (no --apply) against a read-only snapshot of the live kanban → 0 evidence-free completed_at_present closes, applied: 0.
  • Scan against a synthetic DB with an injected t_5dbfc384-style phantom task → task flagged in phantom_completed_at, absent from close_decisions, rendered in the markdown triage section. Pre-fix it would have been auto-closed.
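The synthetic-DB check can be approximated with an in-memory SQLite database. The schema, the status sets, the scan_completed_at() helper, and the task ids below are assumptions for illustration; the real build_scan() reads more columns and produces many more categories.

```python
import sqlite3

# Assumed values; the real constants live in scripts/kanban_janitor.py.
TERMINAL_STATUSES = {"done", "cancelled"}
RUNNING_STATUSES = {"running"}
PHANTOM_CAP = 50  # the PR caps the phantom_completed_at category at 50


def scan_completed_at(conn: sqlite3.Connection) -> dict:
    """Hypothetical slice of a scan: split completed_at tasks by evidence."""
    close_decisions, phantom = [], []
    rows = conn.execute(
        "SELECT id, status, result, completed_at FROM tasks ORDER BY id"
    ).fetchall()
    for task_id, status, result, completed_at in rows:
        if completed_at is None or status in TERMINAL_STATUSES:
            continue
        bodies = [
            row[0]
            for row in conn.execute("SELECT body FROM comments WHERE task_id = ?", (task_id,))
        ]
        has_evidence = bool((result or "").strip()) or any((b or "").strip() for b in bodies)
        if has_evidence:
            close_decisions.append(task_id)
        elif status not in RUNNING_STATUSES:
            phantom.append(task_id)
    return {"close_decisions": close_decisions, "phantom_completed_at": phantom[:PHANTOM_CAP]}


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT, result TEXT, completed_at INTEGER);
    CREATE TABLE comments (task_id TEXT, body TEXT);
    INSERT INTO tasks VALUES ('t_phantom', 'in_progress', NULL, 1700000000);
    INSERT INTO tasks VALUES ('t_real', 'in_progress', 'Shipped the fix', 1700000000);
    INSERT INTO tasks VALUES ('t_fresh', 'ready', NULL, NULL);
    """
)

report = scan_completed_at(conn)
print(report)
# -> {'close_decisions': ['t_real'], 'phantom_completed_at': ['t_phantom']}
```

The phantom task is surfaced for triage rather than closed, and the task with a real result still becomes a close decision.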

Operator deploy step (Mac-gated)

After merge, sync the live janitor from the repo copy. Deployment is an operator step; the live ~/.hermes/scripts/ file was intentionally left untouched by this PR:

cp <repo>/scripts/kanban_janitor.py ~/.hermes/scripts/kanban_janitor.py

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added a kanban janitor tool for managing Hermes kanban boards.
    • Detects stale items, redundant tasks, and phantom completion records.
    • Auto-completes tasks with proper evidence.
    • Generates JSON and Markdown reports with optional external copy.
    • Creates database backups before applying changes.
  • Tests

    • Added test coverage for completion evidence detection logic.

coderabbitai Bot commented May 16, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bf200a2b-13dc-4c83-81b7-83fb38fffb5f

📥 Commits

Reviewing files that changed from the base of the PR and between 76f020e and fe490ef.

📒 Files selected for processing (2)
  • scripts/kanban_janitor.py
  • tests/scripts/test_kanban_janitor_completion_evidence.py

Walkthrough

A new kanban janitor script scans Hermes SQLite task boards to classify tasks by completion evidence, staleness, and redundancy. It detects safe auto-close candidates, optionally applies decisions within a transaction with DB backup, generates JSON and Markdown reports, and discovers cron scheduler metadata. Tests validate evidence detection and integration with the main scan function.

Changes

Kanban Janitor Task Cleanup

| Layer / File(s) | Summary |
| --- | --- |
| Data model and core utilities / scripts/kanban_janitor.py (lines 1–191) | Status constants, file paths, regex patterns for evidence detection; immutable Task, Comment, CloseDecision dataclasses; utility functions for timestamps, SQLite connection/row mapping, text normalization, and primary completion-evidence predicates. |
| Evidence classification logic / scripts/kanban_janitor.py (lines 193–244) | Functions to detect explicit completion evidence (task result, comment patterns, trusted-operator comments), identify weak completion candidates, detect redundant/superseded tasks via references, and compute latest activity timestamps for staleness filtering. |
| Database scanning and task categorization / scripts/kanban_janitor.py (lines 245–339) | build_scan aggregates task status counts, generates auto-close decisions with evidence validation, categorizes weak/redundant/phantom completed_at tasks, groups duplicates by normalized title, and sorts/truncates results for reporting. |
| Safe persistence and DB mutations / scripts/kanban_janitor.py (lines 341–441) | backup_db creates timestamped copies with retention pruning; apply_closes backs up first, starts a transaction, re-validates each task, skips ineligible statuses, updates task fields and status, inserts janitor comments and task events, and optionally updates related task_runs. |
| Report generation and output / scripts/kanban_janitor.py (lines 443–553) | render_markdown generates consolidated Markdown reports with status counts, scheduler metadata, auto-close summaries, and needs-triage sections (redundant, phantom, stale, duplicates); write_reports persists JSON, Markdown, JSONL history, and optional secondary Markdown copy. |
| Scheduler detection and CLI orchestration / scripts/kanban_janitor.py (lines 555–677) | Discovers Hermes scheduler metadata from configured jobs.json files; orchestrates primary scan, optional apply/rescan, legacy DB scanning; constructs final report payload; defines argument parsing and main entrypoint. |
| Evidence detection unit tests / tests/scripts/test_kanban_janitor_completion_evidence.py (lines 1–130) | Test scaffolding with dynamic imports and helper constructors; validates has_completion_evidence returns True for non-empty result or non-whitespace comments, False otherwise; validates explicit_completion_evidence returns None for bare completed_at without evidence and completed_at_present decision when real evidence is present. |
| Integration tests with database scanning / tests/scripts/test_kanban_janitor_completion_evidence.py (lines 132–236) | SQLite schema and _make_db helper; tests verify build_scan flags tasks with bare completed_at and no evidence as phantom (skipping close decisions), and includes tasks with real result evidence in close decisions without phantom flags. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A janitor hops through task-board dust,
Sorting the done from the must-just-adjust,
With patterns and evidence, stale task decay—
Phantom completed_at whisked clean away!
Backups in place, transactions held tight,
Reports rendered proud in Markdown light. ✨


@github-actions

🚨 CRITICAL Supply Chain Risk Detected

This PR contains a pattern that has been used in real supply chain attacks. A maintainer must review the flagged code carefully before merging.

🚨 CRITICAL: Install-hook file added or modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/setup.py
skills/productivity/google-workspace/scripts/setup.py

Scanner only fires on high-signal indicators: .pth files, base64+exec/eval combos, subprocess with encoded commands, or install-hook files. Low-signal warnings were removed intentionally — if you're seeing this comment, the finding is worth inspecting.

github-actions Bot commented May 16, 2026

🔎 Lint report: orion-cc/janitor-completion-evidence vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 8340 on HEAD, 8340 on base (➖ 0)

🆕 New issues (3):

| Rule | Count |
| --- | --- |
| invalid-argument-type | 3 |

First entries:
run_agent.py:7736: [invalid-argument-type] invalid-argument-type: Argument to function `build_anthropic_client` is incorrect: Expected `str`, found `str | dict[Unknown | str, Unknown | str | dict[str, str]] | Any | ... omitted 3 union elements`
run_agent.py:14031: [invalid-argument-type] invalid-argument-type: Argument to function `_is_oauth_token` is incorrect: Expected `str`, found `str | dict[Unknown | str, Unknown | str | dict[str, str]] | Any | ... omitted 3 union elements`
run_agent.py:14034: [invalid-argument-type] invalid-argument-type: Argument to function `len` is incorrect: Expected `Sized`, found `(str & ~AlwaysFalsy) | (dict[Unknown | str, Unknown | str | dict[str, str]] & ~AlwaysFalsy) | (Any & ~AlwaysFalsy) | ... omitted 3 union elements`

✅ Fixed issues (3):

| Rule | Count |
| --- | --- |
| invalid-argument-type | 3 |

First entries:
run_agent.py:14031: [invalid-argument-type] invalid-argument-type: Argument to function `_is_oauth_token` is incorrect: Expected `str`, found `str | dict[Unknown, Unknown] | Any | ... omitted 3 union elements`
run_agent.py:14034: [invalid-argument-type] invalid-argument-type: Argument to function `len` is incorrect: Expected `Sized`, found `(str & ~AlwaysFalsy) | (dict[Unknown, Unknown] & ~AlwaysFalsy) | (Any & ~AlwaysFalsy) | ... omitted 3 union elements`
run_agent.py:7736: [invalid-argument-type] invalid-argument-type: Argument to function `build_anthropic_client` is incorrect: Expected `str`, found `str | dict[Unknown, Unknown] | Any | ... omitted 3 union elements`

Unchanged: 4352 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@Brecht-H

Owner Author

⚠️ CI bypassed — GitHub Actions billing failure (org-wide)

Same account-level billing block affecting every Brecht-H PR. Local verification:

  • py_compile scripts/kanban_janitor.py → ✅ OK
  • pytest tests/scripts/test_kanban_janitor_completion_evidence.py → ✅ 10 passed
  • Scan-only run vs a read-only snapshot of the live kanban (411 tasks) → 0 evidence-free completed_at_present closes.
  • Scan vs a synthetic DB with an injected t_5dbfc384-style phantom (in_progress, completed_at set, result=NULL, 0 comments) → correctly flagged in phantom_completed_at, absent from close_decisions. Pre-fix it would have been auto-closed to done.

The fix logic is ~15 lines on top of a verbatim copy of the previously unversioned janitor. The repo copy was diffed against the live ~/.hermes/scripts/kanban_janitor.py to confirm that the evidence gate is the only intended change.

Workaround by orion-cc. Deploy after merge = Mac-gated sync to ~/.hermes/scripts/kanban_janitor.py.

@Brecht-H Brecht-H marked this pull request as ready for review May 16, 2026 22:48
…leted_at

The kanban janitor is a SECOND phantom-done path. Its
explicit_completion_evidence() completed_at_present branch returned a
CLOSE decision purely because completed_at was set — with NO requirement
of a real result/summary/comment. build_scan() collected it and
apply_closes() raw-UPDATEd status='done', stamping a tautological
placeholder ("task has completed_at but non-terminal status") as the
task's only "evidence". A brief worker claim that stamps completed_at on
a fresh ready/triage/in_progress task got rubber-stamped to done.

Live proof: task t_5dbfc384 carries a janitor_completed event closing it
~13 min after creation on completed_at_present with result_len=0; it had
to be manually reopened and is now correctly in_review with a real
result.

This is the same pathology hermes-agent PR #1 fixed for complete_task()
(CompletionEvidenceError) — but the janitor bypasses complete_task
entirely with a raw UPDATE.

Changes (scripts/kanban_janitor.py):
- Add has_completion_evidence(task, comments): True iff non-empty result
  OR any non-empty comment body. A bare completed_at is NOT evidence.
  Mirrors kanban_db._has_completion_evidence (janitor has its own
  dataclasses so it needs a local copy of the predicate).
- explicit_completion_evidence(): the completed_at_present close decision
  is now gated on has_completion_evidence; evidence-free completed_at
  tasks fall through (return None -> not closed).
- build_scan(): new phantom_completed_at anomaly category (capped 50)
  for non-terminal, non-running tasks with completed_at but no evidence.
- render_markdown(): new "Phantom completed_at" triage section so
  operators see the anomaly instead of it being silently rubber-stamped.
- apply_closes() unchanged — once evidence-free closes stop being
  emitted, it naturally never closes them.

The janitor was previously an unversioned loose script at
~/.hermes/scripts/kanban_janitor.py; this commit brings it under version
control. Operator deploy step (Mac-gated): after merge, sync
~/.hermes/scripts/kanban_janitor.py from the repo copy.

Confidence: high
Scope-risk: moderate
Machine: orion-terminal
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@Brecht-H Brecht-H force-pushed the orion-cc/janitor-completion-evidence branch from b22a106 to fe490ef Compare May 16, 2026 22:53
@Brecht-H Brecht-H merged commit 6a38004 into main May 16, 2026
13 of 15 checks passed