Skip to content

fix(disk-cleanup): restrict cron-output classification to output subtree#32222

Closed
Carry00 wants to merge 1 commit into
NousResearch:mainfrom
Carry00:fix/32164-disk-cleanup-cron-registry
Closed

fix(disk-cleanup): restrict cron-output classification to output subtree#32222
Carry00 wants to merge 1 commit into
NousResearch:mainfrom
Carry00:fix/32164-disk-cleanup-cron-registry

Conversation

@Carry00

@Carry00 Carry00 commented May 25, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes #32164.

plugins/disk-cleanup/disk_cleanup.py::guess_category() classified every
file under ~/.hermes/cron/ as "cron-output". The plugin's automatic
cleanup pass later deletes tracked cron-output entries older than 14
days, which means the durable scheduler registry
(~/.hermes/cron/jobs.json) and the tick lock (~/.hermes/cron/.tick.lock)
were eligible for deletion. When jobs.json was removed the scheduler
reported 0 jobs.

Only the per-run artefact subtree cron/output/{job_id}/*.md is
disposable (see cron/jobs.py:5). This change limits the cron-output
category to that subtree and leaves top-level cron/ control-plane files
untracked.

Changes

  • plugins/disk-cleanup/disk_cleanup.pyguess_category() now requires
    rel.parts[1] == "output" before returning "cron-output"; other
    top-level cron/ paths return None.
  • tests/plugins/test_disk_cleanup_plugin.py — replaced the broad
    test_cron_subtree_categorised with four explicit cases.

Test plan

  • scripts/run_tests.sh tests/plugins/test_disk_cleanup_plugin.py — 41 passed
  • New regression cases:
    • cron/output/<job>/run.md"cron-output"
    • cron/jobs.jsonNone
    • cron/.tick.lockNone
    • cron/<any-other-top-level>None

The post_tool_call hook used guess_category() to auto-track every file
under ~/.hermes/cron/ as "cron-output", which is later subject to age-based
auto-deletion (>14d). That made the durable scheduler registry
(cron/jobs.json) and the tick lock (cron/.tick.lock) eligible for removal;
once jobs.json was deleted, Hermes treated the scheduler as having zero
jobs.

Only the per-run artefact subtree at cron/output/{job_id}/*.md is
disposable. Limit the cron-output category to that subtree and leave
top-level cron/ control-plane files untracked.

Regression tests cover:
  cron/output/<job>/run.md  -> "cron-output"
  cron/jobs.json            -> None
  cron/.tick.lock           -> None
  cron/<any-top-level>      -> None

Fixes NousResearch#32164

@hclsys hclsys left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix is correct — but heads up, this looks like a duplicate of #32177, which targets the same bug with the same approach and was opened ~2.5h earlier.

Both:

  • Fix #32164 (durable cron/jobs.json + .tick.lock being auto-classified as cron-output and deleted by the 14-day cleanup pass).
  • Edit the same branch in plugins/disk-cleanup/disk_cleanup.py::guess_category() (the if top == "cron" or top == "cronjobs": return "cron-output" at line ~483).
  • Apply the identical narrowing: if len(rel.parts) >= 2 and rel.parts[1] == "output": return "cron-output" else None.
  • Add the same regression tests (cron/output/** → categorised, cron/jobs.json → not categorised).

Because the guard keys on rel.parts[1] == "output" (not parts[0]), both versions also correctly cover the legacy cronjobs/ alias. So the two PRs are functionally equivalent.

Timeline: #32177 (briandevans) opened 2026-05-25T17:14Z; this PR 2026-05-25T19:48Z. To save the maintainer a double review and avoid a merge conflict, it's probably worth consolidating — either closing one in favor of the other, or if there's a meaningful difference I'm missing, calling it out explicitly so reviewers know which to take. The fix itself is right in both.

(No code issue with this PR's implementation — flagging only the duplication.)

@alt-glitch alt-glitch added type/bug Something isn't working comp/plugins Plugin system and bundled plugins comp/cron Cron scheduler and job management P1 High — major feature broken, no workaround labels May 25, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Competing with #32177 — both fix #32164 (disk-cleanup misclassifying cron/jobs.json as disposable cron-output). Same approach: narrow cron-output classification to cron/output/** subtree only.

@Carry00

Carry00 commented May 27, 2026

Copy link
Copy Markdown
Contributor Author

Closing in favour of #32177 which covers the same fix and was opened earlier. Thanks for the heads-up, @hclsys and @alt-glitch.

@Carry00 Carry00 closed this May 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/cron Cron scheduler and job management comp/plugins Plugin system and bundled plugins P1 High — major feature broken, no workaround type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

disk-cleanup misclassifies cron/jobs.json as cron-output and can delete the live cron registry

3 participants