fix(disk-cleanup): restrict cron-output classification to output subtree#32222
fix(disk-cleanup): restrict cron-output classification to output subtree#32222Carry00 wants to merge 1 commit into
Conversation
The post_tool_call hook used guess_category() to auto-track every file
under ~/.hermes/cron/ as "cron-output", which is later subject to age-based
auto-deletion (>14d). That made the durable scheduler registry
(cron/jobs.json) and the tick lock (cron/.tick.lock) eligible for removal;
once jobs.json was deleted, Hermes treated the scheduler as having zero
jobs.
Only the per-run artefact subtree at cron/output/{job_id}/*.md is
disposable. Limit the cron-output category to that subtree and leave
top-level cron/ control-plane files untracked.
Regression tests cover:
cron/output/<job>/run.md -> "cron-output"
cron/jobs.json -> None
cron/.tick.lock -> None
cron/<any-top-level> -> None
Fixes NousResearch#32164
hclsys
left a comment
There was a problem hiding this comment.
The fix is correct — but heads up, this looks like a duplicate of #32177, which targets the same bug with the same approach and was opened ~2.5h earlier.
Both:
- Fix #32164 (durable
cron/jobs.json+.tick.lockbeing auto-classified ascron-outputand deleted by the 14-day cleanup pass). - Edit the same branch in
plugins/disk-cleanup/disk_cleanup.py::guess_category()(theif top == "cron" or top == "cronjobs": return "cron-output"at line ~483). - Apply the identical narrowing:
if len(rel.parts) >= 2 and rel.parts[1] == "output": return "cron-output"elseNone. - Add the same regression tests (
cron/output/**→ categorised,cron/jobs.json→ not categorised).
Because the guard keys on rel.parts[1] == "output" (not parts[0]), both versions also correctly cover the legacy cronjobs/ alias. So the two PRs are functionally equivalent.
Timeline: #32177 (briandevans) opened 2026-05-25T17:14Z; this PR 2026-05-25T19:48Z. To save the maintainer a double review and avoid a merge conflict, it's probably worth consolidating — either closing one in favor of the other, or if there's a meaningful difference I'm missing, calling it out explicitly so reviewers know which to take. The fix itself is right in both.
(No code issue with this PR's implementation — flagging only the duplication.)
|
Closing in favour of #32177 which covers the same fix and was opened earlier. Thanks for the heads-up, @hclsys and @alt-glitch. |
Summary
Fixes #32164.
plugins/disk-cleanup/disk_cleanup.py::guess_category()classified everyfile under
~/.hermes/cron/as"cron-output". The plugin's automaticcleanup pass later deletes tracked
cron-outputentries older than 14days, which means the durable scheduler registry
(
~/.hermes/cron/jobs.json) and the tick lock (~/.hermes/cron/.tick.lock)were eligible for deletion. When
jobs.jsonwas removed the schedulerreported
0jobs.Only the per-run artefact subtree
cron/output/{job_id}/*.mdisdisposable (see
cron/jobs.py:5). This change limits thecron-outputcategory to that subtree and leaves top-level
cron/control-plane filesuntracked.
Changes
plugins/disk-cleanup/disk_cleanup.py—guess_category()now requiresrel.parts[1] == "output"before returning"cron-output"; othertop-level
cron/paths returnNone.tests/plugins/test_disk_cleanup_plugin.py— replaced the broadtest_cron_subtree_categorisedwith four explicit cases.Test plan
scripts/run_tests.sh tests/plugins/test_disk_cleanup_plugin.py— 41 passedcron/output/<job>/run.md→"cron-output"cron/jobs.json→Nonecron/.tick.lock→Nonecron/<any-other-top-level>→None