Summary
The bundled disk-cleanup plugin can delete the durable cron registry because it classifies all top-level HERMES_HOME/cron/** paths as disposable cron-output.
That includes ~/.hermes/cron/jobs.json, which is the scheduler's source-of-truth job store.
Once jobs.json is auto-tracked as cron-output, the plugin's automatic cleanup can delete it, and Hermes then treats the missing registry as an empty schedule (0 jobs).
Confirmed root cause
Current plugins/disk-cleanup/disk_cleanup.py::guess_category() logic:
    if top == "cron" or top == "cronjobs":
        return "cron-output"
This is too broad. cron/output/** contains disposable run artifacts, but the files directly under cron/ are scheduler state and are not disposable.
Durable scheduler state in the same directory includes at least:
~/.hermes/cron/jobs.json
~/.hermes/cron/.tick.lock
Why this is destructive
disk-cleanup later deletes tracked cron-output entries during automatic cleanup. If jobs.json has been tracked under that category, the cleanup pass can remove the live cron registry.
After that, Hermes behaves as if there are no scheduled jobs because missing ~/.hermes/cron/jobs.json is interpreted as an empty job list.
Reproduction
- Enable the bundled disk-cleanup plugin.
- Ensure a cron registry exists at ~/.hermes/cron/jobs.json.
- Cause the plugin to auto-track a path inside top-level ~/.hermes/cron/ via the existing guess_category() path classification.
- Run the plugin's automatic or manual quick cleanup.
- Observe that the cron registry may be deleted and that a subsequent cron listing shows 0 jobs.
Expected behavior
Only disposable run artifacts under ~/.hermes/cron/output/** should be classified as cron-output.
Top-level cron control-plane files must never be auto-tracked as cleanup candidates.
Actual behavior
Top-level cron files are classified as cron-output, making the scheduler registry eligible for deletion.
Proposed fix
Restrict cron-output classification to the output subtree only, e.g.:
    if top == "cron" or top == "cronjobs":
        if len(rel.parts) >= 2 and rel.parts[1] == "output":
            return "cron-output"
        return None
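For reference, the narrowed check can be exercised in isolation. The sketch below is a hypothetical stand-in, not the plugin's actual code: the function name guess_cron_category, its (path, hermes_home) signature, and the example home directory are assumptions made for illustration, since the real guess_category() signature is not shown in this issue.

```python
from pathlib import Path
from typing import Optional


def guess_cron_category(path: Path, hermes_home: Path) -> Optional[str]:
    """Hypothetical stand-in for the cron branch of guess_category()."""
    try:
        rel = path.relative_to(hermes_home)
    except ValueError:
        return None  # path is outside HERMES_HOME, nothing to classify
    if not rel.parts:
        return None  # path is HERMES_HOME itself
    top = rel.parts[0]
    if top == "cron" or top == "cronjobs":
        # Only the output subtree holds disposable run artifacts.
        if len(rel.parts) >= 2 and rel.parts[1] == "output":
            return "cron-output"
        # jobs.json, .tick.lock and any other top-level cron file stay untracked.
        return None
    return None


home = Path("/home/user/.hermes")  # assumed location, for illustration only
print(guess_cron_category(home / "cron" / "output" / "job-1" / "run.md", home))
print(guess_cron_category(home / "cron" / "jobs.json", home))
```

With this check the cron/output directory itself also matches, because only the second path component is inspected. Whether the directory entry should be tracked, or only the files beneath it, is a decision left to the actual fix.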
Regression coverage suggested
Add tests that assert:
~/.hermes/cron/output/<job>/run.md -> cron-output
~/.hermes/cron/jobs.json -> None
~/.hermes/cron/.tick.lock -> None
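The three assertions above could be written as a table-driven test along these lines. This is a sketch under assumptions: classify() is a local stand-in that would be replaced by an import of the real guess_category(), HERMES_HOME is a made-up location, and "job-1" is a placeholder for <job>.

```python
import unittest
from pathlib import PurePosixPath
from typing import Optional

HERMES_HOME = PurePosixPath("/home/user/.hermes")  # hypothetical location


def classify(path: PurePosixPath) -> Optional[str]:
    """Stand-in for guess_category(); swap for the real import."""
    parts = path.relative_to(HERMES_HOME).parts
    if parts and parts[0] in ("cron", "cronjobs"):
        if len(parts) >= 2 and parts[1] == "output":
            return "cron-output"
    return None


# (path relative to HERMES_HOME, expected category)
CASES = [
    ("cron/output/job-1/run.md", "cron-output"),
    ("cron/jobs.json", None),
    ("cron/.tick.lock", None),
]


class CronClassificationTest(unittest.TestCase):
    def test_only_output_subtree_is_disposable(self):
        for rel, expected in CASES:
            # subTest reports each failing path separately.
            with self.subTest(path=rel):
                self.assertEqual(classify(HERMES_HOME / rel), expected)
```

Keeping the cases in one table makes it cheap to add further control-plane files later if more top-level cron state is introduced.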
Notes
This is distinct from other cron-loss issues involving:
- profile-fragmented cron stores
- concurrent jobs.json write races
- permission/mode problems on jobs.json
- dashboard update flows
Those can also produce missing or invisible jobs, but this bug is specifically about disk-cleanup deleting the registry due to overly broad cron-path classification.