fix(cron): stop silent cron-job loss across update (disk-cleanup + snapshot restore) #34840

Merged
teknium1 merged 3 commits into main from hermes/hermes-259ceebf on May 29, 2026

@teknium1 teknium1 commented May 29, 2026

Contributor

Summary

Cron jobs no longer silently vanish after hermes update. Two independent failure paths are closed: the disk-cleanup plugin that was deleting the live registry (root cause), and a missing recovery net during update (defense-in-depth).

Root cause (issue #32164)

plugins/disk-cleanup/disk_cleanup.py::guess_category() classified the entire cron/ tree as disposable cron-output:

if top == "cron" or top == "cronjobs":
    return "cron-output"

Every cron run rewrites cron/jobs.json (to update next_run_at). The disk-cleanup hook tracked that write as cron-output, and its auto-cleanup pass deletes cron-output after 14 days — wiping the live scheduler registry. Hermes then reads the missing file as 0 jobs. (The original #34600 report guessed config migration, but migrate_config() never opens jobs.json — disk-cleanup is the actual emptier.)
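
The narrowed classification can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: the signature (a path relative to HERMES_HOME, returning a category string or None) is hypothetical, and the real guess_category() handles other categories that are omitted here. The sketch also assumes cronjobs/ is treated the same way as cron/.

```python
from pathlib import PurePosixPath
from typing import Optional


def guess_category(rel_path: str) -> Optional[str]:
    """Classify a path relative to HERMES_HOME (hypothetical signature)."""
    parts = PurePosixPath(rel_path).parts
    if not parts:
        return None
    top = parts[0]
    if top in ("cron", "cronjobs"):
        # Only files under the output/ subtree are disposable. Control-plane
        # files such as cron/jobs.json and cron/.tick.lock fall through to
        # None, so the cleanup pass never tracks them as cron-output.
        if len(parts) >= 3 and parts[1] == "output":
            return "cron-output"
        return None
    # Other categories exist in the real plugin; they are out of scope here.
    return None
```

With this shape, a rewrite of cron/jobs.json on every cron run is no longer recorded as disposable output, while run artifacts under cron/output/ remain eligible for the 14-day cleanup.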

Changes

Validation

| Check | Result |
|---|---|
| tests/plugins/test_disk_cleanup_plugin.py | 41 passed |
| tests/hermes_cli/test_backup.py | 105 passed (6 new) |
| E2E (real imports, isolated HERMES_HOME) | 9/9: jobs.json/.tick.lock protected, cron/output/ still tracked, snapshot restore fires on loss, no second-guessing genuine clears, corrupt files untouched |

Closes #34600. Closes #32164.
Supersedes #34602, #33834, #32478, #32436, #30208 (credit preserved via cherry-pick).

Co-authored-by: sweetcornna <96944678+ymylive@users.noreply.github.com>
Co-authored-by: Bartok9 <danielrpike9@gmail.com>

Infographic

cron-persistence-two-layer-fix

sweetcornna and others added 3 commits May 29, 2026 13:06
Config-version migrations have been observed to leave cron/jobs.json
valid-but-empty after `hermes update`, silently dropping every scheduled
job (#34600). The existing malformed-shape guards in cron/jobs.py don't
catch this because {"jobs": []} is valid JSON.

Add restore_cron_jobs_if_emptied() as a post-migration safety net: if the
live cron/jobs.json now has zero jobs while the pre-update snapshot held
one or more, restore the snapshot copy in place and warn loudly. The
check is conservative — it only restores on unambiguous evidence of loss
(snapshot had jobs, live file readable-and-empty), so a user who genuinely
cleared their jobs is never second-guessed and an unreadable live file is
left untouched so real corruption still surfaces.

Wired into _cmd_update_impl after migrate_config(), reusing the existing
pre-update quick snapshot (which already captures cron/jobs.json).

Closes #34600
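
The conservative restore rule described in this commit message can be sketched as below. Only the function name restore_cron_jobs_if_emptied comes from the commit; the two-path signature, the _count_jobs helper, and the warning text are assumptions made for illustration. The sketch treats a missing live file as unreadable and leaves it alone; the merged code may handle that case differently.

```python
import json
import shutil
import sys
from pathlib import Path
from typing import Optional


def _count_jobs(path: Path) -> Optional[int]:
    """Return the number of jobs in a jobs.json file, or None if unreadable.

    Hypothetical helper: "unreadable" covers missing files, invalid JSON,
    and JSON that lacks a top-level {"jobs": [...]} list.
    """
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None
    jobs = data.get("jobs") if isinstance(data, dict) else None
    return len(jobs) if isinstance(jobs, list) else None


def restore_cron_jobs_if_emptied(live: Path, snapshot: Path) -> bool:
    """Restore the snapshot only on unambiguous evidence of job loss."""
    snap_count = _count_jobs(snapshot)
    live_count = _count_jobs(live)
    if not snap_count:
        # Snapshot unreadable or already empty: nothing trustworthy to restore.
        return False
    if live_count != 0:
        # Live file still has jobs, or is unreadable (left untouched so that
        # real corruption still surfaces).
        return False
    shutil.copy2(snapshot, live)
    print(f"warning: restored {snap_count} cron job(s) from pre-update snapshot",
          file=sys.stderr)
    return True
```

Because the snapshot must hold at least one job before anything is restored, a user who emptied jobs.json before the update (so the snapshot is empty too) is not overridden.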
Maps the cherry-picked commit's noreply email to the GitHub login so the
release attribution / CI author check passes.
@alt-glitch alt-glitch added type/bug Something isn't working P1 High — major feature broken, no workaround comp/cron Cron scheduler and job management comp/plugins Plugin system and bundled plugins comp/cli CLI entry point, hermes_cli/, setup wizard labels May 29, 2026
@alt-glitch

Collaborator

Supersedes #34602, #33834, #32478, #32436, #30208. Closes both #34600 and #32164. Root cause: disk-cleanup plugin classified live cron/jobs.json as disposable cron-output. Defense-in-depth: post-update snapshot restore if jobs.json emptied.

@teknium1 teknium1 merged commit 0dc0c5e into main May 29, 2026
23 checks passed
@teknium1 teknium1 deleted the hermes/hermes-259ceebf branch May 29, 2026 20:22
teknium1 pushed a commit that referenced this pull request Jun 4, 2026
)

quick() and dry_run() previously trusted the stored category from
tracked.json without re-validating at delete time. Stale entries from
before #34840 could carry category="cron-output" for cron control-plane
paths (e.g. cron/jobs.json), causing quick() to delete the live
scheduler registry.

Fix:
- Fix guess_category() to only classify cron/output/** as cron-output
  (was classifying ALL cron/* paths, missing the #34840 fix).
- Re-validate cron-output entries via guess_category() at delete time
  in quick() and dry_run(); stale entries that are no longer classified
  as cron-output are skipped and removed from tracked.json.
- Add _is_protected_cron_path() as a hard defense-in-depth guard that
  blocks deletion of cron/cronjobs directories and known control-plane
  files (jobs.json, .tick.lock) regardless of stored category.
- Update test_cron_subtree_categorised to match fixed guess_category
  (only cron/output/* is cron-output, not all of cron/).

Tests: add 5 regression tests in TestStaleCronEntryMigration.
waym0reom3ga pushed a commit to waym0reom3ga/autolycus-agent that referenced this pull request Jun 4, 2026
…sResearch#37721)

Yuki-14544869 pushed a commit to Yuki-14544869/hermes-agent that referenced this pull request Jun 4, 2026
…sResearch#37721)

davidgut1982 pushed a commit to davidgut1982/hermes-agent that referenced this pull request Jun 5, 2026
…sResearch#37721)

changman pushed a commit to changman/hermes-agent that referenced this pull request Jun 10, 2026
…sResearch#37721)
