fix(curator): skip archival of skills referenced by active cron jobs (#29912) by friday-james · Pull Request #30108 · NousResearch/hermes-agent

friday-james · 2026-05-21T23:29:32Z

The curator was archiving skills while a cron job that references them was still
enabled — the job would then silently fail on its next scheduled run with no
skill to invoke.

Bug

Reported in issue #29912. The curator's automatic transition phase
(apply_automatic_transitions) and its LLM review prompt had no visibility into
which skills were pinned by the cron scheduler. A skill could be stale-by-age
and get archived even though cron list showed an active job pointing at it.

Changes

cron/jobs.py — get_active_skill_refs(): new single source of truth;
iterates enabled jobs via list_jobs(include_disabled=False) and returns the
set of referenced skill names.
agent/curator.py — _load_cron_protected(): snapshot function called once
at the top of run_curator_review(). Returns an empty set when the cron
subsystem is not installed, None on store error (fail-closed: auto-archive
transitions are skipped for that run rather than proceeding unprotected).
agent/curator.py — apply_automatic_transitions(): accepts the pre-computed
cron_protected set; any skill in that set bypasses the stale/archive
transitions. Handles None (cron unreadable) by logging and skipping.
agent/curator.py — _render_candidate_list(): surfaces the protected set as
a labelled section in the LLM prompt. When the cron store is unreadable, emits
a hard warning instructing the review agent not to archive anything this run.
agent/curator.py — run_curator_review(): snapshot computed once, passed to
both the automatic phase and the LLM prompt to prevent TOCTOU drift.
tools/skill_manager_tool.py — _delete_skill(): hard enforcement layer
inside the curator review fork (is_background_review()). Even if the LLM
tries to delete a cron-referenced skill, the tool call is rejected with a
structured error and a remediation hint.
tests/agent/test_curator.py (+98 LOC): 4 new tests — cron-referenced skill
not archived, _load_cron_protected returns empty set when cron absent, returns
None and warns on runtime error, transitions skip when cron store raises.
tests/agent/test_curator_backup.py (+2/-2 LOC): lambda stub updated to
accept the new cron_protected kwarg in apply_automatic_transitions.
tests/cron/test_jobs.py (+41 LOC): TestGetActiveSkillRefs — 6 tests
covering empty scheduler, single-skill jobs, multi-skill jobs, disabled job
filtering, and missing skill key handling.
tests/tools/test_skill_manager_tool.py (+73 LOC): TestDeleteSkillCronGuard — 4
tests covering blocked deletion inside curator fork, allowed when skill is not
referenced, pass-through outside the curator fork, and structured failure
response on cron store error.

Python compatibility

from cron.jobs import ... with sys.modules["cron.jobs"] = None raises plain
ImportError on Python 3.11 (not ModuleNotFoundError). Both guards now use
except ImportError as e and check e.name to distinguish "subsystem absent"
from "broken dependency", covering all supported Python versions (≥ 3.11).

Validation

Targeted test files via python3 -m pytest:

tests/agent/test_curator.py
tests/agent/test_curator_backup.py
tests/cron/test_jobs.py
tests/tools/test_skill_manager_tool.py

252 passed.

E2E: skill aged past archive_after_days, cron job enabled referencing it →
curator run completes, skill directory intact, run.json shows archived: 0,
LLM prompt contains the "Cron-protected skills" section with the skill name.

Closes #29912

…ousResearch#29912) Closes NousResearch#29912. The curator could archive a skill that an enabled cron job still depended on, causing the next scheduled run to silently fail. Three coordinated layers close the gap: 1. `cron/jobs.py` — add `get_active_skill_refs()`: calls `list_jobs(include_disabled=False)` (single source of truth) and returns the set of skill names referenced by any enabled job. 2. `agent/curator.py` — add `_load_cron_protected()`: wraps the import + call behind a two-tier error boundary. `ModuleNotFoundError` → cron not installed, return `set()`. Any other exception → cron present but broken, return `None` (fail-closed signal). `apply_automatic_transitions()` and `run_curator_review()` snapshot cron-protected refs once and propagate the `None` signal: transitions are skipped entirely when the cron store is unreadable. `_render_candidate_list()` emits a visible LLM warning and adds a "Cron-protected skills" section to the prompt so the model won't archive a live dependency. 3. `tools/skill_manager_tool.py` — harden `_delete_skill()` as a last-resort guard: inside a curator background-review fork (`is_background_review()` ContextVar), reject any delete of a cron-referenced skill with a structured error before touching disk. Same two-tier import boundary; fail-closed on cron-store errors. All three layers share `get_active_skill_refs()` as their single source of truth.

The handler uses `except Exception` (to catch broken-import errors like RuntimeError during module init), but the inline comment still said "Other ImportError". Align the comment with the actual code.

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

…otected kwarg The stub in test_real_run_takes_pre_snapshot used `lambda now=None: {...}` which did not accept the new `cron_protected` keyword argument added by the fix for NousResearch#29912. When run_curator_review called `apply_automatic_transitions(now=start, cron_protected=cron_protected)` the stub raised TypeError. Update to `lambda now=None, cron_protected=None: {...}`.

On Python ≤ 3.11, setting sys.modules["cron.jobs"] = None causes `from cron.jobs import ...` to raise plain ImportError instead of ModuleNotFoundError. Both _load_cron_protected() and _delete_skill() were only catching ModuleNotFoundError, so on Python 3.11 the "not installed" path would fall through to the fail-closed handler. Switch to `except ImportError as e` and check e.name to distinguish "module absent" (name in cron/cron.jobs → empty set / pass) from "dependency broken" (other name → fail-closed). Also update test comment to document the Python version behaviour difference. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

When cron.jobs is installed but get_active_skill_refs is absent (e.g. partial upgrade / old module version), Python raises plain ImportError with e.name == "cron.jobs" — indistinguishable from a missing module if we only check e.name. Both _load_cron_protected() and the _delete_skill() cron guard were returning "not installed" (empty set / pass) in this case, silently disabling the protection. Fix: catch ModuleNotFoundError and ImportError separately. Only ModuleNotFoundError with e.name in ("cron", "cron.jobs") means the subsystem is absent; plain ImportError always means fail-closed. Also fix trailing \n inside a string appended to a join-joined list in _render_candidate_list (produces double blank line). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…uard non-iterable skills field Two bugs: 1. curator.py _run_llm_review() created the review AIAgent without setting _memory_write_origin = "background_review", so is_background_review() returned False inside all curator-LLM tool calls. The _delete_skill() cron protection guard (added in this PR) was dead code for curator- initiated deletes. Fix: set the attribute before run_conversation(). 2. cron/jobs.py _normalize_skill_list() called list(skills) in the else branch with no TypeError guard. A corrupted job record with a non- iterable skills field (e.g. integer from hand-edited JSON) raised TypeError that escaped through get_active_skill_refs() and was caught by _load_cron_protected()'s except-Exception handler, returning None and suppressing all auto-archive transitions for the entire run. Fix: wrap list(skills) in try/except TypeError, treating bad records as having no skills rather than halting the whole snapshot. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

- curator.py: use BACKGROUND_REVIEW constant (from tools.skill_provenance) instead of the bare string literal when setting _memory_write_origin, so it stays in sync with is_background_review() automatically. - curator.py: add exc_info=True to all WARNING calls in _load_cron_protected so import/runtime errors include a traceback for easier diagnosis. - skill_manager_tool.py: collapse three identical except branches in the _delete_skill cron guard into one except Exception as e with an isinstance check, eliminating the duplicated log call and error-dict literal. - tests/tools/test_skill_manager_tool.py: replace manual sys.modules save/restore with monkeypatch.setitem for correct teardown semantics. - cron/jobs.py: log a WARNING when _normalize_skill_list swallows TypeError so operators know to repair a malformed job record. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

…chain Add a comment clarifying that conversation_loop.py reads _memory_write_origin and calls set_current_write_origin() at the start of run_conversation(), which is what sets the ContextVar that is_background_review() checks in tool handlers such as _delete_skill(). The mechanism is correct; the comment makes the indirect path visible to reviewers. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

…ules monkeypatch.setitem only restores sys.modules["cron.jobs"]; if the import machinery ever sets cron.jobs on the parent package, the attribute leaks into later tests. Patch both together so monkeypatch teardown restores the complete import state. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

When a job record has a malformed non-iterable 'skills' field but a valid legacy 'skill' field, the TypeError handler was dropping the skill entirely, causing get_active_skill_refs() to miss the dependency and potentially allowing curator to archive an in-use skill. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

…skill_list Asserts that a job with a non-iterable 'skills' field (e.g., an integer from corrupted JSON) still contributes its legacy 'skill' value to get_active_skill_refs(), and that a WARNING is logged. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

… docstring - test_curator.py: also patch cron.jobs on the parent package so monkeypatch teardown restores the full import state (same fix already applied to test_skill_manager_tool.py) - _load_cron_protected() docstring: empty set means "no protected skills" (cron absent OR installed with no enabled skill refs), not just "absent" Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings May 21, 2026 23:29

fix: correct stale ImportError comment in _delete_skill cron guard

2740564

The handler uses `except Exception` (to catch broken-import errors like RuntimeError during module init), but the inline comment still said "Other ImportError". Align the comment with the actual code.

alt-glitch added type/bug Something isn't working comp/agent Core agent loop, run_agent.py, prompt builder comp/cron Cron scheduler and job management tool/skills Skills system (list, view, manage) P1 High — major feature broken, no workaround labels May 21, 2026

Copilot AI reviewed May 21, 2026

friday-james and others added 2 commits May 22, 2026 00:04

friday-james requested a review from Copilot May 22, 2026 00:41