feat(curator): split archived into consolidated vs pruned with model + heuristic classification by teknium1 · Pull Request #17941 · NousResearch/hermes-agent

teknium1 · 2026-04-30T12:14:42Z

Summary

Users watching curator runs saw consolidated skills (content absorbed into a new umbrella) listed under "Skills archived" and interpreted that as pruning. They then ran hermes curator restore on them and ended up with confusingly duplicated skillsets: the restored original plus its absorbed copy inside the umbrella.

What changed

Every skill that disappeared during a run is now classified using two signals:

Model-authored structured YAML block (new) — the curator prompt requires a fenced YAML block at the end of its final response:

consolidations:
  - from: anthropic-api
    into: llm-providers
    reason: duplicate content, now a subsection
prunings:
  - name: random-old-notes
    reason: pre-curator junk, no overlap with live skills

Gives us intent + rationale the tool calls never capture.

Tool-call heuristic (ground-truth audit) — scans this run's skill_manage calls for write/patch/create on a surviving skill referencing a removed skill's name. Catches omission (model forgot to list a real consolidation) and hallucination (model named an umbrella that doesn't exist).
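A rough sketch of how such a tool-call audit could work. This is not the PR's implementation: the call shape (`action`, `name`, `file_path`, `content`) and the helper names `name_variants`, `mentions`, and `find_consolidations` are assumptions made for illustration. Only the rules themselves (a write-type call on a surviving skill that references a removed skill's name, hyphen/underscore variants, self-reference rejection, destination must exist) come from the PR text.

```python
import re

# Assumed action names; the commit message mentions write_file/patch/create/edit.
WRITE_ACTIONS = {"write_file", "patch", "create", "edit"}


def name_variants(name):
    """Return the hyphen and underscore spellings of a skill name."""
    return {name, name.replace("-", "_"), name.replace("_", "-")}


def mentions(text, name):
    """True if any spelling of `name` appears in `text` as a whole token."""
    for variant in name_variants(name):
        pattern = r"(?<![\w-])" + re.escape(variant) + r"(?![\w-])"
        if re.search(pattern, text):
            return True
    return False


def find_consolidations(tool_calls, removed, surviving):
    """Map each removed skill to the surviving skill that appears to have absorbed it."""
    found = {}
    for call in tool_calls:
        if call.get("action") not in WRITE_ACTIONS:
            continue
        target = call.get("name", "")
        if target not in surviving:  # destination must exist post-run
            continue
        haystack = f"{call.get('file_path', '')}\n{call.get('content', '')}"
        for skill in removed:
            if skill == target or skill in found:  # reject self-reference, keep first hit
                continue
            if mentions(haystack, skill):
                found[skill] = {"into": target, "evidence": call["action"]}
    return found


# Hypothetical run: one real consolidation call, one call on a missing umbrella,
# and one read-only call that must not count as evidence.
calls = [
    {"action": "create", "name": "llm-providers",
     "content": "Merged from anthropic-api and gemini_api."},
    {"action": "patch", "name": "ghost-umbrella", "content": "absorbs random-old-notes"},
    {"action": "view", "name": "llm-providers", "content": "random-old-notes"},
]
audit = find_consolidations(
    calls,
    removed=["anthropic-api", "gemini-api", "random-old-notes"],
    surviving={"llm-providers"},
)
```

In this sketch `random-old-notes` ends up with no evidence, so a caller would treat it as pruned.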

_reconcile_classification() merges them:

- Model wins on rationale when its umbrella exists post-run
- Model hallucination (the named umbrella doesn't exist) → fall back to the heuristic's finding, or classify as pruned if the heuristic has no evidence either
- Heuristic-only finding gets tagged (detected via tool-call audit) in the report
- Model-declared pruning with rationale surfaces the reason verbatim
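The merge rules can be illustrated with a small function. This is a sketch under assumptions, not the body of `_reconcile_classification()`: the argument shapes, the `"model"` / `"heuristic"` values for `source`, and the choice to trust the heuristic when the model declared a pruning but the audit found a consolidation are all hypothetical.

```python
def reconcile(model_consolidations, model_prunings, heuristic, removed, surviving):
    """Merge model-declared intent with the tool-call audit (illustrative only)."""
    declared = {c["from"]: c for c in model_consolidations}
    prune_reasons = {p["name"]: p.get("reason", "") for p in model_prunings}
    consolidated, pruned = [], []

    for name in sorted(removed):
        claim = declared.get(name)
        audit = heuristic.get(name)
        if claim and claim.get("into") in surviving:
            # Model wins on rationale when its umbrella exists post-run.
            consolidated.append({
                "name": name, "into": claim["into"],
                "source": "model", "reason": claim.get("reason", ""),
            })
        elif audit:
            # Model omitted the skill or named a missing umbrella: use the audit.
            entry = {
                "name": name, "into": audit["into"], "source": "heuristic",
                "reason": "", "evidence": audit.get("evidence", ""),
            }
            if claim:
                entry["model_claimed_into"] = claim.get("into")
            consolidated.append(entry)
        else:
            # No consolidation evidence from either signal: treat as pruned.
            pruned.append({
                "name": name,
                "source": "model" if name in prune_reasons else "heuristic",
                "reason": prune_reasons.get(name, ""),
            })
    return consolidated, pruned


cons, prunes = reconcile(
    model_consolidations=[
        {"from": "anthropic-api", "into": "llm-providers", "reason": "duplicate content"},
        {"from": "ghost-skill", "into": "no-such-umbrella", "reason": "merged"},
    ],
    model_prunings=[{"name": "random-old-notes", "reason": "pre-curator junk"}],
    heuristic={"gemini-api": {"into": "llm-providers", "evidence": "patch"}},
    removed={"anthropic-api", "gemini-api", "ghost-skill", "random-old-notes"},
    surviving={"llm-providers"},
)
```

The example exercises all four paths: a valid model claim, a heuristic-only finding, a hallucinated umbrella that falls through to pruned, and a model-declared pruning with its reason.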

What the report looks like now

### Consolidated into umbrella skills (3)

- `anthropic-api` → merged into `llm-providers` — near-duplicate of openai-api, now a subsection
- `gemini-api` → merged into `llm-providers`  _(detected via tool-call audit (model omitted from structured block))_
- `openai-api` → merged into `llm-providers` — merged with sibling into umbrella

### Pruned — archived for staleness (2)

- `ghost-skill`
- `random-old-notes` — pre-curator notes, no overlap with live skills
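For illustration, the two sections above could be rendered from classified rows roughly like this. The function name `render_sections` and the row dictionaries are assumptions. The headings, the arrow, and the audit tag are copied from the sample report.

```python
DASH = "\u2014"  # em dash separator, as used in the sample report
AUDIT_TAG = "_(detected via tool-call audit (model omitted from structured block))_"


def render_sections(consolidated, pruned):
    """Render the consolidated and pruned sections as Markdown text."""
    lines = [f"### Consolidated into umbrella skills ({len(consolidated)})", ""]
    for row in sorted(consolidated, key=lambda r: r["name"]):
        line = f"- `{row['name']}` \u2192 merged into `{row['into']}`"
        if row.get("reason"):
            line += f" {DASH} {row['reason']}"
        elif row.get("source") == "heuristic":
            # Audit-only rows carry no rationale, so say why none is shown.
            line += f"  {AUDIT_TAG}"
        lines.append(line)

    lines += ["", f"### Pruned {DASH} archived for staleness ({len(pruned)})", ""]
    for row in sorted(pruned, key=lambda r: r["name"]):
        line = f"- `{row['name']}`"
        if row.get("reason"):
            line += f" {DASH} {row['reason']}"
        lines.append(line)
    return "\n".join(lines)


report = render_sections(
    consolidated=[
        {"name": "gemini-api", "into": "llm-providers", "source": "heuristic", "reason": ""},
        {"name": "anthropic-api", "into": "llm-providers", "source": "model",
         "reason": "near-duplicate of openai-api, now a subsection"},
    ],
    pruned=[{"name": "ghost-skill", "source": "heuristic", "reason": ""}],
)
```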

run.json schema

counts.consolidated_this_run / counts.pruned_this_run (new)
consolidated: [{name, into, source, reason, evidence?, model_claimed_into?}]
pruned: [{name, source, reason}]
pruned_names: [names] — flat list for quick scans / legacy consumers
archived: [names] — the union, preserved for backward compat
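A minimal sketch of how a payload with these fields might be assembled, and how a consumer could read pruned names regardless of whether `pruned` holds plain names (the earlier shape) or dicts (the new shape). The function names `build_run_payload` and `pruned_names_from` are hypothetical. Only the field names come from the schema above.

```python
import json


def build_run_payload(consolidated, pruned):
    """Assemble the classification part of run.json from classified rows."""
    pruned_names = [p["name"] for p in pruned]
    archived = sorted({c["name"] for c in consolidated} | set(pruned_names))
    return {
        "counts": {
            "consolidated_this_run": len(consolidated),
            "pruned_this_run": len(pruned),
        },
        "consolidated": consolidated,
        "pruned": pruned,
        "pruned_names": pruned_names,  # flat list for quick scans / legacy consumers
        "archived": archived,          # union, preserved for backward compat
    }


def pruned_names_from(run):
    """Read pruned names from either the flat list or the list-of-dicts shape."""
    if "pruned_names" in run:
        return list(run["pruned_names"])
    return [p["name"] if isinstance(p, dict) else p for p in run.get("pruned", [])]


payload = build_run_payload(
    consolidated=[{"name": "anthropic-api", "into": "llm-providers",
                   "source": "model", "reason": "duplicate content"}],
    pruned=[{"name": "random-old-notes", "source": "model", "reason": "pre-curator junk"}],
)
serialized = json.dumps(payload, indent=2)
```

A consumer that only knows the older `archived` list keeps working, since that field is still the union of both groups.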

Validation

66 unit tests pass (test_curator_classification.py + existing).
E2E with a realistic 5-skill mixed run: model declares 2 valid consolidations + 1 hallucinated umbrella, heuristic catches a 3rd consolidation the model forgot, 1 pure prune. All four paths render correctly.

This replaces the heuristic-only approach, which Teknium found confusing; the model's intent now shows up as rationale in the report.

…+ heuristic classification (NousResearch#17941)

* fix(curator): split 'archived' into consolidated vs pruned in run reports

Users who watched a curator run saw skills like 'anthropic-api' listed under 'Skills archived' and interpreted that as pruning, but the curator had actually absorbed those skills into a new umbrella (e.g. 'llm-providers') during the same run. The directory gets archived for safety (all removals are recoverable), but the content still lives under a different name. Users then 'restored' what they thought were deleted skills and ended up with confusingly duplicated skillsets (old-name + absorbed-inside-umbrella).

Classify removed skills using this run's skill_manage tool calls:
- consolidated: content absorbed into a surviving/newly-created skill (evidenced by a skill_manage write_file/patch/create/edit whose target is a different skill AND whose file_path/content references the removed skill's name)
- pruned: archived without consolidation evidence (truly stale)

REPORT.md now shows two distinct sections:
- 'Consolidated into umbrella skills', with `removed → merged into umbrella`
- 'Pruned — archived for staleness', pure staleness archives

run.json schema additions (backward compatible):
- counts.consolidated_this_run, counts.pruned_this_run
- consolidated: [{name, into, evidence}, ...]
- pruned: [names]
- archived: retained as the union for backward compat

Also: relabel the auto-transitions 'archived' counter to 'archived (no LLM, pure time-based staleness)' so it's clearly distinct from LLM-pass archives.

Tests: 9 new tests in test_curator_classification.py covering consolidation evidence parsing (write_file/patch/create), hyphen/underscore name variants, self-reference rejection, destination-must-exist, mixed runs, and malformed-JSON fallback safety. Existing test_report_md_is_human_readable updated to cover the new section names.

E2E: isolated HERMES_HOME, realistic 3-skill run, REPORT.md verified end-to-end.

* feat(curator): hybrid model-declared + heuristic classification

Extend the consolidated-vs-pruned split with LLM-authored intent:

1. Curator prompt now requires a structured YAML block at the end of the final response (consolidations / prunings with short rationale).
2. _parse_structured_summary() extracts it tolerantly: missing block, malformed YAML, partial lists all fall back to heuristic cleanly.
3. _reconcile_classification() merges model intent with the tool-call heuristic:
   - Model wins on rationale when its umbrella exists post-run
   - Model hallucination (umbrella doesn't exist) is downgraded to the heuristic's finding, or pruned if there's no evidence either
   - Heuristic catches model omission: consolidations the model enumerated tools for but forgot to list get surfaced with a '(detected via tool-call audit)' tag
4. REPORT.md now shows per-row rationale alongside 'removed → umbrella' and flags audit-only rows so the user knows why no reason is shown.

Backward compat: run.json's 'archived' field (union) is preserved. 'pruned' is now a list of dicts with {name, source, reason}; 'pruned_names' is the flat-name list for legacy consumers.

Tests: 15 new covering YAML parse edge cases (malformed, empty lists, bare-string entries, missing fields), reconciler rules (model wins, hallucination fallback, heuristic catches omission, prune with reason), and an end-to-end report-render test with all four paths exercised.
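To make the tolerant extraction concrete, here is a dependency-free sketch. It is not `_parse_structured_summary()` itself: the name `parse_structured_summary`, the tiny line-based parser (used here in place of a YAML library so the example stays standard-library only), and the decision to drop bare-string or incomplete entries are all assumptions. The fence marker is built with string repetition only to keep the example readable inside a fenced block.

```python
import re

FENCE = "`" * 3  # triple backtick, built indirectly
FENCE_RE = re.compile(FENCE + r"(?:yaml|yml)[ \t]*\n(.*?)" + FENCE, re.DOTALL)
REQUIRED = {"consolidations": ("from", "into"), "prunings": ("name",)}


def parse_structured_summary(response):
    """Extract consolidations/prunings from the last fenced YAML block, if any."""
    result = {"consolidations": [], "prunings": []}
    blocks = FENCE_RE.findall(response or "")
    if not blocks:
        return result  # missing block: caller relies on the heuristic alone

    section, item = None, None
    for raw in blocks[-1].splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if not raw[:1].isspace() and not line.startswith("-"):
            # Top-level key: switch section, ignoring keys we do not know.
            key = line[:-1] if line.endswith(":") else None
            section, item = (key if key in result else None), None
            continue
        if section is None:
            continue
        if line.startswith("- "):
            item = {}
            result[section].append(item)
            line = line[2:].strip()
        if item is None or ":" not in line:
            continue  # bare-string entry or stray text: skipped in this sketch
        key, _, value = line.partition(":")
        item[key.strip()] = value.strip().strip("'\"")

    # Drop entries missing the fields needed to act on them.
    return {
        sec: [e for e in entries if all(e.get(k) for k in REQUIRED[sec])]
        for sec, entries in result.items()
    }


response = (
    "Curation finished.\n" + FENCE + "yaml\n"
    "consolidations:\n"
    "  - from: anthropic-api\n"
    "    into: llm-providers\n"
    "    reason: duplicate content, now a subsection\n"
    "  - from: orphan\n"
    "prunings:\n"
    "  - name: random-old-notes\n"
    "    reason: pre-curator junk\n"
    "  - bare-entry\n" + FENCE + "\n"
)
summary = parse_structured_summary(response)
```

Every failure mode returns the same empty structure, so the reconciler never has to distinguish "no block" from "unusable block".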

alt-glitch added labels type/bug (Something isn't working), P3 (Low: cosmetic, nice to have), and tool/skills (Skills system: list, view, manage) Apr 30, 2026

teknium1 changed the title ~~fix(curator): split 'archived' into consolidated vs pruned in run reports~~ feat(curator): split archived into consolidated vs pruned with model + heuristic classification Apr 30, 2026

teknium1 merged commit 8b290a5 into main Apr 30, 2026
11 checks passed

teknium1 deleted the hermes/hermes-e2c9c7c8 branch April 30, 2026 17:31

github-actions Bot mentioned this pull request May 1, 2026

chore: bump NousResearch/hermes-agent version from v2026.4.23 to v2026.4.30 Docker-Hub-sirmark/docker-hermes-agent#4

Merged

subinium mentioned this pull request May 11, 2026

feat(curator): autonomous skill curator — background grading + consolidation + pruning (Hermes v0.12 parity) subinium/CrowClaw#311

Open

6 tasks

teknium1 merged 2 commits into main from hermes/hermes-e2c9c7c8
