fix: context_compressor.py - Ghost Skill P0/P1 mitigation (#32106) #32562
dolphin-creator wants to merge 2 commits into
Competing fix with #32375 for issue #32106: both add …
After reviewing both PRs side by side, I believe they're complementary rather than competing. They are identical on context_compressor.py: both separate skill_view from skills_list/skill_manage and add the [SKILL_PRUNED] marker. What each brings that the other doesn't have:
Where this PR sits in the issue roadmap (#32106):
Proposal: I'd like to incorporate the tests from #32375 into #32562, after which #32375 could be closed as superseded. The combined PR would deliver P0 + P1 with proper test coverage in a single PR. Happy to coordinate. @LeonSGP43, if you're comfortable with this approach, I'll add your tests and credit you in the commit.
…coverage (NousResearch#32106)
- TestToolResultSummaries: skill_view emits [SKILL_PRUNED], skills_list/skill_manage remain metadata-only
- TestGuidanceConstants: SKILLS_GUIDANCE includes ## Skill Safety Rule with reload instruction
- Credits: test patterns from LeonSGP43 (PR NousResearch#32375), adapted for merged PR
Tests added (commit b7aaf2a). This PR now covers both P0 and P1 with test coverage:
Test patterns adapted from @LeonSGP43's PR #32375, with credit in the commit. All 4 pass. Proposal: Since this PR now includes the tests that were unique to #32375, plus the … @alt-glitch, happy to coordinate on the merge if this looks good.
Production validation: I've been running these exact P0+P1 patches on my own Hermes Agent instance since May 26th (3 days in production). Zero issues, zero ghost skill loops, zero crashes. The … Happy to provide more data points if needed.
Context
Linked issue: #32106
'Ghost Skill' bug: during context compression, skills loaded via skill_view() are truncated and reduced to placeholders. The LLM interprets these placeholders as available skills even though their content has been lost, which produces hallucinated responses. The SOUL.md rule is bypassed because the model believes the skill is loaded.
Solution (Quick-Win P0 + P1)
This fix injects two minimal safety layers to break the hallucination loop without changing the compression architecture.
P0: Explicit [SKILL_PRUNED] marker
File: agent/context_compressor.py
Adds an explicit suffix when a compressed skill's content is pruned. This forces the model to recognize that the skill is unreachable.
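For readers who want to picture the P0 change, here is a minimal sketch of a summarizer that tags pruned skill_view results while leaving skills_list and skill_manage as metadata only. The function name `summarize_pruned_tool_result`, its signature, and the exact placeholder wording are assumptions made for illustration; they are not taken from agent/context_compressor.py.

```python
# Hypothetical sketch: names and placeholder format are illustrative only.
SKILL_PRUNED_MARKER = "[SKILL_PRUNED]"


def summarize_pruned_tool_result(tool_name: str, args: dict) -> str:
    """Return a one-line placeholder for a tool result dropped by compression."""
    if tool_name == "skill_view":
        # The skill body is gone, so say so explicitly instead of leaving
        # a placeholder that looks like a still-loaded skill.
        name = args.get("name", "?")
        return f"[skill_view] name={name} {SKILL_PRUNED_MARKER}"
    if tool_name in ("skills_list", "skill_manage"):
        # These calls carried no skill body, so no marker is needed.
        return f"[{tool_name}] (metadata only)"
    return f"[{tool_name}] (result pruned)"


if __name__ == "__main__":
    print(summarize_pruned_tool_result("skill_view", {"name": "deploy"}))
    print(summarize_pruned_tool_result("skills_list", {}))
```

Keeping skill_view on its own branch is the point of the sketch: only the call that actually loaded skill content gets the marker.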
P1: System invalidation rule
File: agent/prompt_builder.py
Injects a ## Skill Safety Rule into SKILLS_GUIDANCE:
- If [SKILL_PRUNED] is present, the skill is considered inaccessible.
- Reload it with skill_view(name='...') before any action that depends on the skill.
Impact
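As a rough illustration of how the two layers could combine, the sketch below appends a Skill Safety Rule to a guidance string and lists the skills that would need a reload. The rule wording, the helpers `with_safety_rule` and `pruned_skills`, and the placeholder format `[skill_view] name=<skill> [SKILL_PRUNED]` are all assumptions for this example, not the actual contents of agent/prompt_builder.py.

```python
import re

# Hypothetical wording: the real SKILLS_GUIDANCE text may differ.
SKILL_SAFETY_RULE = """## Skill Safety Rule
- A skill whose summary carries [SKILL_PRUNED] is inaccessible.
- Call skill_view(name='...') again before any action that depends on it."""


def with_safety_rule(skills_guidance: str) -> str:
    """Append the safety rule once; leave guidance that already has it alone."""
    if "## Skill Safety Rule" in skills_guidance:
        return skills_guidance
    return skills_guidance.rstrip() + "\n\n" + SKILL_SAFETY_RULE + "\n"


def pruned_skills(context: str) -> list[str]:
    """List skill names whose placeholders carry the marker (assumed format)."""
    return re.findall(r"\[skill_view\] name=(\S+) \[SKILL_PRUNED\]", context)


if __name__ == "__main__":
    compressed = "[skill_view] name=deploy [SKILL_PRUNED]\n[skills_list] (metadata only)"
    print(with_safety_rule("Use skills when relevant."))
    print(pruned_skills(compressed))
```

The marker gives the model something concrete to match on, and the rule tells it what to do when the match occurs, which is why the two patches are proposed together.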