feat(system-prompt): HERMES_INTENT_OVERRIDE narrowing (closes #97, advances #89 Direction A)#98

Merged: PowerCreek merged 1 commit into main from intent-override-narrowing-97 on May 25, 2026.
@PowerCreek

Summary

Closes #97 (Option A3: operator-supplied intent override). Advances #89 Direction A: when HERMES_INTENT_OVERRIDE=code is set, the system prompt's stable layer narrows for tool-call-heavy traffic, which addresses the prompt-saturation symptom on mid-tier coding models documented in #89.

What narrows under code intent

| Block | Action | Why |
|---|---|---|
| SOUL.md | Skip | Largest single contributor; falls back to short DEFAULT_AGENT_IDENTITY floor |
| HERMES_AGENT_HELP_GUIDANCE | Skip | Off-topic for tool-call traffic |
| SKILLS_GUIDANCE | Skip | Per-tool block, off-topic for code |
| KANBAN_GUIDANCE | Skip | Worker-lifecycle, off-topic for code |
| SESSION_SEARCH_GUIDANCE | Skip | Off-topic for code |
| skills_prompt (the big one) | Skip | Biggest single contributor when many skills loaded |
| MEMORY_GUIDANCE | Keep | Small + sometimes useful even for code |
| TOOL_USE_ENFORCEMENT_GUIDANCE | Keep | Critical for tool emission |
| Per-model operational guidance | Keep | Model-quality-specific |
| Env / platform hints | Keep | Execution-environment essentials |
| context + volatile layers | Untouched | Out of scope per #97 |

Other intents (confer / planning / exploration / refinement / generic) are recognized as valid but pass through without narrowing in v1 — keeps the door open for per-intent shape later.
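
The table can be read as a filter over the stable layer. A minimal sketch of that reading, assuming the stable layer is assembled from an ordered list of named blocks; `STABLE_BLOCKS` (including its ordering), `SKIPPED_UNDER_CODE`, and `select_stable_blocks` are hypothetical names for illustration, not the actual Hermes prompt builder:

```python
# Hypothetical sketch: block names come from the table above,
# but the list layout and ordering are assumptions.
STABLE_BLOCKS = [
    "SOUL.md",
    "HERMES_AGENT_HELP_GUIDANCE",
    "SKILLS_GUIDANCE",
    "KANBAN_GUIDANCE",
    "SESSION_SEARCH_GUIDANCE",
    "skills_prompt",
    "MEMORY_GUIDANCE",
    "TOOL_USE_ENFORCEMENT_GUIDANCE",
    "per_model_operational_guidance",
    "env_platform_hints",
]

SKIPPED_UNDER_CODE = frozenset({
    "SOUL.md",
    "HERMES_AGENT_HELP_GUIDANCE",
    "SKILLS_GUIDANCE",
    "KANBAN_GUIDANCE",
    "SESSION_SEARCH_GUIDANCE",
    "skills_prompt",
})


def select_stable_blocks(intent):
    """Return the names of the stable-layer blocks to emit for an intent."""
    if intent != "code":
        # Unset, or any non-code intent: pass through without narrowing (v1).
        return list(STABLE_BLOCKS)
    kept = [name for name in STABLE_BLOCKS if name not in SKIPPED_UNDER_CODE]
    # With SOUL.md skipped, identity falls back to the short default floor.
    return ["DEFAULT_AGENT_IDENTITY"] + kept
```

Under these assumptions, `select_stable_blocks("planning")` returns the full list unchanged, while `select_stable_blocks("code")` keeps only the identity floor and the four "Keep" rows.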

Intent vocabulary

Matches devagentic#240's intent_classifier 6-key enum exactly, so the same operator-side classifier wired into devagentic's R5 dispatch hook can also drive hermes-side prompt narrowing without a second vocabulary.
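
For illustration, resolving the override against the six-key vocabulary might look like the sketch below. The six keys and the case-insensitive matching come from this PR; the function name `resolve_intent_override`, the whitespace stripping, and returning `None` for a typo are assumptions rather than the merged implementation:

```python
import os

# The six intent keys named in this PR.
VALID_INTENTS = ("code", "confer", "planning", "exploration", "refinement", "generic")


def resolve_intent_override(environ=None):
    """Return the normalized intent, or None when unset, empty, or unrecognized."""
    env = os.environ if environ is None else environ
    raw = env.get("HERMES_INTENT_OVERRIDE", "")
    # Assumed normalization: trim whitespace and match case-insensitively.
    value = raw.strip().lower()
    # A typo falls back to None, i.e. the default un-narrowed prompt.
    return value if value in VALID_INTENTS else None
```

Passing a plain dict instead of reading `os.environ` directly keeps the resolver easy to test, for example `resolve_intent_override({"HERMES_INTENT_OVERRIDE": "CODE"})`.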

Doctor probe

New _check_intent_override_env surfaces the active override at hermes doctor:

  • Silent when unset
  • check_ok when valid (with a narrowing-active note for code)
  • check_warn with the full valid-keys list when typo'd

Mirrors the silent-when-irrelevant pattern from PR #95 / #96.
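
The three probe outcomes can be sketched as follows. The real probe reports through `check_ok` / `check_warn`, whose signatures are not shown in this PR, so this hypothetical version returns a `(level, message)` tuple instead; the function name and message wording are illustrative assumptions:

```python
VALID_INTENTS = ("code", "confer", "planning", "exploration", "refinement", "generic")


def check_intent_override_env(environ):
    """Return None when the probe stays silent, else a (level, message) tuple."""
    raw = environ.get("HERMES_INTENT_OVERRIDE", "").strip()
    if not raw:
        # Silent when unset or empty: nothing to report.
        return None
    value = raw.lower()
    if value not in VALID_INTENTS:
        # Typo: warn and list every valid key so the operator can fix it.
        valid = ", ".join(VALID_INTENTS)
        return ("warn", f"HERMES_INTENT_OVERRIDE={raw!r} not recognized; valid keys: {valid}")
    if value == "code":
        return ("ok", "HERMES_INTENT_OVERRIDE=code (stable-layer narrowing active)")
    # Valid but non-code: recognized, passes through without narrowing in v1.
    return ("ok", f"HERMES_INTENT_OVERRIDE={value} (recognized; no narrowing in v1)")
```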

Test plan

  • 22 new prompt-narrowing tests pass — resolver enum + normalization, per-section drops under code, pass-through for non-code intents, typo falls back, byte-count regression, counter-case (default still includes), case-insensitive
  • 6 new doctor-probe tests pass
  • 258 total green across affected suites — no regression in the existing prompt-shape pins
  • After merge + container rebuild + ENV set: coding-groq / coding-gpt54 empirically emit tool calls with the narrowed prompt (the symptom tracked in #89, "Research: default Hermes system prompt overwhelms tool-emission on mid-tier coding models")

Composition with #89 / #97

  • A3 (this PR) — operator pins intent via env, narrowing is static per deployment
  • A1 (port classifier) + A2 (devagentic GraphQL surface for classifyIntent) — deferred follow-ups; only matter when dynamic per-turn classification is needed on the hermes side
  • devagentic#240 — the classifier output drives R5 dispatch on the devagentic side; this PR makes the same vocabulary actionable hermes-side too

…vances #89 Direction A)

Operator-supplied intent override (Option A3 from #97). When
``HERMES_INTENT_OVERRIDE=code`` is set, the system prompt's
``stable`` layer narrows for tool-call-heavy traffic — addresses
#89's prompt-saturation symptom on mid-tier coding models.

## What narrows under code intent

| Block | Action | Why |
|---|---|---|
| SOUL.md | Skip | Largest single contributor; falls back to short DEFAULT_AGENT_IDENTITY floor |
| HERMES_AGENT_HELP_GUIDANCE | Skip | Off-topic for tool-call traffic |
| SKILLS_GUIDANCE | Skip | Per-tool block, off-topic for code |
| KANBAN_GUIDANCE | Skip | Worker-lifecycle, off-topic for code |
| SESSION_SEARCH_GUIDANCE | Skip | Off-topic for code |
| skills_prompt (the big one) | Skip | Biggest contributor when many skills loaded |
| MEMORY_GUIDANCE | **Keep** | Small + sometimes useful even for code |
| TOOL_USE_ENFORCEMENT_GUIDANCE | **Keep** | Critical for tool emission |
| Per-model operational guidance | **Keep** | Model-quality-specific |
| Env / platform hints | **Keep** | Execution-environment essentials |
| nous-subscription + computer-use + alibaba | **Keep** | Operational invariants |
| ``context`` + ``volatile`` layers | **Untouched** | Out of scope per #97 |

Other intents (``confer`` / ``planning`` / ``exploration`` /
``refinement`` / ``generic``) are recognized as valid but pass
through without narrowing in v1 (keeps the door open for
per-intent shape later).

## Intent vocabulary

Matches devagentic#240's ``intent_classifier`` 6-key enum exactly,
so the same operator-side classifier that's wired into devagentic's
R5 dispatch hook can also drive hermes-side prompt narrowing
without a second vocabulary.

## Doctor probe

New ``_check_intent_override_env`` probe surfaces the active
override at ``hermes doctor`` time: silent when unset, check_ok
when valid (with a narrowing-active note for ``code``), check_warn
with the full valid-keys list when typo'd. Mirrors the
silent-when-irrelevant pattern from PR #95 / #96.

## Tests

- 22 new prompt-narrowing tests in
  ``tests/agent/test_system_prompt_intent_override.py``: resolver
  enum + normalization (5), per-section drops under code (7),
  pass-through for non-code intents (5), typo falls back (1),
  byte-count regression (1), default-still-includes counter-case (1),
  case-insensitive (1), runtime-vs-doctor-config sanity (1).
- 6 new doctor-probe tests in
  ``tests/hermes_cli/test_doctor_intent_override_probe.py``:
  silent-when-unset / silent-when-empty / code-ok-with-narrowing-note /
  non-code-valid-pass-through / typo-warn-with-valid-sample /
  case-insensitive.
- 258 total green across affected suites (system-prompt +
  prompt-builder + restore + doctor + provider-env + tools-subset).
  No regression in the existing prompt-shape pins.
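
As an illustration of the byte-count regression and default-still-includes cases, a pytest-style sketch is shown below. It runs against a toy `build_stable_layer` with placeholder block texts; the builder, its signature, and the block contents are hypothetical stand-ins, not the code under test in the real suite:

```python
# Placeholder block texts; the real blocks are much longer.
BLOCKS = {
    "SOUL.md": "persona text " * 50,
    "skills_prompt": "skill index " * 200,
    "TOOL_USE_ENFORCEMENT_GUIDANCE": "always emit tool calls",
}
SKIPPED_UNDER_CODE = {"SOUL.md", "skills_prompt"}


def build_stable_layer(intent=None):
    """Toy builder: join every block, skipping the narrowed ones under code."""
    names = [
        name for name in BLOCKS
        if not (intent == "code" and name in SKIPPED_UNDER_CODE)
    ]
    return "\n\n".join(BLOCKS[name] for name in names)


def test_code_intent_shrinks_stable_layer():
    default = build_stable_layer()
    narrowed = build_stable_layer("code")
    # Byte-count regression: the narrowed prompt must be strictly smaller.
    assert len(narrowed.encode("utf-8")) < len(default.encode("utf-8"))
    # Kept blocks survive narrowing.
    assert BLOCKS["TOOL_USE_ENFORCEMENT_GUIDANCE"] in narrowed


def test_default_still_includes_skipped_blocks():
    # Counter-case: without the override, nothing is dropped.
    default = build_stable_layer()
    assert BLOCKS["SOUL.md"] in default
    assert BLOCKS["skills_prompt"] in default
```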

## Composition note

Option A1 (port classifier) + A2 (devagentic GraphQL surface) are
deferred per the #97 sequencing — A3 unblocks deployment-specific
narrowing immediately; A1/A2 only matter when dynamic per-turn
classification is needed on the hermes side. The classifier output
on the devagentic side (devagentic#240) drives R5 dispatch decisions there.

PowerCreek merged commit dcc61ee into main on May 25, 2026.
PowerCreek deleted the intent-override-narrowing-97 branch on May 25, 2026 at 05:33.

Linked issue: Direction A scoping: per-intent system-prompt narrowing (#89 follow-up; composes with devagentic#237)

1 participant