
feat(plugins): skill-router — auto-load skills on first action tool call#19492

Closed
Cyrene963 wants to merge 18 commits into NousResearch:main from Cyrene963:feature/skill-router-plugin

Cyrene963 commented May 4, 2026


Closing: the keyword-based approach was rejected. Replaced by the Skill Evaluation Gate in PR #18316.

Nitrogen and others added 17 commits May 3, 2026 19:46
Merged changes from upstream PRs NousResearch#17380 and NousResearch#17349:
- SUMMARY_PREFIX: memory is ALWAYS authoritative, never background reference
- memory_manager: 'informational background data' -> 'authoritative reference data'
- Recompaction detection: shrink protect_first_n to avoid stale first exchange
- Compression note: memory remains fully authoritative after compaction
- Backward-compatible regex for new/old memory labels
- Regression tests for recompaction behavior
Implements three-layer hybrid skill selection:
- Layer 1: Fast rules (0 token) - greetings, simple questions
- Layer 2: Task patterns (0 token) - debug/github/system/research/etc.
- Layer 3: AI inference (future) - complex tasks

Integrates with prompt_builder.py build_skills_system_prompt_semantic().
Falls back to FTS5 when hybrid selection has no match.

Token savings: 99.2% vs broadcast, 93.2% vs FTS5-only.
Based on measured data from 39 test conversations.
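
A minimal sketch of the layered selection described above, for illustration only. The regex tables, the `select_skills` name, and the `fts_search` callback are assumptions; the actual patterns and the `build_skills_system_prompt_semantic()` integration are not shown in this PR text.

```python
import re
from typing import Callable, List

# Hypothetical rule tables: the real patterns are not given in the commit message.
GREETING_RE = re.compile(r"^\s*(hi|hello|hey|thanks|thank you)\b[\s!.?]*$", re.IGNORECASE)
TASK_PATTERNS = {
    "debug": re.compile(r"\b(traceback|stack trace|exception|debug)\b", re.IGNORECASE),
    "github": re.compile(r"\b(pull request|github|git rebase)\b", re.IGNORECASE),
}


def select_skills(message: str, fts_search: Callable[[str], List[str]]) -> List[str]:
    """Return skill names for a message, trying the cheapest layer first."""
    # Layer 1: fast rules. A bare greeting needs no skills at all.
    if GREETING_RE.match(message):
        return []
    # Layer 2: task patterns, still without spending any model tokens.
    matched = [name for name, pattern in TASK_PATTERNS.items() if pattern.search(message)]
    if matched:
        return matched
    # Layer 3 (AI inference) is listed as future work, so fall back to
    # full-text search when no rule matches. fts_search stands in for FTS5.
    return fts_search(message)
```

Passing the search function as a parameter keeps the sketch independent of any particular FTS5 wrapper.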
Community PRs applied:
- NousResearch#18596: Enable secret redaction by default (SECURITY)
- NousResearch#18650: Sanitize malformed tool messages + auto-recover on API 400
- NousResearch#18607: Emergency compression before max_iterations exhaustion
- NousResearch#18603: Compression fallback to main model on 413 rate limit
- NousResearch#18638: Pass threshold_percent on model switch
- NousResearch#18663: Strip extra_content from tool_calls for strict APIs
- NousResearch#18618: Forward explicit_api_key to OpenRouter
- NousResearch#18632: Show cache tokens in /insights breakdown
- NousResearch#18614: Add idempotency guard for patch duplicate loops
- NousResearch#18600: Raise ValueError when HERMES_HOME unset in profile mode
- NousResearch#18616: Allow ZWJ emoji in context files
- NousResearch#18582: Reload .env on /restart
- NousResearch#18547: Stabilize system prompt prefix for KV cache reuse
- NousResearch#18692: Strip FTS5 operators from session search truncation terms

Fix: Add order_by_last_active=True to list_sessions_rich call
(pre-existing commit 142b4bf code sync)
- MiMo models don't receive tool-use enforcement guidance (not in TOOL_USE_ENFORCEMENT_MODELS)
- Skill system is 'advisory' not 'enforcement' - LLM can ignore loaded skills
- No mechanism to verify response compliance with skill rules

1. Add 'mimo' to TOOL_USE_ENFORCEMENT_MODELS
2. Create MIMO_MODEL_EXECUTION_GUIDANCE for MiMo-specific enforcement
3. Add mandatory_skills config (skills that MUST be loaded before factual responses)
4. Add skill_enforcement config (verify responses against loaded skills)
5. Add _verify_skill_compliance() method for runtime verification

- agent/prompt_builder.py: Add MiMo to enforcement models, create MIMO guidance
- run_agent.py: Add config support, mandatory skills injection, compliance verification

- MiMo models now receive enforcement guidance (same as GPT/Gemini)
- Factual responses must verify from official sources
- Mandatory skills are loaded before any factual response
- Compliance verification catches violations before sending

- PR NousResearch#18316 (semantic skill retrieval)
- PR NousResearch#17380 (memory authority preservation)
- Issue: MiMo hallucination without verification
…kpoints

Addresses the 'having rules != following rules' problem where the agent
has skills and memory loaded but fails to follow their rules during
execution. The plugin hooks into pre_tool_call and triggers compliance
checkpoints every N action tool calls.

How it works:
- Tracks 'action tool' calls per session (terminal, write_file, patch,
  browser_*, delegate_task, cronjob, execute_code, etc.)
- Every 8 action calls, blocks with a COMPLIANCE CHECKPOINT message
- Agent must call skill_view/hindsight_recall/session_search to acknowledge
- Counter resets after acknowledgment
- Non-action tools (read_file, search_files, web_search) don't count
- Per-session isolation (different sessions tracked independently)

Complements PR NousResearch#18316 (hybrid skill selector):
- PR NousResearch#18316 = which skills to inject into system prompt
- This plugin = periodic verification that injected rules are followed

Configurable via _CHECKPOINT_INTERVAL constant (default: 8).

Tested scenarios:
- First 7 action tools pass, 8th triggers checkpoint
- Acknowledgment resets counter, next 8 pass
- read_file/search_files don't increment counter
- Sessions are independently tracked
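
The counter logic above can be sketched roughly as follows. Only `_CHECKPOINT_INTERVAL` and the tool names come from the commit message; the hook signature, the return convention (a string blocks, `None` allows), and the exact tool sets are assumptions. How the real plugin counts a retried call after a reset is not stated, so this sketch simply blocks on every 8th counted action.

```python
from collections import defaultdict
from typing import Dict, Optional

_CHECKPOINT_INTERVAL = 8  # constant named in the commit message

# Assumed tool sets built from the examples listed above; the real lists may differ.
ACTION_TOOLS = {"terminal", "write_file", "patch", "delegate_task", "cronjob", "execute_code"}
ACK_TOOLS = {"skill_view", "hindsight_recall", "session_search"}

# One counter per session, so sessions never affect each other.
_counts: Dict[str, int] = defaultdict(int)


def is_action_tool(name: str) -> bool:
    return name in ACTION_TOOLS or name.startswith("browser_")


def pre_tool_call(session_id: str, tool_name: str) -> Optional[str]:
    """Return a blocking message, or None to let the tool call proceed."""
    if tool_name in ACK_TOOLS:
        _counts[session_id] = 0  # acknowledgment resets the counter
        return None
    if not is_action_tool(tool_name):
        return None  # read-only tools such as read_file are never counted
    _counts[session_id] += 1
    if _counts[session_id] >= _CHECKPOINT_INTERVAL:
        return "COMPLIANCE CHECKPOINT: review loaded skills before continuing."
    return None
```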
When a gateway or CLI session uses /model to switch providers, the new
provider's credential_pool was silently dropped — ModelSwitchResult had
no such field, so switch_model() discarded what resolve_runtime_provider
returned.  The agent was created with credential_pool=None, making
_recover_with_credential_pool() a no-op on 429/402 errors.

Fix (combining the approach from NousResearch#16701 with CLI coverage):

- hermes_cli/model_switch.py: Add credential_pool field to
  ModelSwitchResult, capture it from resolve_runtime_provider() in
  both the explicit-provider and same-provider re-resolve paths.

- gateway/run.py: Propagate credential_pool from the result into
  session overrides and in-place agent.switch_model().  Update
  _session_model_overrides type hint from Dict[str, str] to
  Dict[str, Any] to accommodate the CredentialPool instance.

- cli.py: Both _apply_model_switch_result and _handle_model_switch
  pass result.credential_pool to agent.switch_model() and update
  self._credential_pool.

- run_agent.py: Accept credential_pool parameter in switch_model()
  and update self._credential_pool when provided.

Based on the analysis in NousResearch#16701 (briandevans).  Closes NousResearch#16678.
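
A simplified sketch of the propagation this fix describes. `ModelSwitchResult`, `credential_pool`, and `switch_model()` are named in the commit message; the other fields, the stand-in `Agent` class, and the `apply_switch` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ModelSwitchResult:
    model: str
    provider: str
    # The field the fix adds. Without it the resolved pool had nowhere to go.
    credential_pool: Optional[Any] = None


class Agent:
    """Stand-in for the real agent class (assumption)."""

    def __init__(self) -> None:
        self.model = ""
        self._credential_pool: Optional[Any] = None

    def switch_model(self, model: str, credential_pool: Optional[Any] = None) -> None:
        self.model = model
        # Update the pool only when one is provided, as the commit describes.
        if credential_pool is not None:
            self._credential_pool = credential_pool


def apply_switch(agent: Agent, result: ModelSwitchResult) -> None:
    # Hypothetical helper: both gateway and CLI paths forward the pool like this.
    agent.switch_model(result.model, credential_pool=result.credential_pool)
```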
- Remove regex-based fact_verification_gate (keyword matching)
- Add auto session_search alongside hindsight prefetch
- Inject retrieval results as system message (higher attention)
- Config: agent.auto_context.enabled, .session_limit
- Semantic search via hindsight embeddings, NOT keyword matching
…cation

Adds a command that:
1. Reads config.yaml to find Telegram home channel chat_id
2. Falls back to largest memory directory detection
3. Automatically migrates data and creates symlinks
4. Sets up global MEMORY.md/USER.md for CLI access

Addresses feedback on NousResearch#9308 overlap:
- NousResearch#9308 auto-detects owner at gateway layer (Honcho only)
- This provides auto-setup for owner + manual mapping for any user
- Works with all memory providers, not just Honcho

Commands: auto-setup, map, unmap, list, resolve, migrate
CLAUDE_CODE_OAUTH_TOKEN is an OAuth token scoped to Claude Code. When
Hermes runs in the same environment as Claude Code (common on dev
machines), it picks up this token and tries to use it for Anthropic
API requests. The Anthropic API rejects it with HTTP 400 because the
token is bound to Claude Code's OAuth client, not Hermes.

This causes silent auth failures or crash loops when the user has
Claude Code installed but hasn't explicitly configured Anthropic
credentials for Hermes.

Remove CLAUDE_CODE_OAUTH_TOKEN from:
- auth.py: Anthropic provider api_key_env_vars
- providers.py: Anthropic HermesOverlay extra_env_vars
- web_server.py: auth status endpoint fallback

The token is already in the terminal subprocess blocklist (local.py)
and marked as implicit in setup.py, so this completes the separation.
Users who want Anthropic access in Hermes should set ANTHROPIC_API_KEY
explicitly.

Fixes NousResearch#15080
Two related fixes for custom_providers model switching:

1. validate_requested_model() now recognizes custom:<name> slugs
   (e.g. custom:volcengine) as custom endpoints, not generic providers.
   Previously only the bare 'custom' slug matched the relaxed validation
   branch, causing model validation to fail with 'not found in provider
   listing' for all named custom providers.

2. switch_model() now consults the custom_providers list when deciding
   whether to override a validation rejection. If the requested model
   matches the entry's 'model' field or any key in its 'models' dict,
   the switch is accepted even when the remote /v1/models endpoint does
   not list it.

Both changes are covered by existing tests (86 passed).
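
The two checks might look roughly like this. The helper names and the `name` key on each `custom_providers` entry are assumptions; only the `custom:<name>` slug form and the `model` / `models` fields come from the commit message.

```python
from typing import Any, Dict, List


def is_custom_slug(provider: str) -> bool:
    """Treat both bare 'custom' and named 'custom:<name>' as custom endpoints."""
    return provider == "custom" or provider.startswith("custom:")


def model_declared_locally(
    provider: str, model: str, custom_providers: List[Dict[str, Any]]
) -> bool:
    """True if the config entry for a named custom provider lists the model."""
    if not provider.startswith("custom:"):
        return False
    name = provider.split(":", 1)[1]
    for entry in custom_providers:
        # Assumption: entries are matched by a 'name' key.
        if entry.get("name") != name:
            continue
        if entry.get("model") == model or model in (entry.get("models") or {}):
            return True
    return False
```

With a check like `model_declared_locally`, a switch can be accepted even when the remote `/v1/models` listing omits the model.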
…h#19286)

Add _validate_provider_credentials() to check if a provider has usable
credentials (API key, credential pool, or env var) before attempting
to create a client. This prevents silent 404 failures when a provider
is configured but has no API key.

In _resolve_auto() Step 1, now validates credentials before calling
resolve_provider_client(). If no credentials found, logs a warning
and immediately falls back to secondary providers (Step 2).

Fixes NousResearch#19286
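
A sketch of the validate-then-fall-back flow, under stated assumptions: providers are modeled as plain dicts, and `has_usable_credentials` and `resolve_auto` are illustrative stand-ins for `_validate_provider_credentials()` and `_resolve_auto()`, whose real signatures are not shown here.

```python
import logging
import os
from typing import Any, Dict, List, Mapping, Optional

logger = logging.getLogger(__name__)


def has_usable_credentials(
    provider: Dict[str, Any], env: Optional[Mapping[str, str]] = None
) -> bool:
    """Check for an API key, a credential pool, or a non-empty env var."""
    env = os.environ if env is None else env
    if provider.get("api_key") or provider.get("credential_pool"):
        return True
    return any(env.get(var) for var in provider.get("api_key_env_vars", ()))


def resolve_auto(
    primary: Dict[str, Any],
    fallbacks: List[Dict[str, Any]],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the name of the first provider that has credentials."""
    # Step 1: validate the primary provider before building a client for it.
    if has_usable_credentials(primary, env):
        return primary["name"]
    logger.warning("No credentials for %s; trying fallbacks", primary["name"])
    # Step 2: fall back to secondary providers in order.
    for provider in fallbacks:
        if has_usable_credentials(provider, env):
            return provider["name"]
    return None
```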
…tion

Solves the 'agent has skills but doesn't load them' problem.
Forces a skill-loading checkpoint on the FIRST action tool call.

Pattern: same as skill-enforcer but fires once at session start.
Complementary: router loads skills, enforcer ensures they're followed.

Enable: add 'skill-router' to plugins.enabled in config.yaml
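
The fire-once behavior can be sketched as below. The `SkillRouter` class, its hook signature, and the message text are assumptions for illustration; the commit message only states that the first action tool call in a session triggers a skill-loading checkpoint.

```python
from typing import Iterable, Optional, Set


class SkillRouter:
    """Blocks the first action tool call of each session, then stays silent."""

    def __init__(self, action_tools: Iterable[str]) -> None:
        self._action_tools = frozenset(action_tools)
        self._fired: Set[str] = set()  # sessions that already hit the checkpoint

    def pre_tool_call(self, session_id: str, tool_name: str) -> Optional[str]:
        """Return a blocking message, or None to let the call proceed."""
        if tool_name not in self._action_tools or session_id in self._fired:
            return None
        self._fired.add(session_id)
        return "SKILL ROUTER: load the relevant skills before acting."
```

Unlike a periodic enforcer, this gate keeps only a set of session ids, since it never needs to count calls.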
alt-glitch added labels on May 4, 2026: type/feature (New feature or request), P3 (Low: cosmetic, nice to have), comp/plugins (Plugin system and bundled plugins), tool/skills (Skills system: list, view, manage)
…ment)

Core code change: analyze user's first message during system prompt
construction and inject matching skill content directly. Agent doesn't
need to call skill_view — the content is already there.

How it works:
1. _build_system_prompt now accepts user_message parameter
2. New module agent/skill_auto_inject.py scans skill triggers
3. Chinese phrase matching (2+ char sequences) for trigger detection
4. Top 2 matching skills injected into <auto_loaded_skills> block
5. Only fires on first turn (system prompt is cached after that)

Combined enforcement chain:
- Level 4: Core code auto-injects skills (THIS COMMIT)
- Level 3: skill-router plugin blocks first action
- Level 3: skill-enforcer plugin periodic compliance
- Level 1: Skill rules in SKILL.md
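
A rough sketch of the trigger matching and injection steps. The function names and the scoring rule (count of triggers that appear in the message) are assumptions; the commit message only specifies phrase matching on sequences of 2+ characters, a top-2 cutoff, and the `<auto_loaded_skills>` block.

```python
from typing import Dict, List


def pick_skills(message: str, skills: Dict[str, List[str]], limit: int = 2) -> List[str]:
    """Rank skills by how many of their trigger phrases occur in the message."""
    scores: Dict[str, int] = {}
    for name, triggers in skills.items():
        # Ignore single-character triggers: only 2+ character phrases count.
        hits = sum(1 for t in triggers if len(t) >= 2 and t in message)
        if hits:
            scores[name] = hits
    # Highest score first; ties are broken by name for a stable order.
    return sorted(scores, key=lambda n: (-scores[n], n))[:limit]


def render_block(contents: Dict[str, str], names: List[str]) -> str:
    """Wrap the selected skill bodies in the block added to the system prompt."""
    if not names:
        return ""
    body = "\n\n".join(contents[n] for n in names)
    return f"<auto_loaded_skills>\n{body}\n</auto_loaded_skills>"
```

For example, with triggers `{"deploy": ["部署", "上线"], "debug": ["调试"]}` and the message `帮我调试并部署上线`, `pick_skills` ranks `deploy` ahead of `debug` because two of its triggers match.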


4 participants