feat(plugins): skill-router — auto-load skills on first action tool call#19492
Closed
Cyrene963 wants to merge 18 commits into
Merged changes from upstream PRs NousResearch#17380 and NousResearch#17349:
- SUMMARY_PREFIX: memory is ALWAYS authoritative, never background reference
- memory_manager: 'informational background data' -> 'authoritative reference data'
- Recompaction detection: shrink protect_first_n to avoid stale first exchange
- Compression note: memory remains fully authoritative after compaction
- Backward-compatible regex for new/old memory labels
- Regression tests for recompaction behavior
Implements three-layer hybrid skill selection:
- Layer 1: Fast rules (0 tokens) - greetings, simple questions
- Layer 2: Task patterns (0 tokens) - debug/github/system/research/etc.
- Layer 3: AI inference (future) - complex tasks

Integrates with prompt_builder.py build_skills_system_prompt_semantic(). Falls back to FTS5 when hybrid selection has no match.

Token savings: 99.2% vs broadcast, 93.2% vs FTS5-only. Based on measured data from 39 test conversations.
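The layered selection described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the regex tables, the `select_skills` name, and the `fts5_search` callback are all assumptions made for the example, and Layer 3 is left out because the commit marks it as future work.

```python
import re
from typing import Callable, List

# Hypothetical rule tables; the actual patterns are not given in the commit message.
GREETING_RE = re.compile(
    r"^\s*(hi|hello|hey|thanks|thank you)\b[\s!.?]*$", re.IGNORECASE
)
TASK_PATTERNS = {
    "debug": re.compile(r"\b(debug|traceback|stack trace|exception)\b", re.IGNORECASE),
    "github": re.compile(r"\b(github|pull request|pr|issue)\b", re.IGNORECASE),
}


def select_skills(message: str, fts5_search: Callable[[str], List[str]]) -> List[str]:
    """Return skill names for a message using rule layers, then an FTS5 fallback."""
    # Layer 1: fast rules. Greetings need no skills and cost no tokens.
    if GREETING_RE.match(message):
        return []
    # Layer 2: task patterns. Any pattern hit selects that skill category.
    matched = [name for name, pattern in TASK_PATTERNS.items() if pattern.search(message)]
    if matched:
        return matched
    # Layer 3 (AI inference) is future work; fall back to full-text search instead.
    return fts5_search(message)
```

Passing the search function in as a callback keeps the sketch independent of any particular FTS5 index.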
Community PRs applied:
- NousResearch#18596: Enable secret redaction by default (SECURITY)
- NousResearch#18650: Sanitize malformed tool messages + auto-recover on API 400
- NousResearch#18607: Emergency compression before max_iterations exhaustion
- NousResearch#18603: Compression fallback to main model on 413 rate limit
- NousResearch#18638: Pass threshold_percent on model switch
- NousResearch#18663: Strip extra_content from tool_calls for strict APIs
- NousResearch#18618: Forward explicit_api_key to OpenRouter
- NousResearch#18632: Show cache tokens in /insights breakdown
- NousResearch#18614: Add idempotency guard for patch duplicate loops
- NousResearch#18600: Raise ValueError when HERMES_HOME unset in profile mode
- NousResearch#18616: Allow ZWJ emoji in context files
- NousResearch#18582: Reload .env on /restart
- NousResearch#18547: Stabilize system prompt prefix for KV cache reuse
- NousResearch#18692: Strip FTS5 operators from session search truncation terms

Fix: Add order_by_last_active=True to list_sessions_rich call (pre-existing commit 142b4bf code sync)
- MiMo models don't receive tool-use enforcement guidance (not in TOOL_USE_ENFORCEMENT_MODELS)
- Skill system is 'advisory' not 'enforcement' - LLM can ignore loaded skills
- No mechanism to verify response compliance with skill rules

1. Add 'mimo' to TOOL_USE_ENFORCEMENT_MODELS
2. Create MIMO_MODEL_EXECUTION_GUIDANCE for MiMo-specific enforcement
3. Add mandatory_skills config (skills that MUST be loaded before factual responses)
4. Add skill_enforcement config (verify responses against loaded skills)
5. Add _verify_skill_compliance() method for runtime verification

- agent/prompt_builder.py: Add MiMo to enforcement models, create MIMO guidance
- run_agent.py: Add config support, mandatory skills injection, compliance verification

- MiMo models now receive enforcement guidance (same as GPT/Gemini)
- Factual responses must verify from official sources
- Mandatory skills are loaded before any factual response
- Compliance verification catches violations before sending

- PR NousResearch#18316 (semantic skill retrieval)
- PR NousResearch#17380 (memory authority preservation)
- Issue: MiMo hallucination without verification
…kpoints

Addresses the 'having rules != following rules' problem where the agent has skills and memory loaded but fails to follow their rules during execution. The plugin hooks into pre_tool_call and triggers compliance checkpoints every N action tool calls.

How it works:
- Tracks 'action tool' calls per session (terminal, write_file, patch, browser_*, delegate_task, cronjob, execute_code, etc.)
- Every 8 action calls, blocks with a COMPLIANCE CHECKPOINT message
- Agent must call skill_view/hindsight_recall/session_search to acknowledge
- Counter resets after acknowledgment
- Non-action tools (read_file, search_files, web_search) don't count
- Per-session isolation (different sessions tracked independently)

Complements PR NousResearch#18316 (hybrid skill selector):
- PR NousResearch#18316 = which skills to inject into system prompt
- This plugin = periodic verification that injected rules are followed

Configurable via _CHECKPOINT_INTERVAL constant (default: 8).

Tested scenarios:
- First 7 action tools pass, 8th triggers checkpoint
- Acknowledgment resets counter, next 8 pass
- read_file/search_files don't increment counter
- Sessions are independently tracked
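A rough sketch of the checkpoint counter described above, under stated assumptions: the `CheckpointTracker` class, its method signature, and the return convention (a string blocks the call, `None` allows it) are hypothetical, since the plugin's real hook interface is not shown here. The sketch also assumes the checkpoint keeps blocking until one of the acknowledgment tools is called.

```python
from collections import defaultdict
from typing import DefaultDict, Optional

CHECKPOINT_INTERVAL = 8  # mirrors the documented default of _CHECKPOINT_INTERVAL
ACTION_TOOLS = {"terminal", "write_file", "patch", "delegate_task", "cronjob", "execute_code"}
ACK_TOOLS = {"skill_view", "hindsight_recall", "session_search"}


def is_action_tool(tool_name: str) -> bool:
    # browser_* tools count as actions, matching the list in the commit message.
    return tool_name in ACTION_TOOLS or tool_name.startswith("browser_")


class CheckpointTracker:
    """Hypothetical per-session counter; not the plugin's actual class."""

    def __init__(self, interval: int = CHECKPOINT_INTERVAL) -> None:
        self.interval = interval
        self._counts: DefaultDict[str, int] = defaultdict(int)

    def pre_tool_call(self, session_id: str, tool_name: str) -> Optional[str]:
        """Return a block message when a checkpoint is due, else None."""
        if tool_name in ACK_TOOLS:
            self._counts[session_id] = 0  # acknowledgment resets the counter
            return None
        if not is_action_tool(tool_name):
            return None  # read-only tools never count
        if self._counts[session_id] >= self.interval - 1:
            # The interval-th action call is blocked until acknowledged.
            return "COMPLIANCE CHECKPOINT: review loaded skills before continuing"
        self._counts[session_id] += 1
        return None
```

Keying the counter by session ID is what gives the per-session isolation listed in the tested scenarios.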
When a gateway or CLI session uses /model to switch providers, the new provider's credential_pool was silently dropped: ModelSwitchResult had no such field, so switch_model() discarded what resolve_runtime_provider returned. The agent was created with credential_pool=None, making _recover_with_credential_pool() a no-op on 429/402 errors.

Fix (combining the approach from NousResearch#16701 with CLI coverage):
- hermes_cli/model_switch.py: Add credential_pool field to ModelSwitchResult, capture it from resolve_runtime_provider() in both the explicit-provider and same-provider re-resolve paths.
- gateway/run.py: Propagate credential_pool from the result into session overrides and in-place agent.switch_model(). Update _session_model_overrides type hint from Dict[str, str] to Dict[str, Any] to accommodate the CredentialPool instance.
- cli.py: Both _apply_model_switch_result and _handle_model_switch pass result.credential_pool to agent.switch_model() and update self._credential_pool.
- run_agent.py: Accept credential_pool parameter in switch_model() and update self._credential_pool when provided.

Based on the analysis in NousResearch#16701 (briandevans). Closes NousResearch#16678.
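The shape of the fix can be illustrated with a simplified sketch. The names `ModelSwitchResult`, `resolve_runtime_provider`, `switch_model`, and `credential_pool` come from the commit message, but every signature, field, and return shape below is an assumption, and the resolver is a stub standing in for the real one.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ModelSwitchResult:
    model: str
    provider: str
    # The field the fix adds; without it the pool had nowhere to travel.
    credential_pool: Optional[Any] = None


def resolve_runtime_provider(provider: str) -> Dict[str, Any]:
    # Stub: the real function's return shape is assumed, not known.
    return {"provider": provider, "credential_pool": [f"{provider}-key-1", f"{provider}-key-2"]}


def switch_model(model: str, provider: str) -> ModelSwitchResult:
    runtime = resolve_runtime_provider(provider)
    # Capture the pool instead of discarding it.
    return ModelSwitchResult(model, provider, credential_pool=runtime.get("credential_pool"))


class Agent:
    def __init__(self) -> None:
        self.model: Optional[str] = None
        self._credential_pool: Optional[Any] = None

    def switch_model(self, model: str, credential_pool: Optional[Any] = None) -> None:
        self.model = model
        if credential_pool is not None:  # only overwrite when a pool is provided
            self._credential_pool = credential_pool


result = switch_model("some-model", "some-provider")
agent = Agent()
agent.switch_model(result.model, credential_pool=result.credential_pool)
```

With the pool attached to the agent, a recovery path that rotates credentials on 429/402 has something to rotate through.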
- Remove regex-based fact_verification_gate (keyword matching)
- Add auto session_search alongside hindsight prefetch
- Inject retrieval results as system message (higher attention)
- Config: agent.auto_context.enabled, .session_limit
- Semantic search via hindsight embeddings, NOT keyword matching
…cation

Adds command that:
1. Reads config.yaml to find Telegram home channel chat_id
2. Falls back to largest memory directory detection
3. Automatically migrates data and creates symlinks
4. Sets up global MEMORY.md/USER.md for CLI access

Addresses feedback on NousResearch#9308 overlap:
- NousResearch#9308 auto-detects owner at gateway layer (Honcho only)
- This provides auto-setup for owner + manual mapping for any user
- Works with all memory providers, not just Honcho

Commands: auto-setup, map, unmap, list, resolve, migrate
CLAUDE_CODE_OAUTH_TOKEN is an OAuth token scoped to Claude Code. When Hermes runs in the same environment as Claude Code (common on dev machines), it picks up this token and tries to use it for Anthropic API requests. The Anthropic API rejects it with HTTP 400 because the token is bound to Claude Code's OAuth client, not Hermes.

This causes silent auth failures or crash loops when the user has Claude Code installed but hasn't explicitly configured Anthropic credentials for Hermes.

Remove CLAUDE_CODE_OAUTH_TOKEN from:
- auth.py: Anthropic provider api_key_env_vars
- providers.py: Anthropic HermesOverlay extra_env_vars
- web_server.py: auth status endpoint fallback

The token is already in the terminal subprocess blocklist (local.py) and marked as implicit in setup.py, so this completes the separation. Users who want Anthropic access in Hermes should set ANTHROPIC_API_KEY explicitly.

Fixes NousResearch#15080
Two related fixes for custom_providers model switching:

1. validate_requested_model() now recognizes custom:<name> slugs (e.g. custom:volcengine) as custom endpoints, not generic providers. Previously only the bare 'custom' slug matched the relaxed validation branch, causing model validation to fail with 'not found in provider listing' for all named custom providers.
2. switch_model() now consults the custom_providers list when deciding whether to override a validation rejection. If the requested model matches the entry's 'model' field or any key in its 'models' dict, the switch is accepted even when the remote /v1/models endpoint does not list it.

Both changes are covered by existing tests (86 passed).
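The two checks above might look something like the sketch below. Both helper names are hypothetical, and the assumption that each custom_providers entry carries a 'name' key matching the part after `custom:` is mine; only the 'model' and 'models' keys are mentioned in the commit message.

```python
from typing import Any, Dict, List


def is_custom_provider(slug: str) -> bool:
    """Treat both the bare 'custom' slug and 'custom:<name>' as custom endpoints."""
    return slug == "custom" or slug.startswith("custom:")


def model_declared_by_custom_provider(
    slug: str, model: str, custom_providers: List[Dict[str, Any]]
) -> bool:
    """Accept a model the config declares even if /v1/models does not list it."""
    if not slug.startswith("custom:"):
        return False
    name = slug.split(":", 1)[1]
    for entry in custom_providers:
        if entry.get("name") != name:  # 'name' key is an assumption
            continue
        # Match the single 'model' field or any key of the 'models' dict.
        if entry.get("model") == model or model in (entry.get("models") or {}):
            return True
    return False


providers = [{"name": "volcengine", "model": "model-a", "models": {"model-b": {}}}]
```

Here `model_declared_by_custom_provider("custom:volcengine", "model-b", providers)` is true because `model-b` is a key of the entry's `models` dict.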
…h#19286)

Add _validate_provider_credentials() to check if a provider has usable credentials (API key, credential pool, or env var) before attempting to create a client. This prevents silent 404 failures when a provider is configured but has no API key.

In _resolve_auto() Step 1, now validates credentials before calling resolve_provider_client(). If no credentials found, logs a warning and immediately falls back to secondary providers (Step 2).

Fixes NousResearch#19286
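As a hedged illustration of the credential check and fallback described above: the provider dictionaries, their keys (`api_key`, `credential_pool`, `env_vars`), and the simplified `resolve_auto` loop are assumptions for this sketch and do not reproduce the real `_resolve_auto()` steps.

```python
import logging
from typing import Any, List, Mapping, Optional

logger = logging.getLogger(__name__)


def validate_provider_credentials(
    provider: Mapping[str, Any], environ: Mapping[str, str]
) -> bool:
    """Return True if the provider has an API key, a credential pool, or a set env var."""
    if (provider.get("api_key") or "").strip():
        return True
    if provider.get("credential_pool"):
        return True
    return any((environ.get(name) or "").strip() for name in provider.get("env_vars", []))


def resolve_auto(
    providers: List[Mapping[str, Any]], environ: Mapping[str, str]
) -> Optional[str]:
    """Pick the first provider with usable credentials, warning on each one skipped."""
    for provider in providers:
        if validate_provider_credentials(provider, environ):
            return provider["name"]
        logger.warning("Provider %s has no usable credentials; trying next", provider["name"])
    return None


providers = [
    {"name": "primary", "api_key": "", "env_vars": ["PRIMARY_API_KEY"]},
    {"name": "secondary", "api_key": "sk-test", "env_vars": []},
]
```

Taking `environ` as a parameter (for example `os.environ` in real use) keeps the check easy to exercise without touching the process environment.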
…tion

Solves the 'agent has skills but doesn't load them' problem. Forces a skill-loading checkpoint on the FIRST action tool call.

Pattern: same as skill-enforcer but fires once at session start.

Complementary: router loads skills, enforcer ensures they're followed.

Enable: add 'skill-router' to plugins.enabled in config.yaml
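One way to read "fires once at session start" is sketched below. The `SkillRouter` class, the tool sets, and the choice to keep blocking action calls until a skill-loading tool is called are all assumptions; the commit message does not say whether the checkpoint blocks a single call or persists until acknowledged.

```python
from typing import Optional, Set

# Assumed tool names, borrowed from the skill-enforcer description.
ACTION_TOOLS = {"terminal", "write_file", "patch", "delegate_task", "cronjob", "execute_code"}
SKILL_LOAD_TOOLS = {"skill_view", "hindsight_recall", "session_search"}


class SkillRouter:
    """Hypothetical first-action gate; not the plugin's actual class."""

    def __init__(self) -> None:
        self._cleared: Set[str] = set()  # sessions that already loaded skills

    def pre_tool_call(self, session_id: str, tool_name: str) -> Optional[str]:
        """Return a block message for action calls made before skills are loaded."""
        if session_id in self._cleared:
            return None  # the gate never fires again for this session
        if tool_name in SKILL_LOAD_TOOLS:
            self._cleared.add(session_id)
            return None
        if tool_name in ACTION_TOOLS or tool_name.startswith("browser_"):
            return "SKILL CHECKPOINT: load relevant skills before the first action"
        return None
```

Unlike the periodic enforcer, this gate holds only a set of cleared sessions, so it adds no per-call counting once skills are loaded.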
…ment)

Core code change: analyze user's first message during system prompt construction and inject matching skill content directly. Agent doesn't need to call skill_view; the content is already there.

How it works:
1. _build_system_prompt now accepts user_message parameter
2. New module agent/skill_auto_inject.py scans skill triggers
3. Chinese phrase matching (2+ char sequences) for trigger detection
4. Top 2 matching skills injected into <auto_loaded_skills> block
5. Only fires on first turn (system prompt is cached after that)

Combined enforcement chain:
- Level 4: Core code auto-injects skills (THIS COMMIT)
- Level 3: skill-router plugin blocks first action
- Level 3: skill-enforcer plugin periodic compliance
- Level 1: Skill rules in SKILL.md
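The trigger matching and injection steps listed above could be approximated as in this sketch. The function names, the scoring rule (count of trigger phrases found in the message), and the sample skills are illustrative assumptions, not the contents of agent/skill_auto_inject.py.

```python
import re
from typing import Dict, List

# Runs of two or more CJK ideographs, per the "2+ char sequences" rule.
CJK_RUN = re.compile(r"[\u4e00-\u9fff]{2,}")


def rank_skills(message: str, triggers: Dict[str, List[str]], top_n: int = 2) -> List[str]:
    """Score each skill by how many of its trigger phrases appear in the message."""
    scored = []
    for name, trigger_texts in triggers.items():
        hits = sum(
            1 for text in trigger_texts for phrase in CJK_RUN.findall(text) if phrase in message
        )
        if hits:
            scored.append((hits, name))
    scored.sort(key=lambda item: (-item[0], item[1]))  # most hits first, then by name
    return [name for _, name in scored[:top_n]]


def build_auto_loaded_block(names: List[str], contents: Dict[str, str]) -> str:
    """Wrap the selected skill bodies in an <auto_loaded_skills> block."""
    if not names:
        return ""
    body = "\n\n".join(contents[name] for name in names)
    return f"<auto_loaded_skills>\n{body}\n</auto_loaded_skills>"


# Sample triggers: "调试 错误" = "debug, error"; "部署" = "deploy"; "搜索" = "search".
triggers = {"debug": ["调试 错误"], "deploy": ["部署"], "search": ["搜索"]}
contents = {"debug": "Debug rules", "deploy": "Deploy rules", "search": "Search rules"}
message = "帮我调试这个错误然后部署"  # "help me debug this error and then deploy"
selected = rank_skills(message, triggers)
block = build_auto_loaded_block(selected, contents)
```

Capping the result at two skills bounds how much text lands in the system prompt, which matters because the prompt is cached after the first turn.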
Closing — keyword-based approach rejected. Replaced by Skill Evaluation Gate in PR #18316.