feat(plugins): add llm-switch example plugin for local server management#3672

Open
crxssrazr93 wants to merge 1 commit into
NousResearch:mainfrom
crxssrazr93:feat/llm-switch-plugin

@crxssrazr93

Summary

Example plugin demonstrating the lifecycle hooks activated in #3542. Auto-manages a local llama-server (or any OpenAI-compatible server) when the active model matches a locally configured model name.

This is a plugin-only PR — no core changes. All hook infrastructure was already merged in #3542.

Supersedes #2930 (which included core hook patches before #3542 was merged).

What it does

  • pre_llm_call hook: Detects when the active model name matches a key in models.yaml. If the correct server isn't running, starts it automatically before the LLM call proceeds.
  • on_session_end hook: Kills the server when the session ends.
  • switch_local_llm tool: Mid-session model switching — the agent calls this when asked "switch to the code model". Swaps the server behind the scenes while the endpoint stays the same.
  • Declarative YAML config: Define models with GGUF paths, context sizes, KV cache quantization, and sampling params. Replaces shell scripts.
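
The decision the pre_llm_call hook has to make can be sketched as a small pure function. This is an illustrative sketch only, not the plugin's code: the name `plan_server_action` and the returned action strings are hypothetical, and the real hook signature from #3542 is not shown here.

```python
from typing import Optional


def plan_server_action(
    active_model: str,
    configured: set[str],
    running: Optional[str],
) -> str:
    """Decide what to do before an LLM call (hypothetical helper).

    active_model: model name the session is currently using.
    configured:   keys found under `models:` in models.yaml.
    running:      key of the model the managed server is serving, or None.
    """
    if active_model not in configured:
        # Not a locally managed model (e.g. a hosted API): leave it alone.
        return "ignore"
    if running == active_model:
        # The right server is already up.
        return "noop"
    if running is None:
        # Nothing is running yet: first message of the session.
        return "start"
    # A different local model is running: stop it and start the new one.
    return "swap"
```

With the example config below, `plan_server_action("code", {"write", "code"}, "write")` yields `"swap"`, while a model name that is not a key in models.yaml yields `"ignore"`, so non-local providers are untouched.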

Example models.yaml

server:
  binary: llama-server
  models_dir: ~/llama-models
  port: 8080
  gpu_layers: 99
  flash_attention: true

models:
  write:
    description: "SEO articles and content briefs"
    gguf: qwen3.5-9b/Qwen3.5-9B-UD-Q6_K_XL.gguf
    context: 49152
    kv_cache: { key: q8_0, value: q4_0 }
    sampling: { temp: 0.7, top_p: 0.8, top_k: 20 }

  code:
    description: "Agentic coding and tool calling"
    gguf: omnicoder-9b/omnicoder-9b-q4_k_m.gguf
    context: 65536
    sampling: { temp: 0.6, top_p: 0.95 }
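
One way such a config could be turned into a llama-server command line is sketched below. The function name `build_command` and the choice to pass sampling values as server flags are assumptions made for illustration; the plugin's server.py may map the fields differently. `flash_attention` is deliberately left out because the flag's syntax differs between llama.cpp builds.

```python
from pathlib import Path


def build_command(server: dict, model: dict) -> list[str]:
    """Build an argv list from the `server:` block and one `models:` entry.

    Hypothetical helper: the flag mapping is an assumption, not the
    plugin's confirmed behaviour.
    """
    gguf = Path(server["models_dir"]).expanduser() / model["gguf"]
    cmd = [
        server["binary"],
        "--model", str(gguf),
        "--ctx-size", str(model["context"]),
        "--port", str(server["port"]),
        "--n-gpu-layers", str(server["gpu_layers"]),
    ]

    # KV cache quantization is optional per model.
    kv = model.get("kv_cache") or {}
    if "key" in kv:
        cmd += ["--cache-type-k", str(kv["key"])]
    if "value" in kv:
        cmd += ["--cache-type-v", str(kv["value"])]

    # Only emit sampling flags that the model entry actually sets.
    sampling = model.get("sampling") or {}
    for field, flag in (("temp", "--temp"), ("top_p", "--top-p"), ("top_k", "--top-k")):
        if field in sampling:
            cmd += [flag, str(sampling[field])]

    return cmd
```

Returning a list rather than a shell string means the result can be handed to `subprocess.Popen` directly, without shell quoting.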

User flow

  1. hermes model → select custom provider → pick model name matching models.yaml key
  2. Start chatting → pre_llm_call hook auto-starts the server on first message
  3. Mid-session: "switch to the code model" → agent calls switch_local_llm tool → server swaps
  4. Exit → on_session_end kills server
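
Steps 3 and 4 amount to keeping at most one child process behind the fixed port. A rough sketch of that idea, assuming a hypothetical `ServerSlot` class (not taken from the plugin) and using a sleeping Python process as a stand-in for llama-server:

```python
import subprocess
import sys
from typing import Optional

# Stand-in command so the sketch runs without llama-server installed.
STAND_IN = [sys.executable, "-c", "import time; time.sleep(30)"]


class ServerSlot:
    """Holds at most one server process (hypothetical illustration)."""

    def __init__(self) -> None:
        self.model: Optional[str] = None
        self.proc: Optional[subprocess.Popen] = None

    def stop(self, grace: float = 5.0) -> None:
        """Terminate the current process, escalating to kill after `grace` seconds."""
        if self.proc is None:
            return
        if self.proc.poll() is None:
            self.proc.terminate()
            try:
                self.proc.wait(timeout=grace)
            except subprocess.TimeoutExpired:
                self.proc.kill()
                self.proc.wait()
        self.proc, self.model = None, None

    def swap(self, model: str, command: list[str]) -> None:
        """Stop whatever is running, then start `command` for `model`."""
        self.stop()
        self.proc = subprocess.Popen(command)
        self.model = model
```

Because the old process is stopped before the new one starts, the port never has two servers competing for it, which is what lets the endpoint stay the same across a switch. Calling `stop()` at session end covers step 4.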

Relationship to other PRs

Complements #3360 and #3548, which restore /model as a slash command. Once those are merged, /model custom:write would trigger the pre_llm_call hook and start the matching server.

Changes

6 new files in docs/llm-switch-plugin-example/:

| File | Purpose |
| --- | --- |
| plugin.yaml | Manifest |
| __init__.py | Registration, hook handlers, tool handler |
| schemas.py | Tool schema for switch_local_llm |
| server.py | Pure Python server lifecycle (start, stop, health check) |
| models.yaml.example | Example config with full schema documentation |
| README.md | Setup instructions, usage, and config reference |
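
A health check of the kind server.py is described as providing could look roughly like the sketch below. The helper names `http_probe` and `wait_until_healthy` are hypothetical, and probing a `/health` URL assumes a llama-server style endpoint; other OpenAI-compatible servers may need a different URL.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 1.0) -> bool:
    """Return True if `url` answers with HTTP 200, False on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status while loading.
        return False


def wait_until_healthy(
    probe: Callable[[], bool],
    timeout: float = 60.0,
    interval: float = 0.5,
) -> bool:
    """Poll `probe` until it succeeds or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller might use `wait_until_healthy(lambda: http_probe("http://127.0.0.1:8080/health"))` after starting the server, so the LLM call only proceeds once the model has finished loading. Taking the probe as a callable keeps the polling loop testable without a running server.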

Testing

  • Plugin is self-contained — copy to ~/.hermes/plugins/llm-switch/, add models.yaml, verify with /plugins
  • No existing code is modified — zero regression risk

Platforms tested

  • Linux (Arch)

Example plugin demonstrating the lifecycle hooks activated in NousResearch#3542.
Auto-manages a local llama-server (or any OpenAI-compatible server) when
the active model matches a locally configured model name.

Features:
- pre_llm_call hook: auto-starts the correct server on first message
  when hermes is configured with a local model name
- on_session_end hook: kills the server on exit
- switch_local_llm tool: mid-session model switching — the agent swaps
  the server when asked ("switch to the code model")
- Declarative YAML config for model definitions (GGUF paths, context
  sizes, KV cache quantization, sampling params) replacing shell scripts

The plugin is self-contained in docs/llm-switch-plugin-example/ with a
README, example config, and full implementation. Users copy it to
~/.hermes/plugins/llm-switch/ to install.

Complements NousResearch#3360 and NousResearch#3548, which restore /model as a
slash command. Once those are merged, /model custom:write would trigger the
pre_llm_call hook and auto-start the matching server.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@alt-glitch added the type/feature (New feature or request), P3 (Low: cosmetic, nice to have), and comp/plugins (Plugin system and bundled plugins) labels on May 2, 2026