Feature Request: User-Configurable Multi-Model Routing with Capability Categories and Evaluation Feedback

## Summary

Enable end users to configure multiple LLMs across defined capability categories (e.g., speed, intelligence, uncensored, low-cost, reasoning-heavy), and allow tools to request models based on declared requirements rather than relying on a single developer-defined model.

This would introduce a flexible model-routing layer where:

* Users assign models to capability categories.
* Tools specify their needs (e.g., “fast + cheap” vs “high reasoning”).
* The runtime resolves the appropriate model dynamically.
* Optional evaluation metrics help refine model selection over time.

---

## Motivation

Currently, tool developers implicitly choose which model is used. However, different users have different priorities:

* Some prioritize cost efficiency.
* Some prioritize maximum reasoning depth.
* Some need uncensored models.
* Some want ultra-low latency.
* Some may run local models for privacy.

Allowing users to define model assignments per capability category increases:

* Flexibility
* Transparency
* Performance tuning
* Cost control
* Adaptability to new models

It also decouples tool design from specific model vendors.

---

## Proposed Architecture

### 1. Model Capability Categories

Allow users to define models per category, for example:

```yaml
models:
  fast:
    - gpt-4o-mini
    - mistral-small
  reasoning:
    - gpt-4o
    - claude-opus
  uncensored:
    - local-llama
  cheap:
    - gpt-4o-mini
```

These categories are user-configurable.
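As a rough illustration, the configuration above could be held in memory as a small user-owned registry. This is only a sketch: `ModelRegistry`, `assign`, and `models_for` are hypothetical names, not part of any existing API, and the model names are simply those from the YAML example.

```python
from dataclasses import dataclass, field


@dataclass
class ModelRegistry:
    """Hypothetical in-memory form of the user's `models:` config section."""

    # capability category -> model names, in the user's preferred order
    categories: dict[str, list[str]] = field(default_factory=dict)

    def assign(self, category: str, model: str) -> None:
        """Add a model to a category, creating the category if needed."""
        models = self.categories.setdefault(category, [])
        if model not in models:
            models.append(model)

    def models_for(self, category: str) -> list[str]:
        """Return a copy of the models for a category (empty if unmapped)."""
        return list(self.categories.get(category, []))


# Mirrors the YAML example above.
registry = ModelRegistry({
    "fast": ["gpt-4o-mini", "mistral-small"],
    "reasoning": ["gpt-4o", "claude-opus"],
    "uncensored": ["local-llama"],
    "cheap": ["gpt-4o-mini"],
})

# The user later decides that mistral-small also counts as cheap.
registry.assign("cheap", "mistral-small")
```

Keeping the lists ordered lets the first entry act as the preferred model for a category, which is one simple way to express user preference without extra weighting fields.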

---

### 2. Tool-Level Model Requirements

When a tool calls the LLM, it declares its needs:

```python
call_llm(
    task="parse structured JSON",
    requirements={
        "speed": "high",
        "reasoning": "low"
    }
)
```

The runtime then selects an appropriate model based on user configuration.

This prevents overusing large models when smaller ones are sufficient.
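One possible shape for that selection step is sketched below. It assumes a fixed, hypothetical table (`REQUIREMENT_TO_CATEGORY`) that translates requirement pairs such as `("speed", "high")` into user categories; the proposal does not specify this mapping, and `resolve_model` is an illustrative name rather than an existing function.

```python
# Hypothetical translation from tool requirements to user categories.
REQUIREMENT_TO_CATEGORY = {
    ("speed", "high"): "fast",
    ("reasoning", "high"): "reasoning",
    ("cost", "low"): "cheap",
}


def resolve_model(
    requirements: dict[str, str],
    user_models: dict[str, list[str]],
    default_model: str,
) -> str:
    """Pick a model for a tool call from the user's category mapping."""
    # Requirements with no known category (e.g. "reasoning": "low") are ignored.
    wanted = [
        REQUIREMENT_TO_CATEGORY[item]
        for item in requirements.items()
        if item in REQUIREMENT_TO_CATEGORY
    ]
    # Keep only categories the user has actually populated.
    candidates = [user_models[c] for c in wanted if user_models.get(c)]
    if not candidates:
        return default_model
    # Prefer a model that appears under every requested category.
    for model in candidates[0]:
        if all(model in other for other in candidates[1:]):
            return model
    # Otherwise fall back to the top choice of the first requested category.
    return candidates[0][0]


user_models = {
    "fast": ["gpt-4o-mini", "mistral-small"],
    "reasoning": ["gpt-4o", "claude-opus"],
    "cheap": ["gpt-4o-mini"],
}

chosen = resolve_model(
    {"speed": "high", "reasoning": "low"}, user_models, default_model="gpt-4o"
)
```

With the configuration shown, the JSON-parsing call from the example resolves to `gpt-4o-mini`, since only the `fast` category is requested and that model is listed first.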

---

### 3. Dynamic Category Resolution and User-Driven Assignment

Tools should be able to request capability categories or reasoning levels dynamically, without requiring the framework to predefine every possible category. If a tool requests a capability that the user has not yet mapped (e.g., `"deep_reasoning_level_3"` or `"creative_uncensored_longform"`), the system should fall back gracefully to the default model.

At the same time, the unresolved request should appear in the user configuration as an “unassigned capability.” The user can then:

* Link that capability to an existing category.
* Assign a specific model.
* Define a new routing rule.

This creates a feedback loop in which the routing layer evolves from actual tool demands rather than exhaustive upfront configuration. Capability taxonomies emerge from real-world usage instead of being hardcoded, which keeps the routing layer extensible as new tools and models appear.

---

### 4. Evaluation Layer (Optional but Powerful)

Add an optional evaluation mode where:

* The LLM (or a secondary model) evaluates output correctness.
* Success/failure stats are logged per tool-model pairing.
* Developers can analyze performance tradeoffs.

Example stored metrics:

```json
{
  "tool": "json_parser",
  "model": "gpt-4o-mini",
  "success_rate": 0.94,
  "avg_latency_ms": 120,
  "avg_cost": 0.0003
}
```

This would allow:

* Data-driven model routing
* Automatic optimization
* Tool-specific model recommendations

---

## Benefits

* Decouples tool logic from fixed model assumptions
* Empowers users to control cost, performance, and censorship level
* Enables adaptive routing strategies
* Future-proofs the agent against rapid model evolution
* Creates a foundation for self-optimizing agents

---

## Open Questions

* Should routing be rule-based, weighted, or hybrid?
* Should evaluation be user opt-in?
* Should tools declare “minimum viable intelligence” levels?
* Should there be fallback chains if a preferred model fails?
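For the fallback-chain question, one candidate behavior is to try the models of a category in order until one succeeds. The sketch below is only meant to ground the discussion: `call_with_fallback` is a hypothetical helper, `fake_call` stands in for a real model invocation, and catching every `Exception` is a simplification a real runtime would likely narrow.

```python
from typing import Callable


def call_with_fallback(models: list[str], call: Callable[[str], str]) -> tuple[str, str]:
    """Try each model in order; return (model_used, response) for the first success."""
    errors: list[str] = []
    for model in models:
        try:
            return model, call(model)
        except Exception as exc:  # simplification: a real runtime might catch narrower errors
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def fake_call(model: str) -> str:
    """Stand-in for a real model invocation; simulates one model being down."""
    if model == "gpt-4o":
        raise TimeoutError("simulated outage")
    return f"response from {model}"


# The ordered list would come from the user's category mapping, e.g. "reasoning".
used_model, response = call_with_fallback(["gpt-4o", "claude-opus"], fake_call)
```

Whether a fallback should stay within one category or cross into another (for example from `reasoning` to `fast`) remains part of the open question.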

---

## Why This Matters

As model ecosystems diversify (open weights, closed APIs, local models, etc.), a single-model architecture becomes limiting.

A user-configurable routing layer positions Hermes Agent as:

* Vendor-neutral
* Cost-aware
* Performance-tunable
* Adaptable to future model ecosystems

This also aligns with the philosophy of modular, agentic systems rather than monolithic LLM binding.

