Problem Statement
Currently, Hermes determines model capabilities (vision, reasoning, tools) through:
- The models.dev database (https://models.dev/api.json)
- Hardcoded patterns in the codebase
This works well for popular cloud models, but breaks for custom/local models that:
- Are not listed in models.dev
- Run on private/internal API endpoints
- Have custom fine-tunes with different capabilities than the base model
- Use proxies/gateways that report generic model names
Current behavior
When using a custom provider with a local model:
    custom_providers:
      - name: my-local-vllm
        base_url: http://localhost:8000/v1
        api_key: dummy
        models:
          my-llava-model:
            context_length: 8192
Hermes has no way to know what my-llava-model supports, so:
- The vision_analyze tool falls back to the auxiliary vision model (an extra API call)
- browser_vision uses the auxiliary model instead of native passthrough
- Reasoning parameters are not sent even if the model supports them
- Tool schemas may be incorrectly filtered
Users discover these limitations through degraded performance or cryptic errors, not upfront configuration.
Proposed Solution
Extend the custom_providers configuration to allow explicit capability declaration per model:
    custom_providers:
      - name: my-local-vllm
        base_url: http://localhost:8000/v1
        api_key: dummy
        models:
          my-llava-model:
            context_length: 8192
            capabilities:
              vision: true       # Supports image input
              reasoning: false   # Does not support reasoning params
              tools: true        # Supports function calling
              streaming: true    # Supports streaming output
          my-reasoning-model:
            context_length: 32768
            capabilities:
              vision: false
              reasoning: true    # Supports reasoning_effort parameter
              tools: true
              streaming: true
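As a rough sketch of how the declared block could be read once the YAML is parsed, the helper below walks a plain dict shaped like the example above. The function name declared_capabilities and the in-memory config layout are assumptions for illustration, not existing Hermes code.

```python
from typing import Any, Optional


def declared_capabilities(
    config: dict[str, Any], provider: str, model: str
) -> dict[str, Optional[bool]]:
    """Return the capabilities declared for provider/model, or {} if none.

    Hypothetical helper: assumes `config` is the parsed YAML as a dict.
    """
    for entry in config.get("custom_providers") or []:
        if entry.get("name") != provider:
            continue
        model_cfg = (entry.get("models") or {}).get(model) or {}
        # A missing or empty `capabilities` key means "nothing declared".
        return dict(model_cfg.get("capabilities") or {})
    return {}


# Parsed form of the example configuration (trimmed).
config = {
    "custom_providers": [
        {
            "name": "my-local-vllm",
            "base_url": "http://localhost:8000/v1",
            "models": {
                "my-llava-model": {
                    "context_length": 8192,
                    "capabilities": {"vision": True, "reasoning": False},
                },
            },
        }
    ]
}

print(declared_capabilities(config, "my-local-vllm", "my-llava-model"))
```

Returning an empty dict for unknown providers or models keeps callers simple: they can treat every missing key as "not declared" without special cases.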
Design Details
1. Configuration Schema
Extend _VALID_CUSTOM_PROVIDER_FIELDS in hermes_cli/config.py to accept a capabilities dict:
    _VALID_CUSTOM_PROVIDER_FIELDS = {
        "name", "base_url", "api_key", "api_mode", "model", "models",
        "context_length", "rate_limit_delay",  # existing
        # Proposed new fields inside models.<model>:
        #   capabilities.vision: bool
        #   capabilities.reasoning: bool
        #   capabilities.tools: bool
        #   capabilities.streaming: bool
    }
2. Capability Resolution Order
When determining model capabilities, Hermes should check sources in this priority order, highest first:
- Explicit config override (custom_providers[].models.<model>.capabilities)
- models.dev database (if available)
- Built-in pattern matching (regex on model name)
- Conservative defaults (tools: true, vision/reasoning: false)
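The priority list above could be sketched as a per-key fallback chain. Everything named here (resolve_capabilities, NAME_PATTERNS, the shape of models_dev_entry) is hypothetical, and the regex is only a placeholder for whatever patterns Hermes actually ships. The streaming key is left out because the proposal does not state its default.

```python
import re
from typing import Mapping, Optional

# Conservative defaults as listed in the proposal.
DEFAULTS = {"tools": True, "vision": False, "reasoning": False}

# Hypothetical name patterns, for illustration only.
NAME_PATTERNS = {"vision": re.compile(r"llava|-vl\b", re.IGNORECASE)}


def resolve_capabilities(
    model: str,
    declared: Mapping[str, Optional[bool]],
    models_dev_entry: Optional[Mapping[str, Optional[bool]]] = None,
) -> dict[str, bool]:
    """Resolve each capability independently, highest-priority source first."""
    resolved = {}
    for key, default in DEFAULTS.items():
        if declared.get(key) is not None:
            # 1. Explicit config override (null/None means "auto-detect").
            resolved[key] = bool(declared[key])
        elif models_dev_entry is not None and models_dev_entry.get(key) is not None:
            # 2. models.dev database entry, if one was found.
            resolved[key] = bool(models_dev_entry[key])
        elif key in NAME_PATTERNS and NAME_PATTERNS[key].search(model):
            # 3. Built-in pattern matching on the model name.
            resolved[key] = True
        else:
            # 4. Conservative default.
            resolved[key] = default
    return resolved


print(resolve_capabilities("my-custom-model", {"vision": True}))
```

Resolving per key rather than per model means a config that declares only vision still gets auto-detection for reasoning and tools, which matches the "only declared capabilities override auto-detection" rule.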
3. Integration Points
Modify these functions to respect config overrides:
- agent/models_dev.py:get_model_capabilities()
  - Add parameter to accept config override
  - Check custom_providers before falling back to models.dev
- run_agent.py:AIAgent._check_native_vision_support()
  - Check config override before pattern matching
- run_agent.py:AIAgent._supports_reasoning_extra_body()
  - Check config override for reasoning capability
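To illustrate what the two run_agent.py checks would change downstream, here is a small stand-in that turns resolved capabilities into request decisions. The function plan_request, the "native"/"auxiliary" labels, and placing reasoning_effort in an extra_body dict are assumptions for this sketch, not the actual AIAgent internals.

```python
from typing import Any, Mapping, Optional


def plan_request(
    capabilities: Mapping[str, bool],
    has_image: bool,
    reasoning_effort: Optional[str],
) -> dict[str, Any]:
    """Decide image routing and reasoning parameters from capabilities.

    Hypothetical stand-in for the decisions made in run_agent.py.
    """
    plan: dict[str, Any] = {"image_route": None, "extra_body": {}}
    if has_image:
        # Native passthrough only when vision is known to be supported;
        # otherwise fall back to the auxiliary vision model.
        plan["image_route"] = "native" if capabilities.get("vision") else "auxiliary"
    if reasoning_effort and capabilities.get("reasoning"):
        # Send reasoning parameters only to models that accept them.
        plan["extra_body"]["reasoning_effort"] = reasoning_effort
    return plan


print(plan_request({"vision": True, "reasoning": False}, True, "high"))
```

With the my-llava-model declaration from the proposal, the image goes straight to the model and no reasoning parameter is attached.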
4. Validation & UX
- Config validation: Warn if capabilities contains unknown keys
- hermes doctor: Check that declared capabilities match detected ones (warn on mismatch)
- Setup wizard: When adding custom provider, optionally ask about capabilities

    $ hermes setup custom
    ...
    Does my-llava-model support vision/multimodal? [y/N]: y
    Does it support tool calling? [Y/n]: y
    Does it support reasoning parameters? [y/N]: n
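The validation and doctor checks listed above might look roughly like the following. Both function names and the warning wording are hypothetical. The type check on values is an extra assumption beyond the unknown-key warning the proposal asks for.

```python
from typing import Any, Mapping, Optional

KNOWN_CAPABILITIES = {"vision", "reasoning", "tools", "streaming"}


def validate_capabilities(model: str, capabilities: Mapping[str, Any]) -> list[str]:
    """Return warnings for unknown keys and non-boolean values."""
    warnings = []
    for key, value in capabilities.items():
        if key not in KNOWN_CAPABILITIES:
            warnings.append(f"{model}: unknown capability '{key}'")
        elif value is not None and not isinstance(value, bool):
            warnings.append(f"{model}: capability '{key}' should be true, false, or null")
    return warnings


def capability_mismatches(
    declared: Mapping[str, Optional[bool]], detected: Mapping[str, bool]
) -> list[str]:
    """List keys where an explicit declaration disagrees with detection."""
    return sorted(
        key
        for key, value in declared.items()
        if value is not None and key in detected and detected[key] != value
    )


for warning in validate_capabilities("my-llava-model", {"vision": True, "visoin": True}):
    print("warning:", warning)
```

Returning warnings as a list rather than raising keeps a typo in one model's capabilities from blocking the rest of the config from loading.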
Benefits
- Correct behavior for local models: LLaVA, Qwen-VL, local fine-tunes work correctly
- Reduced API costs: No unnecessary auxiliary model calls for vision
- Better UX: Users configure once, Hermes behaves correctly everywhere
- Foundation for future features: Enables smart routing based on declared capabilities
Backwards Compatibility
- Fully backwards compatible: Existing configs without capabilities continue to work
- Optional field: All capability fields default to null (auto-detect)
- Explicit over implicit: Only declared capabilities override auto-detection
    # Old config continues to work unchanged
    custom_providers:
      - name: legacy-config
        base_url: http://localhost:8000/v1
        models:
          some-model:
            context_length: 4096
            # capabilities omitted - auto-detect as before
Related Issues
Implementation Notes
This feature builds on the foundation laid by:
The implementation should follow the same pattern used for context_length override.
Checklist
- DEFAULT_CONFIG schema for custom_providers
- _VALID_CUSTOM_PROVIDER_FIELDS
- get_model_capabilities() to check config override
- run_agent.py
- run_agent.py
- validate_config_structure()
- hermes doctor checks