Describe the bug
When the primary provider hits a rate limit, Hermes falls back to a model whose provider is defined in custom_providers. However, the fallback request is sent to the primary provider's base_url instead of the fallback provider's own base_url, causing errors.
To Reproduce
- Configure a primary provider that you know will hit rate limiting (e.g., Nous Portal with rate_limit: True, or a busy account):

    model:
      provider: nous
      model: deepseek/deepseek-v4-pro
      base_url: https://inference-api.nousresearch.com/v1

- Define a fallback model whose provider lives in custom_providers:

    fallback_model:
      provider: aliyun-singapore
      model: qwen3.6-plus
    custom_providers:
      - name: aliyun-singapore
        base_url: https://dashscope.aliyuncs.com/compatible-mode/v1
        api_key: ${ALIYUN_SINGAPORE_API_KEY}
        models:
          qwen3.6-plus:
            context_length: 1000000

- Trigger a rate limit on the primary provider.
- Observe the fallback request.
Expected behavior
The fallback request should be sent to https://dashscope.aliyuncs.com/compatible-mode/v1 (the aliyun-singapore provider's own base_url), using Authorization: Bearer <aliyun-singapore-api-key>.
Actual behavior
The fallback request is sent to https://inference-api.nousresearch.com/v1 (the primary nous provider's base_url), which naturally doesn't have the model qwen3.6-plus, resulting in:
⚠️ API call failed (attempt 1/3): NotFoundError [HTTP 404]
🔌 Provider: aliyun-singapore Model: qwen3.6-plus
🌐 Endpoint: https://inference-api.nousresearch.com/v1
📝 Error: HTTP 404: Model 'qwen3.6-plus' not found. The requested model does not exist in our configuration or OpenRouter catalog.
Root cause hypothesis
The fallback resolution logic correctly identifies the provider name (aliyun-singapore), but fails to look up its base_url from the custom_providers list. Instead it inherits the primary provider's base_url. This likely affects all fallback providers defined in custom_providers — they all end up hitting the primary provider's endpoint with an incompatible model name.
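To make the hypothesis concrete, here is a minimal sketch of the lookup the fallback path would presumably need. This is not Hermes code: the function name resolve_fallback_target and the returned dict shape are hypothetical, and the config is simply the YAML from the reproduction steps loaded as a plain dict.

```python
from typing import Any, Optional


def resolve_fallback_target(config: dict[str, Any]) -> dict[str, Optional[str]]:
    """Resolve the endpoint for fallback_model (hypothetical helper).

    If the fallback provider is listed in custom_providers, use that
    entry's own base_url and api_key instead of the primary provider's.
    """
    fallback = config["fallback_model"]
    name = fallback["provider"]

    for entry in config.get("custom_providers") or []:
        if entry.get("name") == name:
            return {
                "provider": name,
                "model": fallback["model"],
                "base_url": entry["base_url"],
                "api_key": entry.get("api_key"),
            }

    # Assumption: for a provider that is not in custom_providers, leave
    # base_url unset so the caller applies that provider's own defaults,
    # rather than inheriting the primary provider's base_url.
    return {
        "provider": name,
        "model": fallback["model"],
        "base_url": None,
        "api_key": None,
    }


# The configuration from "To Reproduce", as a dict. The api_key is left
# as the unexpanded placeholder; expanding it is out of scope here.
config = {
    "model": {
        "provider": "nous",
        "model": "deepseek/deepseek-v4-pro",
        "base_url": "https://inference-api.nousresearch.com/v1",
    },
    "fallback_model": {"provider": "aliyun-singapore", "model": "qwen3.6-plus"},
    "custom_providers": [
        {
            "name": "aliyun-singapore",
            "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
            "api_key": "${ALIYUN_SINGAPORE_API_KEY}",
            "models": {"qwen3.6-plus": {"context_length": 1000000}},
        }
    ],
}

target = resolve_fallback_target(config)
print(target["base_url"])
```

With a lookup like this, the fallback request would go to the aliyun-singapore base_url rather than the value under model.base_url, which matches the expected behavior described above.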
Workaround
Use custom_openrouter as the fallback provider (OpenRouter resolves model routing itself, so the wrong base_url to Nous is harmless for OpenRouter models):
    fallback_model:
      provider: custom_openrouter
      model: deepseek/deepseek-v4-flash
But this is only a workaround. The root issue remains.
Environment
- Hermes Agent: git HEAD
- Config version: _config_version: 22
- macOS + CLI/TUI mode