Problem Description
While developing the Gemini Image Gen plugin for Hermes, the system automatically switched the active Provider/Model after a configuration update and Gateway restart.
Steps to Reproduce
- Complete development and validation of the Gemini Image Gen plugin
- Register the plugin and provide the Google API Key using natural language; Hermes confirms the key was written to .env and prompts for a Gateway restart
- Execute systemctl restart hermes-gateway
- After the Gateway restarts, the system does more than load the new API Key: it unexpectedly switches the default Provider or Model
Error Log
ERROR: API call failed after 3 retries. HTTP 404: 404 page not found
provider=openrouter model=anthropic/claude-sonnet-4-6-20250514
Root Cause Analysis
- TaskRouter Auto-Routing: TaskRouter (task_router.py) automatically selects the best Provider/Model
- Wrong Routing Target: TaskRouter's configuration routed requests to OpenRouter's claude-sonnet-4-6-20250514 model
- Missing API Key: No OpenRouter API Key was configured, causing the 404 failure
- Silent Failure: The 404 was caught by error handling, triggering fallback/retry instead of alerting the user
Root Cause
A Provider being "configured" and "available" are two different things:
- TaskRouter believes openrouter is available → routes to it
- openrouter has no API key → silent 404 failure
- Agent continues with wrong configuration, triggering fallback/switch
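The gap between "configured" and "available" can be made explicit in code. The following is a minimal sketch, not Hermes' actual implementation: the `ProviderStatus` type, the `provider_status` helper, and the `<PROVIDER>_API_KEY` naming convention are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderStatus:
    name: str
    configured: bool  # the provider appears in the routing list
    available: bool   # configured AND a non-empty API key is present


def env_var_for(provider):
    """Assumed convention: 'minimax-cn' maps to 'MINIMAX_CN_API_KEY'."""
    return provider.upper().replace("-", "_") + "_API_KEY"


def provider_status(name, routing_list, env=None):
    """Report both properties separately so callers cannot conflate them."""
    env = os.environ if env is None else env
    configured = name in routing_list
    has_key = bool(env.get(env_var_for(name), "").strip())
    return ProviderStatus(name, configured, configured and has_key)
```

With a routing list that contains openrouter but an environment that lacks its key, this returns `configured=True, available=False`, which is the state described in this issue.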
Current Fix (Implemented)
1. agent/auxiliary_client.py — Disabled OpenRouter fallback
python
("openrouter", try_openrouter), # DISABLED
2. agent/task_router.py — Replaced all Provider lists
"openrouter" → "minimax-cn"
openrouter, → minimax-cn,
Prevention Recommendations
1. Gateway Startup: Validate Provider Availability
python
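import logging
import os
logger = logging.getLogger(__name__)
for provider in ['openrouter', 'gemini', 'minimax-cn', 'anthropic']:
    # Assumes keys are named <PROVIDER>_API_KEY, with hyphens mapped to underscores
    key = os.environ.get(f"{provider.upper().replace('-', '_')}_API_KEY")
    if not key:
        logger.warning(f"Provider '{provider}' has no API key; excluding it from routing")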
2. TaskRouter: Filter Providers Without Valid Keys
python
def _get_available_providers():
    # Only route to providers whose API key passes _has_valid_key
    return [p for p in ALL_PROVIDERS if _has_valid_key(p)]
3. Validate Key Immediately After Writing to .env
bash
hermes doctor --test-key google
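Before any network-level key test, a cheaper first check is to confirm that the key actually landed in .env with a non-empty value. The sketch below assumes a simple KEY=VALUE file format and a variable named `GOOGLE_API_KEY`; both are assumptions, and the helper names are hypothetical.

```python
def parse_env(text):
    """Parse simple KEY=VALUE lines; blank lines and # comments are ignored."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding single or double quotes from the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def key_was_written(text, var_name):
    """True when var_name is present in the text with a non-empty value."""
    return bool(parse_env(text).get(var_name))
```

A caller could read the file contents and pass them in, for example `key_was_written(open(".env").read(), "GOOGLE_API_KEY")`, and only prompt for a Gateway restart when the result is true.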
4. Improve Prompt Messages
After writing to .env, suggest key validation before prompting restart.
5. Distinguish 404 from Missing Key vs. Resource Not Found
Surface "API key missing" errors to the user immediately instead of triggering retry/fallback.
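One way to keep a missing key from being mistaken for a retryable failure is to check for the key before any request is sent. This is a hedged sketch, not the existing Hermes retry logic: `MissingApiKeyError`, `call_with_retries`, and the choice of `ConnectionError` as the only retryable error are assumptions for illustration.

```python
class MissingApiKeyError(RuntimeError):
    """Raised before any request is sent when the provider has no key."""


def call_with_retries(provider, api_key, send, retries=3):
    """Call send() up to `retries` times, but fail fast on a missing key."""
    if not api_key:
        # Not retryable and not a reason to fall back: tell the user directly.
        raise MissingApiKeyError(
            f"Provider '{provider}' has no API key configured"
        )
    last_error = None
    for _ in range(retries):
        try:
            return send()
        except ConnectionError as exc:  # only transient errors are retried
            last_error = exc
    raise RuntimeError(f"{provider}: failed after {retries} retries") from last_error
```

Because the key check happens first, a provider without a key never reaches the retry loop, so the error shown to the user names the real cause rather than a 404.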
Labels: bug, provider, configuration