Bug Description
After using /model kimi-for-coding in an interactive session, the first turn works correctly, but the second turn fails with 404 because _ensure_runtime_credentials() silently overwrites api_mode from anthropic_messages back to chat_completions.
Steps to Reproduce
- Start a session with any provider (e.g., MiniMax)
- Run /model kimi-for-coding
- Send a message — it works
- Send a second message — 404 error
Root Cause
In cli.py, _ensure_runtime_credentials() calls resolve_runtime_provider(), which returns api_mode='chat_completions' as the default for kimi-coding. This value is then unconditionally written to self.api_mode at line 3256:
# cli.py ~L3221
resolved_api_mode = runtime.get("api_mode", self.api_mode)
# ...
# cli.py ~L3256
self.api_mode = resolved_api_mode
The /model command had correctly set api_mode="anthropic_messages", but the per-turn credential refresh destroys it.
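The overwrite can be reproduced in isolation with a minimal sketch. Everything below is a simplified stand-in, not the actual cli.py code: the Session class, switch_model(), and the body of resolve_runtime_provider() are assumptions that only mirror the behavior described in this report.

```python
# Hypothetical stand-ins: names mirror the issue, bodies are simplified assumptions.
def resolve_runtime_provider(provider):
    # Assumed behavior per this report: the resolver returns the provider default
    # and knows nothing about the mode chosen via /model.
    return {"provider": provider, "api_mode": "chat_completions"}


class Session:
    def __init__(self):
        self.provider = "minimax"
        self.api_mode = "chat_completions"

    def switch_model(self, provider, api_mode):
        # What /model is described as doing: set the transport explicitly.
        self.provider = provider
        self.api_mode = api_mode

    def _ensure_runtime_credentials(self):
        runtime = resolve_runtime_provider(self.provider)
        resolved_api_mode = runtime.get("api_mode", self.api_mode)
        # Unconditional write: the explicit choice is lost here.
        self.api_mode = resolved_api_mode


session = Session()
session.switch_model("kimi-coding", "anthropic_messages")
mode_after_switch = session.api_mode

session._ensure_runtime_credentials()
mode_after_refresh = session.api_mode

print(mode_after_switch, "->", mode_after_refresh)
```

Note that the fallback in runtime.get("api_mode", self.api_mode) never applies here, because the resolver always supplies the key.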
Expected Behavior
When a user explicitly switches to a model that requires a specific transport mode (e.g., kimi-for-coding → anthropic_messages), that transport mode should persist across turns.
Actual Behavior
api_mode reverts to chat_completions on every turn after _ensure_runtime_credentials(), causing OpenAI-format traffic to be sent to an Anthropic-format endpoint. Kimi returns 404.
Environment
- Hermes Agent: latest main (as of 2026-04-29)
- Provider: kimi-coding
- Model: kimi-for-coding
- Transport: anthropic_messages
Local Workaround
We patched cli.py with a provider-specific transport lock after line 3262 and in _resolve_turn_agent_config():
if self.provider == "kimi-coding" or self.model == "kimi-for-coding":
    self.provider = "kimi-coding"
    self.model = "kimi-for-coding"
    self.base_url = "https://api.kimi.com/coding/"
    self.api_mode = "anthropic_messages"
    self._explicit_base_url = "https://api.kimi.com/coding/"
This works but is a hack. A proper fix would be for resolve_runtime_provider() to preserve an explicitly-configured api_mode when the provider/model match, or for _ensure_runtime_credentials() to treat api_mode as sticky.
Impact
This is not Kimi-specific. Any future provider requiring a non-default api_mode (e.g., Codex codex_responses, or other Anthropic-adapter endpoints) will hit the same pattern. The failure is silent — users see 404s and assume the endpoint is broken.
Suggested Fix
Option A: resolve_runtime_provider() should accept an explicit_api_mode parameter and return it when the provider/model match.
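A rough sketch of what Option A might look like. The real signature and data source of resolve_runtime_provider() are not shown in this issue, so the DEFAULT_API_MODES table and the explicit_provider parameter are hypothetical, and "match" is interpreted here as "the explicit mode was chosen for the provider being resolved".

```python
# Hypothetical default table; the real resolver's lookup is not shown in this issue.
DEFAULT_API_MODES = {"kimi-coding": "chat_completions"}


def resolve_runtime_provider(provider, explicit_api_mode=None, explicit_provider=None):
    """Return runtime settings, preferring an explicit api_mode for the same provider."""
    api_mode = DEFAULT_API_MODES.get(provider, "chat_completions")
    # Honor the explicit mode only when it was chosen for the provider being resolved.
    if explicit_api_mode is not None and explicit_provider == provider:
        api_mode = explicit_api_mode
    return {"provider": provider, "api_mode": api_mode}


# Explicit mode set for kimi-coding is preserved when resolving kimi-coding.
sticky = resolve_runtime_provider("kimi-coding", "anthropic_messages", "kimi-coding")
# An explicit mode left over from another provider is ignored.
stale = resolve_runtime_provider("minimax", "anthropic_messages", "kimi-coding")
# Without an explicit mode, the default is returned as before.
default = resolve_runtime_provider("kimi-coding")
```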
Option B: _ensure_runtime_credentials() should compare the resolved api_mode against the current self.api_mode and only overwrite when the model/provider has actually changed.
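Option B could be sketched roughly as follows. This is an illustration under assumptions, not a patch against cli.py: the Session class, the _api_mode_owner field, and passing the resolved runtime dict in as an argument are all hypothetical simplifications.

```python
class Session:
    def __init__(self, provider, model, api_mode):
        self.provider = provider
        self.model = model
        self.api_mode = api_mode
        # Hypothetical field: the (provider, model) pair api_mode was last set for.
        self._api_mode_owner = (provider, model)

    def switch_model(self, provider, model, api_mode):
        # Stand-in for /model: records which pair the explicit mode belongs to.
        self.provider, self.model, self.api_mode = provider, model, api_mode
        self._api_mode_owner = (provider, model)

    def _ensure_runtime_credentials(self, runtime):
        resolved_provider = runtime.get("provider", self.provider)
        resolved_model = runtime.get("model", self.model)
        resolved_api_mode = runtime.get("api_mode", self.api_mode)
        # Overwrite api_mode only when the provider/model actually changed;
        # otherwise the mode chosen via /model stays in place.
        if (resolved_provider, resolved_model) != self._api_mode_owner:
            self.provider, self.model = resolved_provider, resolved_model
            self.api_mode = resolved_api_mode
            self._api_mode_owner = (resolved_provider, resolved_model)


session = Session("minimax", "example-model", "chat_completions")
session.switch_model("kimi-coding", "kimi-for-coding", "anthropic_messages")

# Per-turn refresh for the same provider/model: the explicit mode survives.
session._ensure_runtime_credentials(
    {"provider": "kimi-coding", "model": "kimi-for-coding", "api_mode": "chat_completions"}
)
kept_mode = session.api_mode

# Refresh that resolves to a different provider/model: the resolved mode applies.
session._ensure_runtime_credentials(
    {"provider": "minimax", "model": "example-model", "api_mode": "chat_completions"}
)
changed_mode = session.api_mode
```

Tracking the owning pair rather than comparing api_mode values alone avoids special-casing any one provider, which is the drawback of the local workaround.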