Summary
The auxiliary client's "custom endpoint" resolution has three redundant code paths that each inline the same (api_mode → client type) dispatch. Each time someone discovers api_mode: anthropic_messages is broken on one path, a PR patches only that path. After #7648 (path 1) and #15059 (path 2), path 3 is still broken. The deeper issue: the field name and docstring of path 3 (explicit_base_url) hardcode the wrong assumption that "custom == OpenAI-compatible" — so even when api_mode: anthropic_messages is correctly threaded through, the field rejects it by design.
Symptom
auxiliary.<task> config with inline base_url + api_mode: anthropic_messages:
    auxiliary:
      vision:
        provider: custom
        model: "moonshotai/kimi-k2.6"
        base_url: "https://zenmux.ai/api/anthropic"
        api_mode: anthropic_messages
        key_env: ZENMUX_API_KEY
Outbound HTTP captured with httpx spy:
    POST https://zenmux.ai/api/anthropic/chat/completions   ← OpenAI path
    BODY: {"messages":[{"role":"user","content":[{"type":"text",...},
             {"type":"image","source":{"type":"base64",...}}]}],
           "model":"moonshotai/kimi-k2.6","max_tokens":2000}   ← Anthropic body
Anthropic-format body sent to the OpenAI path. The upstream gateway returns 404 invalid_model (or 403/500 depending on which fallback the gateway routes to). A direct curl to the same endpoint with /v1/messages and the same model + key succeeds — so the model and credentials work; only the SDK choice is wrong.
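The mismatch can be made concrete with a small sketch. It assumes, based only on the capture above and the successful curl, that chat-completions requests go to <base_url>/chat/completions and Anthropic Messages requests go to <base_url>/v1/messages. The helper name expected_request_url and the default used when api_mode is missing are illustrative assumptions, not code from the repository.

```python
from typing import Optional

# Request path per api_mode, as observed in this report (assumption:
# a missing api_mode means the OpenAI chat-completions shape).
_REQUEST_PATHS = {
    "chat_completions": "/chat/completions",
    "anthropic_messages": "/v1/messages",
}


def expected_request_url(base_url: str, api_mode: Optional[str]) -> str:
    """Return the URL a correctly dispatched client should POST to."""
    mode = api_mode or "chat_completions"
    return base_url.rstrip("/") + _REQUEST_PATHS[mode]


if __name__ == "__main__":
    base = "https://zenmux.ai/api/anthropic"
    # What the config asks for:
    print(expected_request_url(base, "anthropic_messages"))
    # What path 3 actually produces, because api_mode is ignored:
    print(expected_request_url(base, None))
```

Comparing the two printed URLs against the httpx capture shows that the body matches the first mode while the path matches the second.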
The Three Paths
agent/auxiliary_client.py has three independent code paths that all do the same thing — "given a (base_url, api_key, api_mode), construct the right SDK client" — and each had to be fixed separately:
| # | Path | Triggered by | api_mode dispatch | Status |
|---|------|--------------|-------------------|--------|
| 1 | _try_custom_endpoint() (~L1159) | OPENAI_BASE_URL env / main model.base_url (no explicit_base_url arg) | inline if custom_mode == "anthropic_messages": | ✅ fixed by #7648 / shipped in #12846 |
| 2 | named-custom branch in resolve_provider_client (~L1880) | provider: myrelay + providers.myrelay.{...} | inline if entry_api_mode == "anthropic_messages": | ✅ fixed by #15059 (closes #15033) |
| 3 | explicit-base-url branch in resolve_provider_client (L1822-1853) | auxiliary.<task>.base_url flowed in via explicit_base_url | none; always builds plain OpenAI client | ❌ still broken |

resolve_vision_provider_client (L2236-2247) feeds path 3 unconditionally when an auxiliary.<task> config has a base_url:
    if resolved_base_url:
        client, final_model = resolve_provider_client(
            "custom",
            ...
            explicit_base_url=resolved_base_url,
            api_mode=resolved_api_mode,  # ← param is plumbed through but the
                                         # callee ignores it for anthropic_messages
        )
And the callee (L1822-1853):
    if provider == "custom":
        if explicit_base_url:
            ...
            client = OpenAI(api_key=custom_key, base_url=_clean_base, **extra)
            client = _wrap_if_needed(client, final_model, custom_base)  # only handles codex_responses
            return ...
_wrap_if_needed only checks api_mode == "codex_responses". There is no anthropic_messages branch here — even though the same file's path 1 and path 2 both have one.
Why Patching Path 3 Isn't Enough
Look at the field:
    explicit_base_url: Optional[str] = None,  # docstring: "Optional direct OpenAI-compatible endpoint"
The field name and docstring hardcode the wrong invariant. "custom" providers are not necessarily OpenAI-compatible — they can be Anthropic-format gateways (Zenmux, MiniMax, Zhipu GLM, LiteLLM proxies in anthropic mode), Bedrock-compatible proxies, etc. The auxiliary.<task> config has supported api_mode: anthropic_messages for a while; the field just refuses to honor it.
This is not a one-PR oversight — the codebase structurally encodes "custom = OpenAI" through:
- The field name explicit_base_url (vs. e.g. endpoint_url + separate api_mode)
- The docstring "Optional direct OpenAI-compatible endpoint"
- _wrap_if_needed only doing Codex wrapping (no Anthropic branch)
- OPENAI_API_KEY/OPENAI_BASE_URL env-var fallbacks (path 1), which prime the assumption that this is an OpenAI shape
Even if you patch L1822-1853 to add an anthropic_messages branch tomorrow, you'll have repeated the same per-path patching pattern that #7648 and #15059 already did. The next gateway-format extension (gemini-native, bedrock-converse-style proxies, etc.) will hit the same bug shape on all three paths again.
Sibling Bug: key_env not parsed by _resolve_task_provider_model
The example config above uses key_env: ZENMUX_API_KEY. After path 3 is fixed and the request is correctly sent to /v1/messages, the request fails with 403 access_denied because the x-api-key header is the literal string no-key-required — the ZENMUX_API_KEY env var is never read for auxiliary.<task> configs.
agent/auxiliary_client.py:_resolve_task_provider_model (L2684-2693):
    cfg_api_key = str(task_config.get("api_key", "")).strip() or None
    cfg_api_mode = str(task_config.get("api_mode", "")).strip() or None
    # no equivalent for cfg_key_env
Compare to _get_named_custom_provider (which path 2 uses) — it correctly reads key_env:
    custom_key = custom_entry.get("api_key", "").strip()
    custom_key_env = custom_entry.get("key_env", "").strip()
    if not custom_key and custom_key_env:
        custom_key = os.getenv(custom_key_env, "").strip()
So key_env works in providers: dict / custom_providers: list (path 2), but is silently dropped in auxiliary.<task> configs (path 1 and path 3). Same root cause as the api_mode duplication — three configuration surfaces, each parsing fields independently, drift inevitable.
Suggested Refactor
A single helper in agent/auxiliary_client.py:
    def _build_custom_endpoint_client(
        *, base_url: str, api_key: str, api_mode: Optional[str],
        model: str, async_mode: bool,
    ) -> Tuple[Any, str]:
        """Sole source of truth for 'custom endpoint → SDK client' dispatch.

        Honors api_mode in {None, "chat_completions", "anthropic_messages",
        "codex_responses"}. All three resolution paths must call this rather
        than re-inlining the dispatch.
        """
        ...
Three callers (the three paths) all use it. explicit_base_url either gets renamed to something neutral (endpoint_url) with the OpenAI-only assumption removed from the docstring, or it gets retired in favor of having callers pass through to the helper directly.
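One way the dispatch inside such a helper might look is sketched below. This is not the repository's code: the factories mapping, the ClientFactory alias, and the choice to treat a missing api_mode as "chat_completions" are all assumptions made for illustration. The SDK constructors are injected as callables so the sketch runs without the openai or anthropic packages installed.

```python
from typing import Any, Callable, Dict, Optional, Tuple

# Hypothetical alias: (base_url, api_key) -> SDK client object.
ClientFactory = Callable[[str, str], Any]


def build_custom_endpoint_client(
    *,
    base_url: str,
    api_key: str,
    api_mode: Optional[str],
    model: str,
    factories: Dict[str, ClientFactory],
) -> Tuple[Any, str]:
    """Pick the client factory for api_mode and build the client.

    An unknown api_mode raises instead of silently falling back to the
    OpenAI client, which is the failure mode described in this issue.
    """
    mode = api_mode or "chat_completions"  # assumed default
    try:
        factory = factories[mode]
    except KeyError:
        raise ValueError(f"unsupported api_mode: {mode!r}") from None
    return factory(base_url.rstrip("/"), api_key), model


if __name__ == "__main__":
    # Stand-in factories; real callers would construct SDK clients here.
    demo_factories: Dict[str, ClientFactory] = {
        "chat_completions": lambda url, key: ("openai-style", url),
        "anthropic_messages": lambda url, key: ("anthropic-style", url),
    }
    client, model = build_custom_endpoint_client(
        base_url="https://zenmux.ai/api/anthropic",
        api_key="placeholder",
        api_mode="anthropic_messages",
        model="moonshotai/kimi-k2.6",
        factories=demo_factories,
    )
    print(client, model)
```

With a single lookup table, supporting a new wire format means adding one entry rather than patching each of the three paths.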
For key_env, fold the api_key/key_env resolution into a small helper too, and make _resolve_task_provider_model and _get_named_custom_provider both use it.
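A minimal sketch of that shared key helper follows, assuming the same precedence that _get_named_custom_provider uses above (an inline api_key wins, otherwise the variable named by key_env is read). The name resolve_api_key and the environ parameter are hypothetical, added here so the behavior can be checked without touching the real environment.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    entry: Mapping[str, object],
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the API key for a config entry, or None if none is set.

    Precedence: inline api_key first, then the env var named by key_env.
    """
    inline_key = str(entry.get("api_key") or "").strip()
    if inline_key:
        return inline_key
    key_env = str(entry.get("key_env") or "").strip()
    if key_env:
        return environ.get(key_env, "").strip() or None
    return None


if __name__ == "__main__":
    task_config = {"key_env": "ZENMUX_API_KEY"}
    fake_env = {"ZENMUX_API_KEY": "example-key"}
    print(resolve_api_key(task_config, fake_env))
```

Returning None rather than a placeholder string leaves the caller to decide what to send when no key is configured.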
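Related
- #7648 (path 1, _try_custom_endpoint); fix shipped in #12846, "fix(anthropic): complete third-party Anthropic-compatible provider support"
- #15059 (path 2, named-custom branch in resolve_provider_client); closes #15033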
Reproducer
Repo HEAD: origin/main 755a2804 (2026-04-26). Python 3.11.14, anthropic SDK 0.94.0.
- Configure ~/.hermes/config.yaml auxiliary.vision with the YAML above.
- Set ZENMUX_API_KEY (or any anthropic-format gateway key).
- Run vision_analyze_tool directly, or send a sticker through the gateway to trigger _handle_sticker.
- Observe 404 invalid_model (or 403/500) and outbound POST to <base_url>/chat/completions.
To monkey-patch httpx and capture the actual outbound request:
    import httpx

    _orig = httpx.AsyncClient.send

    async def _spy(self, request, **kw):
        print(f">>> {request.method} {request.url}")
        print(f"    body[:500]: {(request.content or b'')[:500].decode(errors='replace')}")
        return await _orig(self, request, **kw)

    httpx.AsyncClient.send = _spy
Environment
- Hermes Agent: origin/main HEAD 755a2804 (2026-04-26)
- Python 3.11.14
- macOS / Darwin 25.4.0
- Gateway: zenmux.ai (/api/anthropic route, anthropic-format)