[Bug]: 'custom = OpenAI' hardcoded assumption: explicit_base_url path ignores api_mode (sibling of #7661/#15033, plus key_env never parsed in aux config) #16254

@EurFelux

Summary

The auxiliary client's "custom endpoint" resolution has three redundant code paths, each of which inlines the same (api_mode → client type) dispatch. Each time someone discovers that api_mode: anthropic_messages is broken on one path, a PR patches only that path. After #7648 (path 1) and #15059 (path 2), path 3 is still broken. The deeper issue is that the field name and docstring of path 3 (explicit_base_url) hardcode the wrong assumption that "custom == OpenAI-compatible", so even when api_mode: anthropic_messages is correctly threaded through, the field rejects it by design.

Symptom

auxiliary.<task> config with inline base_url + api_mode: anthropic_messages:

auxiliary:
  vision:
    provider: custom
    model: "moonshotai/kimi-k2.6"
    base_url: "https://zenmux.ai/api/anthropic"
    api_mode: anthropic_messages
    key_env: ZENMUX_API_KEY

Outbound HTTP captured with httpx spy:

POST https://zenmux.ai/api/anthropic/chat/completions    ← OpenAI path
BODY: {"messages":[{"role":"user","content":[{"type":"text",...},
                   {"type":"image","source":{"type":"base64",...}}]}],
       "model":"moonshotai/kimi-k2.6","max_tokens":2000}    ← Anthropic body

An Anthropic-format body is sent to the OpenAI chat-completions path. The upstream gateway returns 404 invalid_model (or 403/500, depending on which fallback the gateway routes to). A direct curl to the same endpoint with /v1/messages and the same model + key succeeds, so the model and credentials work; only the SDK choice is wrong.
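
The mismatch in the capture above can be checked mechanically. The following is a minimal sketch and not part of Hermes: endpoint_family, body_family, and is_mismatched are hypothetical helper names, and the heuristics assume only the two wire formats discussed in this issue (chat-completions URLs end in /chat/completions and Anthropic URLs end in /messages; Anthropic image blocks use type "image" with a "source" field, OpenAI image blocks use type "image_url").

```python
import json
from urllib.parse import urlparse


def endpoint_family(url: str) -> str:
    """Classify a request URL by its path suffix (heuristic, hypothetical helper)."""
    path = urlparse(url).path.rstrip("/")
    if path.endswith("/chat/completions"):
        return "openai"
    if path.endswith("/messages"):
        return "anthropic"
    return "unknown"


def body_family(raw_body: bytes) -> str:
    """Classify a JSON request body by the shape of its image blocks (heuristic)."""
    try:
        payload = json.loads(raw_body)
    except ValueError:
        return "unknown"
    if not isinstance(payload, dict):
        return "unknown"
    for message in payload.get("messages", []):
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue  # plain-string content carries no format signal
        for block in content:
            if not isinstance(block, dict):
                continue
            if block.get("type") == "image" and "source" in block:
                return "anthropic"
            if block.get("type") == "image_url":
                return "openai"
    return "unknown"


def is_mismatched(url: str, raw_body: bytes) -> bool:
    """True when the URL and the body clearly belong to different API families."""
    families = {endpoint_family(url), body_family(raw_body)}
    families.discard("unknown")
    return len(families) > 1
```

Fed the URL and body from the httpx capture, is_mismatched would flag the request; the same body posted to a /v1/messages URL would not be flagged.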

The Three Paths

agent/auxiliary_client.py has three independent code paths that all do the same thing — "given a (base_url, api_key, api_mode), construct the right SDK client" — and each had to be fixed separately:

| # | Path | Triggered by | api_mode dispatch | Status |
|---|------|--------------|-------------------|--------|
| 1 | _try_custom_endpoint() (~L1159) | OPENAI_BASE_URL env / main model.base_url (no explicit_base_url arg) | inline if custom_mode == "anthropic_messages": | ✅ fixed by #7648 / shipped in #12846 |
| 2 | named-custom branch in resolve_provider_client (~L1880) | provider: myrelay + providers.myrelay.{...} | inline if entry_api_mode == "anthropic_messages": | ✅ fixed by #15059 (closes #15033) |
| 3 | explicit-base-url branch in resolve_provider_client (L1822-1853) | auxiliary.<task>.base_url flowed in via explicit_base_url | none; always builds plain OpenAI client | still broken |

resolve_vision_provider_client (L2236-2247) feeds path 3 unconditionally when an auxiliary.<task> config has a base_url:

if resolved_base_url:
    client, final_model = resolve_provider_client(
        "custom",
        ...
        explicit_base_url=resolved_base_url,
        api_mode=resolved_api_mode,        # ← param is plumbed through but the
                                           #   callee ignores it for anthropic_messages
    )

And the callee (L1822-1853):

if provider == "custom":
    if explicit_base_url:
        ...
        client = OpenAI(api_key=custom_key, base_url=_clean_base, **extra)
        client = _wrap_if_needed(client, final_model, custom_base)   # only handles codex_responses
        return ...

_wrap_if_needed only checks api_mode == "codex_responses". There is no anthropic_messages branch here — even though the same file's path 1 and path 2 both have one.

Why Patching Path 3 Isn't Enough

Look at the field:

explicit_base_url: Optional[str] = None,    # docstring: "Optional direct OpenAI-compatible endpoint"

The field name and docstring hardcode the wrong invariant. "custom" providers are not necessarily OpenAI-compatible — they can be Anthropic-format gateways (Zenmux, MiniMax, Zhipu GLM, LiteLLM proxies in anthropic mode), Bedrock-compatible proxies, etc. The auxiliary.<task> config has supported api_mode: anthropic_messages for a while; the field just refuses to honor it.

This is not a one-PR oversight — the codebase structurally encodes "custom = OpenAI" through:

  • The field name explicit_base_url (vs. e.g. endpoint_url + separate api_mode)
  • The docstring "Optional direct OpenAI-compatible endpoint"
  • _wrap_if_needed only doing Codex wrapping (no Anthropic branch)
  • OPENAI_API_KEY/OPENAI_BASE_URL env-var fallbacks (path 1) which prime the assumption that this is an OpenAI shape

Even if you patch L1822-1853 to add an anthropic_messages branch tomorrow, you'll have repeated the same per-path patching pattern that #7648 and #15059 already did. The next gateway-format extension (gemini-native, bedrock-converse-style proxies, etc.) will hit the same bug shape on all three paths again.

Sibling Bug: key_env not parsed by _resolve_task_provider_model

The example config above uses key_env: ZENMUX_API_KEY. After path 3 is fixed and the request is correctly sent to /v1/messages, the request fails with 403 access_denied because the x-api-key header is the literal string no-key-required — the ZENMUX_API_KEY env var is never read for auxiliary.<task> configs.

agent/auxiliary_client.py:_resolve_task_provider_model (L2684-2693):

cfg_api_key = str(task_config.get("api_key", "")).strip() or None
cfg_api_mode = str(task_config.get("api_mode", "")).strip() or None
# no equivalent for cfg_key_env

Compare to _get_named_custom_provider (which path 2 uses) — it correctly reads key_env:

custom_key = custom_entry.get("api_key", "").strip()
custom_key_env = custom_entry.get("key_env", "").strip()
if not custom_key and custom_key_env:
    custom_key = os.getenv(custom_key_env, "").strip()

So key_env works in providers: dict / custom_providers: list (path 2), but is silently dropped in auxiliary.<task> configs (path 1 and path 3). This has the same root cause as the api_mode duplication: three configuration surfaces each parse fields independently, so drift is inevitable.

Suggested Refactor

A single helper in agent/auxiliary_client.py:

def _build_custom_endpoint_client(
    *, base_url: str, api_key: str, api_mode: Optional[str],
    model: str, async_mode: bool,
) -> Tuple[Any, str]:
    """Sole source of truth for 'custom endpoint → SDK client' dispatch.

    Honors api_mode in {None, "chat_completions", "anthropic_messages",
    "codex_responses"}. All three resolution paths must call this rather
    than re-inlining the dispatch.
    """
    ...

All three paths would call it. explicit_base_url either gets renamed to something neutral (endpoint_url) with the OpenAI-only assumption removed from the docstring, or it gets retired in favor of having callers pass through to the helper directly.
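
To make the shape of that dispatch concrete, here is a minimal sketch under stated assumptions, not the actual Hermes implementation. ClientSpec is a hypothetical stand-in for a real SDK client so that the example runs without the openai or anthropic packages; the un-prefixed function name and the choice to treat api_mode=None as chat_completions are also assumptions. The point it illustrates is a single table-driven lookup that fails loudly on an unknown api_mode instead of silently building an OpenAI client.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class ClientSpec:
    """Hypothetical stand-in for an SDK client; records which family was chosen."""
    family: str
    base_url: str
    api_key: str
    async_mode: bool


# Single table mapping api_mode to client family. Real code would map each
# mode to a constructor for the matching SDK client or wrapper instead.
_FAMILY_BY_MODE: Dict[str, str] = {
    "chat_completions": "openai",
    "anthropic_messages": "anthropic",
    "codex_responses": "codex",
}


def build_custom_endpoint_client(
    *, base_url: str, api_key: str, api_mode: Optional[str],
    model: str, async_mode: bool = False,
) -> Tuple[ClientSpec, str]:
    """Pick the client family for a custom endpoint from api_mode alone."""
    # Assumption: a missing api_mode means the OpenAI-compatible default.
    mode = api_mode or "chat_completions"
    try:
        family = _FAMILY_BY_MODE[mode]
    except KeyError:
        # Unknown modes raise rather than falling back to OpenAI.
        raise ValueError(f"unsupported api_mode: {mode!r}") from None
    spec = ClientSpec(family=family, base_url=base_url,
                      api_key=api_key, async_mode=async_mode)
    return spec, model
```

With a table like this, supporting another gateway format would mean adding one entry in one place rather than patching each resolution path separately.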

For key_env, fold the api_key/key_env resolution into a small helper too, and make _resolve_task_provider_model and _get_named_custom_provider both use it.
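
A shared key resolver could look roughly like the sketch below. The name resolve_api_key and its environ parameter are hypothetical; the precedence (inline api_key first, then the variable named by key_env) simply mirrors the _get_named_custom_provider snippet quoted earlier in this issue.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    entry: Mapping[str, object],
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the API key for a config entry, or None if none is configured.

    Precedence: an inline api_key wins; otherwise the environment variable
    named by key_env is read. `environ` is injectable so tests need not
    touch the real process environment.
    """
    env = os.environ if environ is None else environ

    inline = str(entry.get("api_key") or "").strip()
    if inline:
        return inline

    key_env = str(entry.get("key_env") or "").strip()
    if key_env:
        # An unset or empty variable is treated as "no key".
        return env.get(key_env, "").strip() or None

    return None
```

Returning None when nothing is configured leaves the decision about any placeholder value (such as no-key-required) to the caller.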

Reproducer

Repo HEAD: origin/main 755a2804 (2026-04-26). Python 3.11.14, anthropic SDK 0.94.0.

  1. Configure ~/.hermes/config.yaml auxiliary.vision with the YAML above.
  2. Set ZENMUX_API_KEY (or any anthropic-format gateway key).
  3. Run vision_analyze_tool directly, or send a sticker through the gateway to trigger _handle_sticker.
  4. Observe 404 invalid_model (or 403/500) and outbound POST to <base_url>/chat/completions.

To monkey-patch httpx and capture the actual outbound request:

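import httpx

# Keep a reference to the original method so the spy can delegate to it.
_orig = httpx.AsyncClient.send

async def _spy(self, request, **kw):
    # Log the method, URL, and first 500 bytes of the body before sending.
    print(f">>> {request.method} {request.url}")
    print(f"    body[:500]: {(request.content or b'')[:500].decode(errors='replace')}")
    return await _orig(self, request, **kw)

# Install the spy for every AsyncClient in this process.
httpx.AsyncClient.send = _spy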

Environment

  • Hermes Agent: origin/main HEAD 755a2804 (2026-04-26)
  • Python 3.11.14
  • macOS / Darwin 25.4.0
  • Gateway: zenmux.ai (/api/anthropic route, anthropic-format)
