fix(models): accept Gemini + Anthropic in gateway /model picker (#12532) by briandevans · Pull Request #12585 · NousResearch/hermes-agent

briandevans · 2026-04-19T14:14:44Z

TL;DR

Gateway /model picker rejects both Gemini and Anthropic models while CLI hermes model works with the same credentials. Two distinct bugs in validate_requested_model:

Gemini: /models returns IDs like models/gemini-2.5-flash — curated list and user input use bare gemini-2.5-flash → set membership drops every model.
Anthropic: generic probe sends Authorization: Bearer without anthropic-version → Anthropic's native /v1/models returns 4xx → fetch_api_models yields None → lands in the generic "couldn't reach API" hard-reject.

Both fixes mirror existing patterns already in this function (Bedrock / Alibaba catalog fall-through for Anthropic; symmetric normalization for the Gemini prefix drift).

Root cause details

Gemini

Gemini's OpenAI-compat endpoint at generativelanguage.googleapis.com/v1beta/openai/models returns IDs in the native Gemini format (models/gemini-2.5-flash). The curated _PROVIDER_MODELS["gemini"] list and the picker UI both use the bare ID. requested_for_lookup in set(api_models) therefore fails for every known Gemini model.

Anthropic

probe_api_models constructs headers at hermes_cli/models.py:1776-1780:

headers: dict[str, str] = {}
if api_key:
    headers["Authorization"] = f"Bearer {api_key}"
if normalized.startswith(COPILOT_BASE_URL):
    headers.update(copilot_default_headers())

Anthropic's API requires x-api-key (for API keys) or Authorization: Bearer with anthropic-version header — never Bearer alone without a version. So the probe silently 4xxs, returning None, and the request lands in the "couldn't reach API" hard-reject branch.

_fetch_anthropic_models (line 1338) already does the right thing — it's called during provider_model_ids("anthropic") for picker population, but never for per-request validation.

Fix

Two surgical additions to validate_requested_model:

1. Gemini prefix strip — before the set-membership check:

if api_models is not None:
    if normalized == "gemini":
        api_models = [
            m[len("models/"):] if isinstance(m, str) and m.startswith("models/") else m
            for m in api_models
        ]
    if requested_for_lookup in set(api_models):
        ...

The normalized list also drives auto-correct + suggestions, so "Similar models: gemini-2.5-flash" surfaces bare IDs the user can actually type — not the useless models/... literals.

2. Anthropic catalog fall-through — after the existing Bedrock block, same pattern as the #12287 Alibaba fix:

if normalized == "anthropic":
    try:
        catalog = provider_model_ids("anthropic")
    except Exception:
        catalog = []
    if catalog:
        if requested in set(catalog) or requested_for_lookup in set(catalog):
            return {"accepted": True, "persist": True, "recognized": True, "message": None}
        suggestions = get_close_matches(requested, catalog, n=3, cutoff=0.4)
        return {
            "accepted": True,
            "persist": True,
            "recognized": False,
            "message": f"Note: `{requested}` was not found in the Anthropic catalog. "
                       f"It may still work if it's a newer model the catalog doesn't list yet. ...",
        }
    # empty / exception → fall through to the generic reject

provider_model_ids("anthropic") already handles live-fetch (via _fetch_anthropic_models with the correct x-api-key + anthropic-version headers) and falls back to the curated static list on failure.

Behaviour matrix

Scenario	Before	After
`/model gemini:gemini-2.5-flash` via picker	REJECT — not in `{models/gemini-2.5-flash, …}`	accept (quiet)
`/model anthropic:claude-opus-4-7` via picker	REJECT — "couldn't reach Anthropic API"	accept (quiet)
Unknown Gemini ID (`gemini-hypothetical`)	reject with `models/…`-leaked suggestions	reject with bare-ID suggestions
Unknown Anthropic model	"couldn't reach Anthropic API"	accept with warning + close-match suggestions
Other provider (`zai`) with unreachable API	reject (unchanged)	reject (unchanged)
Other provider with `models/`-prefixed API response	mismatch on bare ID (unchanged)	mismatch on bare ID (unchanged)
Anthropic catalog empty (import failure)	hard-reject	fall through to hard-reject
Anthropic catalog raises	hard-reject	fall through to hard-reject

Narrow scope — explicitly not changed

probe_api_models auth headers. Still Bearer-only. Teaching the generic probe about Anthropic's header requirements would entangle provider-specific details in the transport layer; the catalog fall-through is less invasive.
Other providers that might return models/-prefixed IDs. The strip is gated on normalized == "gemini". Pinned by a custom canary.
Reporter's Option A (skip validation entirely when the model was chosen from a curated picker). That's a gateway-side refactor of _on_model_selected/switch_model; this PR keeps the fix at the validator layer so it benefits CLI, gateway, and any future caller uniformly.

Regression coverage

Two new test classes, 11 cases:

TestValidateGeminiModelsPrefix (4 cases):

test_bare_id_accepted_despite_models_prefix_from_api — reporter's repro
test_all_curated_gemini_ids_resolve — 4 IDs across pro/flash/flash-lite/preview
test_unknown_gemini_id_surfaces_suggestions_not_generic_reject — pins that suggestions don't leak the models/ prefix
test_prefix_strip_limited_to_gemini_provider — custom provider canary

TestValidateAnthropicNoModelsEndpoint (7 cases):

test_curated_claude_model_accepted — reporter's repro
test_all_tiers_resolve — opus/sonnet/haiku
test_unknown_model_accepted_with_warning
test_empty_catalog_falls_through_to_generic_reject
test_catalog_lookup_exception_falls_through
test_unknown_model_includes_close_match_suggestion
test_other_providers_still_hard_reject_when_api_unreachable — zai canary

6 of 11 fail on clean origin/main (6fb69229) with assert False is True on result["accepted"]. The 5 passing tests pin preserved behaviour (canaries + defensive fall-throughs).

Validation

source venv/bin/activate
python -m pytest \
  tests/hermes_cli/test_model_validation.py::TestValidateGeminiModelsPrefix \
  tests/hermes_cli/test_model_validation.py::TestValidateAnthropicNoModelsEndpoint -q
# 11 passed

Broader model-switch / normalize suites (6 files) → 159 passed, 0 failures.

Relation to prior PRs

This PR uses the exact same fall-through pattern as #12287 (Alibaba DashScope coding endpoint). That pattern in turn mirrors the pre-existing Bedrock branch. Each provider whose /models endpoint is structurally inaccessible gets its own small catalog fall-through — no broad widening of the validator's trust surface.

_{Co-authored via LLM assistance; I've reviewed every line and am responsible for correctness.}

…Research#12532) The gateway ``/model`` picker calls ``validate_requested_model``, which probes the provider's ``/models`` endpoint. Two distinct failures drop Gemini and Anthropic models from that flow: * **Gemini**: the OpenAI-compat endpoint at ``generativelanguage.googleapis.com/v1beta/openai/models`` returns IDs prefixed with ``models/`` (e.g. ``models/gemini-2.5-flash``) — native Gemini-API convention. Our curated list and user input use the bare ID, so the set-membership check drops every known Gemini model. * **Anthropic**: the generic ``probe_api_models`` helper sends ``Authorization: Bearer`` without the ``anthropic-version`` header, so Anthropic's native ``/v1/models`` returns 4xx and ``fetch_api_models`` yields ``None``. The request lands in the generic "could not reach API" hard-reject, even though ``_fetch_anthropic_models`` (with the correct ``x-api-key`` + ``anthropic-version`` headers) works elsewhere in the codebase. Both paths cause the gateway picker to fail while ``hermes model`` (which skips validation) works fine with the same credentials. Reporter: NousResearch#12532. Fix --- Two surgical additions to ``hermes_cli.models.validate_requested_model``: 1. Strip the ``models/`` prefix from the probed listing when ``normalized == "gemini"``, before the set-membership check. The rest of the strict branch (auto-correction, suggestions, reject path) reuses the normalized list — suggestions therefore surface bare IDs the user can actually type. 2. When ``api_models is None`` and ``normalized == "anthropic"``, fall back to ``provider_model_ids("anthropic")``. That helper internally uses ``_fetch_anthropic_models`` with the correct headers and falls back to the curated static list when the live fetch also fails — identical pattern to the existing Bedrock (``#bedrock``) and Alibaba (NousResearch#12272 / PR NousResearch#12287) fall-throughs. Narrow scope — explicitly not changed ------------------------------------- * **``probe_api_models`` auth headers.** Still ``Bearer``-only. Adding Anthropic-specific headers to the generic probe is out of scope; the catalog fall-through is the less invasive fix and keeps the generic probe provider-agnostic. * **Other providers whose /models returns prefixed IDs.** The strip is gated on ``normalized == "gemini"`` so no other provider's behaviour changes. Pinned by a ``custom`` canary test. * **Other providers' hard-reject on unreachable API.** Still reject. Pinned by a ``zai`` canary test. * **Reporter's Option A** (skip validation entirely when the model was chosen from a curated picker). That's a gateway-side refactor (``_on_model_selected`` → ``switch_model``); this PR keeps the fix at the validator layer, which also covers CLI direct invocations and future callers. Regression coverage ------------------- ``tests/hermes_cli/test_model_validation.py`` gets two new classes: * ``TestValidateGeminiModelsPrefix`` (4 cases) — bare ID acceptance, all curated Gemini IDs resolve, unknown IDs surface suggestions that don't leak the ``models/`` prefix, and a canary pinning that the strip is gated on gemini. * ``TestValidateAnthropicNoModelsEndpoint`` (7 cases) — curated Claude model accepted, all three tiers (opus/sonnet/haiku) resolve, unknown models accepted with warning, empty-catalog + exception fall through to the original generic reject, close-match suggestions surface on typos, and a ``zai`` canary preserving the generic reject for other providers. 6 of the 11 fail on clean ``origin/main`` (``6fb69229``) with ``assert False is True`` on ``result["accepted"]`` — the exact reporter symptom. The 5 remaining tests pin preserved behaviour (canaries + defensive fall-throughs). Validation ---------- ``source venv/bin/activate && python -m pytest tests/hermes_cli/test_model_validation.py::TestValidateGeminiModelsPrefix tests/hermes_cli/test_model_validation.py::TestValidateAnthropicNoModelsEndpoint -q`` → **11 passed**. Broader model-switch / normalize suites (6 files) → **159 passed, 0 failures**. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

Fixes gateway /model picker validation for Gemini and Anthropic by aligning model-ID normalization with provider responses and adding provider-specific fallback validation when the generic /models probe is structurally incompatible.

Changes:

Normalize Gemini /models results by stripping the models/ prefix before membership checks and suggestions.
Add an Anthropic-specific “catalog fall-through” path when live /models probing returns None.
Add regression tests covering Gemini prefix normalization and Anthropic fallback behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File	Description
`hermes_cli/models.py`	Updates `validate_requested_model()` to normalize Gemini IDs and to fall back to the Anthropic catalog when `/models` probing fails.
`tests/hermes_cli/test_model_validation.py`	Adds regression tests for the Gemini `models/` prefix mismatch and Anthropic probe failure fallback.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+    if normalized == "anthropic":
+        try:
+            catalog = provider_model_ids("anthropic")
+        except Exception:
+            catalog = []
+        if catalog:


+    def test_unknown_gemini_id_surfaces_suggestions_not_generic_reject(self):
+        """The prefix-strip fix must not accidentally route unknown Gemini
+        models into the generic "couldn't reach API" branch.  The post-
+        strip list must be used for suggestions too."""
+        result = self._validate("gemini-hypothetical")
+        # Strict branch rejects unknown IDs (even after strip) — same as
+        # any other provider whose /models endpoint responded.
+        assert result["accepted"] is False
+        # Suggestions must reference the bare (post-strip) IDs, not the
+        # raw ``models/…`` strings — otherwise the UI would surface
+        # "Similar models: models/gemini-2.5-flash" which is useless
+        # advice because that literal can't be typed into the picker.
+        if "Similar models" in (result["message"] or ""):
+            assert "models/" not in result["message"]
+


+        # Either accepted-as-recognized or accepted-with-suggestions is fine;
+        # the point is that we proceed + offer context.
+        assert result["accepted"] is True


… review (follow-up to NousResearch#12532) Addresses all 3 Copilot inline comments on NousResearch#12585: 1. **Second live network call on the failure path** (line 2216). ``provider_model_ids("anthropic")`` internally calls ``_fetch_anthropic_models`` with a 5s timeout. Calling it from the ``validate_requested_model`` fallback — which is already a failure path the user is waiting on — could stack a second 5s hang after the probe's 5s. Switched to reading ``_PROVIDER_MODELS["anthropic"]`` directly. The static list is the source of truth the picker populates from, so it's guaranteed to contain any ID a user could have selected. No env-based credential discovery, no network call. 2. **Gemini suggestions test was a no-op on the empty branch** (test line 631). The original ``if "Similar models" in …`` guard meant the assertion silently passed whenever suggestions weren't generated. Renamed to ``test_unknown_gemini_id_surfaces_bare_id_suggestions`` and switched to a deliberately-close input (``gemini-2.5-flash-nano``) that reliably fires suggestions at cutoff=0.5 without hitting the auto-correct at cutoff=0.9 — pre-computed the ratio matrix to pick this value. Now asserts (1) ``accepted is False``, (2) "Similar models" present, (3) no ``models/`` leak, (4) a known bare ID is in the list. 3. **Anthropic suggestion test didn't actually check suggestions** (test line 762). The original only asserted ``accepted``. Changed input to ``claude-opus-4-7-preview`` (close to ``claude-opus-4-7`` but not exact) and now asserts (1) accepted, (2) ``recognized is False``, (3) "Similar models" in message, (4) the exact closest match surfaces. Also collapsed the redundant ``test_missing_catalog_falls_through_to_generic_reject`` — ``test_empty_catalog_falls_through_to_generic_reject`` already covers the defensive fall-through, and after the switch to direct ``_PROVIDER_MODELS`` access there's no longer a ``provider_model_ids``-raising path to cover separately. Test plumbing: the Anthropic fixture now patches ``_PROVIDER_MODELS`` via ``patch.dict`` instead of patching ``provider_model_ids``, matching the new code path. Validation ---------- ``source venv/bin/activate && python -m pytest tests/hermes_cli/test_model_validation.py::TestValidateGeminiModelsPrefix tests/hermes_cli/test_model_validation.py::TestValidateAnthropicNoModelsEndpoint -q`` → **10 passed** (was 11; consolidated the redundant catalog test). Broader model-switch / normalize suites (6 files) → **158 passed, 0 failures**. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

briandevans · 2026-04-19T14:40:28Z

Thanks for the catches @copilot-pull-request-reviewer — all 3 addressed in 457ebe11.

1. Second 5s network call on the failure path. You're right — the fallback is already a failure path and stacking another 5s Anthropic probe on top would double the worst-case latency. Switched from provider_model_ids("anthropic") to reading _PROVIDER_MODELS["anthropic"] directly. The static list is the source of truth the picker populates from, so any ID a user could have chosen is guaranteed to be in it; no network, no env-based credential discovery, no timeout risk.

2. Gemini suggestions test was a no-op. Right — the if "Similar models" in … guard silently passed whenever suggestions weren't generated. Pre-computed the ratio matrix and found gemini-2.5-flash-nano as the input that reliably fires suggestions at cutoff=0.5 without hitting auto-correct at cutoff=0.9. Renamed to test_unknown_gemini_id_surfaces_bare_id_suggestions and now asserts:

accepted is False
"Similar models" present in the message
No models/ leak in the suggestions text
A known bare ID (gemini-2.5-flash) is in the list

3. Anthropic close-match suggestion test only checked accepted. Switched input to claude-opus-4-7-preview (close to claude-opus-4-7 but not exact) and now asserts recognized is False, "Similar models" present, and that the exact closest match surfaces.

Also collapsed the now-redundant test_catalog_lookup_exception_falls_through since the direct-dict access can't raise; test_empty_catalog_falls_through_to_generic_reject already covers the defensive fall-through path.

Test fixture now patches _PROVIDER_MODELS via patch.dict instead of provider_model_ids, matching the new code path.

10 passed on branch; broader model-switch suites (6 files) → 158 passed, 0 regressions.

@briandevans

Salvage of the Gemini-specific piece from PR #12585 by @briandevans. Gemini's OpenAI-compat /v1beta/openai/models endpoint returns IDs prefixed with 'models/' (native Gemini-API convention), so set-membership against curated bare IDs drops every model. Strip the prefix before comparison. The Anthropic static-catalog piece of #12585 was subsumed by #12618's _fetch_anthropic_models() branch landing earlier in the same salvage PR. Full branch cherry-pick was skipped because it also carried unrelated catalog-version regressions.

teknium1 · 2026-04-24T12:48:34Z

Thanks @briandevans. The Gemini-prefix piece from your PR landed in #15136 (commit 7f26cea) with your authorship preserved via --author. The Anthropic piece was subsumed by @H-Ali13381's #12618 (commit 2303dd8) which uses _fetch_anthropic_models() directly instead of falling through to provider_model_ids("anthropic") — same outcome, one less indirection.

The branch itself had unrelated catalog-version regressions (old OpenRouter snapshot, missing Xiaomi mimo-v2.5 entries, etc.) so I couldn't cherry-pick it cleanly — the Gemini fix was reapplied directly with --author attribution. Closing as merged.

@briandevans

…2532) Salvage of the Gemini-specific piece from PR NousResearch#12585 by @briandevans. Gemini's OpenAI-compat /v1beta/openai/models endpoint returns IDs prefixed with 'models/' (native Gemini-API convention), so set-membership against curated bare IDs drops every model. Strip the prefix before comparison. The Anthropic static-catalog piece of NousResearch#12585 was subsumed by NousResearch#12618's _fetch_anthropic_models() branch landing earlier in the same salvage PR. Full branch cherry-pick was skipped because it also carried unrelated catalog-version regressions.

@briandevans

…2532) Salvage of the Gemini-specific piece from PR NousResearch#12585 by @briandevans. Gemini's OpenAI-compat /v1beta/openai/models endpoint returns IDs prefixed with 'models/' (native Gemini-API convention), so set-membership against curated bare IDs drops every model. Strip the prefix before comparison. The Anthropic static-catalog piece of NousResearch#12585 was subsumed by NousResearch#12618's _fetch_anthropic_models() branch landing earlier in the same salvage PR. Full branch cherry-pick was skipped because it also carried unrelated catalog-version regressions.

@briandevans

…2532) Salvage of the Gemini-specific piece from PR NousResearch#12585 by @briandevans. Gemini's OpenAI-compat /v1beta/openai/models endpoint returns IDs prefixed with 'models/' (native Gemini-API convention), so set-membership against curated bare IDs drops every model. Strip the prefix before comparison. The Anthropic static-catalog piece of NousResearch#12585 was subsumed by NousResearch#12618's _fetch_anthropic_models() branch landing earlier in the same salvage PR. Full branch cherry-pick was skipped because it also carried unrelated catalog-version regressions.

@briandevans

…2532) Salvage of the Gemini-specific piece from PR NousResearch#12585 by @briandevans. Gemini's OpenAI-compat /v1beta/openai/models endpoint returns IDs prefixed with 'models/' (native Gemini-API convention), so set-membership against curated bare IDs drops every model. Strip the prefix before comparison. The Anthropic static-catalog piece of NousResearch#12585 was subsumed by NousResearch#12618's _fetch_anthropic_models() branch landing earlier in the same salvage PR. Full branch cherry-pick was skipped because it also carried unrelated catalog-version regressions.

@briandevans

…2532) Salvage of the Gemini-specific piece from PR NousResearch#12585 by @briandevans. Gemini's OpenAI-compat /v1beta/openai/models endpoint returns IDs prefixed with 'models/' (native Gemini-API convention), so set-membership against curated bare IDs drops every model. Strip the prefix before comparison. The Anthropic static-catalog piece of NousResearch#12585 was subsumed by NousResearch#12618's _fetch_anthropic_models() branch landing earlier in the same salvage PR. Full branch cherry-pick was skipped because it also carried unrelated catalog-version regressions.

@briandevans

…2532) Salvage of the Gemini-specific piece from PR NousResearch#12585 by @briandevans. Gemini's OpenAI-compat /v1beta/openai/models endpoint returns IDs prefixed with 'models/' (native Gemini-API convention), so set-membership against curated bare IDs drops every model. Strip the prefix before comparison. The Anthropic static-catalog piece of NousResearch#12585 was subsumed by NousResearch#12618's _fetch_anthropic_models() branch landing earlier in the same salvage PR. Full branch cherry-pick was skipped because it also carried unrelated catalog-version regressions.

@briandevans

…2532) Salvage of the Gemini-specific piece from PR NousResearch#12585 by @briandevans. Gemini's OpenAI-compat /v1beta/openai/models endpoint returns IDs prefixed with 'models/' (native Gemini-API convention), so set-membership against curated bare IDs drops every model. Strip the prefix before comparison. The Anthropic static-catalog piece of NousResearch#12585 was subsumed by NousResearch#12618's _fetch_anthropic_models() branch landing earlier in the same salvage PR. Full branch cherry-pick was skipped because it also carried unrelated catalog-version regressions.

Copilot AI review requested due to automatic review settings April 19, 2026 14:14

Copilot started reviewing on behalf of briandevans April 19, 2026 14:15 View session

Copilot AI reviewed Apr 19, 2026

View reviewed changes

briandevans mentioned this pull request Apr 20, 2026

fix(run_agent): preserve dotted Bedrock inference-profile model IDs (#11976) #11992

Closed

13 tasks

cryanskl mentioned this pull request Apr 20, 2026

[Bug]: Anthropic provider OAuth integration has multiple issues with credential management and subscription routing #12905

Closed

1 task

teknium1 mentioned this pull request Apr 24, 2026

fix(models): consolidated validation + picker — anthropic_messages, native Anthropic, Gemini prefix, OpenAI catalog #15136

Merged

teknium1 closed this Apr 24, 2026

teknium1 mentioned this pull request Apr 24, 2026

[Bug]: Regression: /model command validation blocks switching to Gemini models after native adapter update #13186

Closed

1 task

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(models): accept Gemini + Anthropic in gateway /model picker (#12532)#12585

fix(models): accept Gemini + Anthropic in gateway /model picker (#12532)#12585
briandevans wants to merge 2 commits into
NousResearch:mainfrom
briandevans:fix/gemini-anthropic-model-validation

briandevans commented Apr 19, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

briandevans commented Apr 19, 2026

Uh oh!

teknium1 commented Apr 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

briandevans commented Apr 19, 2026

TL;DR

Root cause details

Gemini

Anthropic

Fix

Behaviour matrix

Narrow scope — explicitly not changed

Regression coverage

Validation

Relation to prior PRs

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

briandevans commented Apr 19, 2026

Uh oh!

teknium1 commented Apr 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants