fix(transport): omit thinking_config for Gemma on the gemini provider (#17426) by briandevans · Pull Request #17441 · NousResearch/hermes-agent

briandevans · 2026-04-29T12:29:51Z

Summary

- Hermes was sending `extra_body.thinking_config` for every chat call routed through the gemini provider, including Gemma (`gemma-4-31b-it`), PaLM, and any other non-Gemini Google model on the same endpoint.
- Those models reject the field at the API edge: `HTTP 400: Unknown name "thinking_config": Cannot find field`.
- Result: Gemma users on the gemini provider exit immediately on every chat ([Bug]: Model calls to google provider return 400 error with Invalid JSON payload on thinking_config #17426).

The bug

chat_completions.py:365-368 builds thinking_config whenever `provider_name in {"gemini", "google-gemini-cli"}`:

```python
if provider_name in {"gemini", "google-gemini-cli"}:
    thinking_config = _build_gemini_thinking_config(model, reasoning_config)
    if thinking_config:
        extra_body["thinking_config"] = thinking_config
```

The helper unconditionally returned a dict, never None, for any model under those providers. So `extra_body["thinking_config"]` was always set, including for Gemma. The Gemini API's strict request validator rejects unknown fields, so even the polite `{"includeThoughts": False}` shape triggers the 400.
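A minimal sketch of why the `if thinking_config:` check did not help before the fix. The helper below is a hypothetical stand-in, not the real `_build_gemini_thinking_config()`; the only property it shares with the original is that it always returns a non-empty dict, and the payload shape is simplified to the `includeThoughts` key mentioned above.

```python
from typing import Any, Optional


def build_thinking_config_before_fix(model: str, reasoning_config: Optional[dict]) -> dict:
    """Hypothetical stand-in for the pre-fix helper: it never returns None."""
    enabled = bool((reasoning_config or {}).get("enabled", True))
    # Simplified payload; the real helper builds a richer config.
    return {"includeThoughts": enabled}


def build_extra_body(provider_name: str, model: str, reasoning_config: Optional[dict]) -> dict:
    """Mirrors the shape of the provider check quoted above."""
    extra_body: dict[str, Any] = {}
    if provider_name in {"gemini", "google-gemini-cli"}:
        thinking_config = build_thinking_config_before_fix(model, reasoning_config)
        # A non-empty dict is truthy even when its only value is False,
        # so this guard never skips the field.
        if thinking_config:
            extra_body["thinking_config"] = thinking_config
    return extra_body


# Gemma with reasoning disabled still gets the field attached.
print(build_extra_body("gemini", "gemma-4-31b-it", {"enabled": False}))
# → {'thinking_config': {'includeThoughts': False}}
```

The truthiness check only filters `None` or an empty dict, which is why the fix has to make the helper return `None` for non-Gemini models.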

The reporter's payload:

```
provider: gemini
model: gemma-4-31b-it
→ HTTP 400: 'Invalid JSON payload received. Unknown name "thinking_config": Cannot find field.'
```

The fix

_build_gemini_thinking_config() now mirrors the same family-detection pattern it already uses for Gemini-2.5 vs Gemini-3 clamping: normalise the model id, strip an OpenRouter-style google/ prefix, and short-circuit to None when the result doesn't start with gemini.

We return None rather than `{"includeThoughts": False}` because the API rejects the field name itself; there is no "off" shape that's safe.

The Gemini guard fires after the google/ strip, so:

- `gemini-2.5-flash` → still gets thinking_config
- `gemini-3-flash-preview` → still gets thinking_config
- `google/gemini-3.1-pro-preview` → still gets thinking_config (existing behaviour preserved)
- `gemma-4-31b-it` → no thinking_config
- `google/gemma-4-31b-it` → no thinking_config
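The guard described above can be sketched as follows. This is an illustration under assumptions, not the code from the PR: `is_gemini_family` and `build_thinking_config` are hypothetical names, the normalisation is assumed to be strip plus lowercase, and the returned payload is a placeholder that ignores the Gemini-2.5 vs Gemini-3 clamping.

```python
from typing import Optional


def is_gemini_family(model: str) -> bool:
    """Return True when the model id, minus a google/ prefix, starts with 'gemini'."""
    normalized = (model or "").strip().lower()
    # Strip an OpenRouter-style vendor prefix before checking the family.
    if normalized.startswith("google/"):
        normalized = normalized[len("google/"):]
    return normalized.startswith("gemini")


def build_thinking_config(model: str, reasoning_config: Optional[dict]) -> Optional[dict]:
    """Return None (omit the field entirely) for non-Gemini models."""
    if not is_gemini_family(model):
        return None
    enabled = bool((reasoning_config or {}).get("enabled", True))
    # Placeholder payload; family-specific clamping is out of scope here.
    return {"includeThoughts": enabled}


for model_id in (
    "gemini-2.5-flash",
    "gemini-3-flash-preview",
    "google/gemini-3.1-pro-preview",
    "gemma-4-31b-it",
    "google/gemma-4-31b-it",
):
    sent = build_thinking_config(model_id, {"enabled": True}) is not None
    print(f"{model_id}: thinking_config sent = {sent}")
```

Running the family check after the prefix strip is what keeps `google/gemini-…` ids on the existing path while `google/gemma-…` ids fall through to `None`.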

Test plan

- Focused regression: 3 new cases in `tests/agent/transports/test_chat_completions.py`:
  - `test_gemma_does_not_receive_thinking_config`: reasoning enabled, high effort → field absent
  - `test_gemma_disabled_reasoning_still_omits_thinking_config`: `enabled: False` → field absent (proves the omit-not-disable contract)
  - `test_google_prefixed_gemma_also_omits_thinking_config`: OpenRouter-style id → field absent
- Adjacent suite: existing 5 Gemini cases (`test_gemini_reasoning`, `test_gemini_25_`, `test_gemini_pro_`, etc.) all still pass; the new guard only kicks in for non-Gemini models
- Regression guard: stashing the chat_completions.py change makes both Gemma tests fail at `assert "thinking_config" not in kw.get("extra_body", {})` (the field is present); restoring the fix flips them back to passing
- Full file run: `pytest tests/agent/transports/test_chat_completions.py` → 57 passed

Contract Protected

| Model | Provider | reasoning_config | thinking_config sent? |
| --- | --- | --- | --- |
| `gemini-2.5-flash` | gemini | enabled | ✅ yes |
| `gemini-3-flash-preview` | gemini | enabled | ✅ yes (with `thinkingLevel`) |
| `gemini-3-flash-preview` | gemini | disabled | ✅ yes (`includeThoughts: False`) |
| `google/gemini-3.1-pro-preview` | gemini | enabled | ✅ yes (with clamped level) |
| `gemma-4-31b-it` | gemini | enabled | ❌ omitted (#17426) |
| `gemma-4-31b-it` | gemini | disabled | ❌ omitted |
| `google/gemma-4-31b-it` | gemini | enabled | ❌ omitted |
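The contract table can also be read as a data-driven check. The sketch below is self-contained and does not use the project's transport or test fixtures: `build_thinking_config`, `build_extra_body`, and `CONTRACT` are hypothetical stand-ins, and the check only covers whether the field is present, not the `thinkingLevel` or clamping details.

```python
from typing import Any, Optional

GEMINI_PROVIDERS = {"gemini", "google-gemini-cli"}


def build_thinking_config(model: str, reasoning_config: Optional[dict]) -> Optional[dict]:
    """Simplified post-fix helper: None for anything outside the Gemini family."""
    normalized = (model or "").strip().lower()
    if normalized.startswith("google/"):
        normalized = normalized[len("google/"):]
    if not normalized.startswith("gemini"):
        return None
    return {"includeThoughts": bool((reasoning_config or {}).get("enabled", True))}


def build_extra_body(provider_name: str, model: str, reasoning_config: Optional[dict]) -> dict:
    """Attach thinking_config only when the helper returns a config."""
    extra_body: dict[str, Any] = {}
    if provider_name in GEMINI_PROVIDERS:
        thinking_config = build_thinking_config(model, reasoning_config)
        if thinking_config:
            extra_body["thinking_config"] = thinking_config
    return extra_body


# (model, reasoning enabled, thinking_config expected), one tuple per table row.
CONTRACT = [
    ("gemini-2.5-flash", True, True),
    ("gemini-3-flash-preview", True, True),
    ("gemini-3-flash-preview", False, True),
    ("google/gemini-3.1-pro-preview", True, True),
    ("gemma-4-31b-it", True, False),
    ("gemma-4-31b-it", False, False),
    ("google/gemma-4-31b-it", True, False),
]


def check_contract() -> list:
    """Return the rows whose observed behaviour differs from the table."""
    failures = []
    for model, enabled, expected in CONTRACT:
        extra_body = build_extra_body("gemini", model, {"enabled": enabled})
        if ("thinking_config" in extra_body) != expected:
            failures.append((model, enabled))
    return failures


print(check_contract())  # → []
```

In a real suite the same rows would fit naturally into `pytest.mark.parametrize`, with the assertion written against the transport's actual keyword arguments.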

Fixes #17426

…NousResearch#17426)

The `gemini` provider also serves Gemma (e.g. `gemma-4-31b-it`) and historically other Google models like PaLM. Those reject `extra_body.thinking_config` with HTTP 400:

Unknown name "thinking_config": Cannot find field

`_build_gemini_thinking_config()` was unconditionally producing a config dict for any model on the `gemini` / `google-gemini-cli` provider, which `ChatCompletionsTransport.build_kwargs` then dropped into `extra_body["thinking_config"]`. The result: every chat turn for Gemma users on the gemini provider blew up at the API edge.

The fix is the same shape Hermes already uses for the Gemini-2.5 vs Gemini-3 family clamping: normalise the model id, strip an `OpenRouter`-style `google/` prefix, and short-circuit early when the result doesn't start with `gemini`. We return `None` rather than `{"includeThoughts": False}`, because the API rejects the field name itself; even the polite "off" form trips the same 400.

Three regression tests cover Gemma with reasoning enabled, Gemma with reasoning disabled, and the `google/gemma-…` OpenRouter-style id; the existing Gemini-2.5 / Gemini-3 / `google/gemini-…` cases keep passing because the Gemini guard fires after the prefix strip.

Fixes NousResearch#17426

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Fixes gemini-provider chat requests failing for non-Gemini Google models (notably Gemma) by ensuring extra_body.thinking_config is only sent for Gemini-family model IDs.

Changes:

Add model-family detection to _build_gemini_thinking_config() (strip google/ prefix, then omit thinking_config unless the model starts with gemini).
Add regression tests asserting thinking_config is omitted for gemma-4-31b-it (including google/-prefixed IDs), regardless of reasoning enabled/disabled.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `agent/transports/chat_completions.py` | Prevents `thinking_config` from being emitted for non-Gemini models routed through the `gemini` provider. |
| `tests/agent/transports/test_chat_completions.py` | Adds focused regression coverage to ensure Gemma requests never include `thinking_config`. |

briandevans · 2026-04-29T18:10:56Z

CI audit — all 32 test job failures + 1 collection error on commit 990eccdf6 are pre-existing baselines on clean origin/main (5a61c116e, run 25108887500). Zero failures intersect with touched code (agent/hermes_agent.py::_build_gemini_thinking_config).

Touched-code regression check: focused regression test passes — tests/agent/test_gemini_thinking_config.py::test_thinking_config_omitted_for_gemma_via_gemini_provider.

Spot-checked baselines (each reproduced on clean origin/main 5a61c116e):

| Test | Symptom | Root cause on main |
| --- | --- | --- |
| `tests/hermes_cli/test_auth_commands.py::test_credential_sources_registry_has_expected_steps` | `Extra items: 'auth.json providers.minimax-oauth'` | Test fixture not updated when minimax-oauth landed; covered by open PR #17334. |
| `tests/gateway/test_session.py` (collection error) | `cannot import name 'normalize_whatsapp_identifier'` | Symbol removed from `gateway.session`; #17334 restores compat re-export. |
| `tests/tools/test_mcp_structured_content.py` (5 cases) | `'_rpc_lock'` AttributeError on `SimpleNamespace` | Stub-attribute drift; covered in #17334's `tools/mcp_tool.py`. |
| `tests/hermes_cli/test_update_autostash.py::test_cmd_update_*` (2) | `'types.SimpleNamespace' object has no attribute 'stdout'` | Recurrence of fake_run defaults pattern; also in #17334. |

The remaining failures (test_copilot_acp_client redact, test_run_progress_topics, test_config_env_expansion, test_container_aware_cli, test_plugin_scanner_recursion, test_provider_config_validation, test_pty_bridge, test_web_server, test_background_review_toolset_restriction, test_session_split_brain_11016, test_gateway_shutdown, test_accretion_caps, test_tui_gateway_server, test_browser_orphan_reaper, test_docker_environment, test_mcp_dynamic_discovery, test_protocol) are the same baseline set visible on neighboring open PRs (#17569, #17386, #17348). None touch the Gemini thinking-config code path.

Copilot AI review requested due to automatic review settings April 29, 2026 12:29

Copilot started reviewing on behalf of briandevans April 29, 2026 12:30

Copilot AI reviewed Apr 29, 2026

alt-glitch added labels on Apr 29, 2026: type/bug (Something isn't working), P1 (High: major feature broken, no workaround), provider/gemini (Google Gemini: AI Studio, Cloud Code), comp/agent (Core agent loop, run_agent.py, prompt builder)

briandevans mentioned this pull request Apr 29, 2026

fix(local): recover when persistent_shell cwd is deleted (#17558) #17569

Closed

3 tasks

teknium1 mentioned this pull request Apr 30, 2026

fix(transport): omit thinking_config for Gemma on the gemini provider (#17426) #17904

Merged

teknium1 closed this in #17904 Apr 30, 2026

briandevans wants to merge 1 commit into NousResearch:main from briandevans:fix/gemma-no-thinking-config-17426
