fix: strip reasoning_content for OpenAI-compatible providers (Groq, Cerebras) by mathias3 · Pull Request #1 · mathias3/hermes-agent

mathias3 · 2026-05-18T10:18:50Z

Problem

OpenAI-compatible custom providers (Groq, Cerebras, NVIDIA NIM) reject assistant messages containing reasoning_content or reasoning fields with HTTP 400:

HTTP 400: 'reasoning_content' is unsupported

This occurs when:

History contains reasoning_content from prior thinking-mode providers (DeepSeek, Kimi)
The reasoning field is promoted to reasoning_content unconditionally

Root Cause

In copy_reasoning_content_for_api(), both existing reasoning_content copying and reasoning field promotion were unconditional, ignoring whether the active provider actually supports/requires thinking-mode fields.

Fix

Gate both operations on agent._needs_thinking_reasoning_pad():

Existing reasoning_content is only copied for thinking-mode providers (DeepSeek, Kimi)
reasoning field promotion only happens for thinking-mode providers
Custom providers (Groq, Cerebras, etc.) receive clean messages without unsupported fields

Testing

Added/updated regression tests in test_deepseek_reasoning_content_echo.py:

Groq/custom strips existing reasoning_content ✓
Groq/custom does not promote reasoning to reasoning_content ✓
Cerebras/custom strips both fields ✓
DeepSeek/Kimi behavior preserved ✓

pytest tests/run_agent/test_deepseek_reasoning_content_echo.py -q
# 44 passed

Providers Affected

Groq (groq/compound, llama-3.3-70b-versatile, etc.)
Cerebras (qwen-3-235b-a22b-instruct-2507, etc.)
NVIDIA NIM
Any custom OpenAI-compatible provider that doesn't support reasoning_content

Fixes

Fixes HTTP 400 errors when using Groq/Cerebras with conversation history from thinking-mode providers.

…Groq, Cerebras)

Four findings from Copilot's review on PR NousResearch#22891, all in the AX elements-array cap added by 22fa1ed: 1. The truncation note ("response truncated to N of M elements") was appended unconditionally — including in the som/vision multimodal path, whose response carries a screenshot rather than an `elements` array. The note described a payload field that wasn't present. Moved the note into the AX-text branch where the array actually appears. 2. `_format_elements(cap.elements)` ran on the full untrimmed list with its own `max_lines=40` cap, so a caller passing `max_elements=10` would see summary lines referencing `NousResearch#11..NousResearch#40` even though the JSON `elements` array only held #1..NousResearch#10. Format on `visible_elements` instead so the summary indices always exist in the response. 3. `_coerce_max_elements` enforced a lower bound but no upper bound, so `max_elements=10_000_000` silently disabled the safeguard and reintroduced the original context-blow-up. Added a hard cap (`_MAX_ALLOWED_MAX_ELEMENTS = 1000`) that clamps oversized values. 4. The schema string said "Default 100" but the property carried no `default` field, and claimed `max_elements` had no effect on som/ vision while the image-missing fallback path can still return an elements array. Added `"default": 100`, `"maximum": 1000`, and clarified the fallback-path wording. Each finding gets a regression test: - test_capture_ax_clamps_oversized_max_elements_to_hard_cap - test_capture_ax_summary_indices_match_returned_elements - test_capture_multimodal_summary_omits_truncation_note - test_schema_max_elements_documents_default_and_upper_bound Verified with `pytest tests/tools/test_computer_use.py` (53 passed, including the 5 new cases). Confirmed each new test fails on the pre-fix code path before applying the production change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…NousResearch#31416) PR NousResearch#31416 (avoid persisting borrowed credential secrets) added sanitize_borrowed_credential_payload, which strips access_token from any auth.json pool entry whose (provider, source) isn't in the _PERSISTABLE_PROVIDER_SOURCES allowlist. (copilot, gh_cli) is borrowed (not in the allowlist), so the test fixture's pre-seeded access_token now gets stripped at load_pool() time, leaving the pool empty. resolve_target('1') then fails with 'No credential #1. Provider: copilot.' Fix: align the test with the new contract. At runtime, copilot tokens are hydrated by resolve_copilot_token() — mock that path so the pool gets an entry the test can remove. The behavior under test (suppression of gh_cli + env variants on remove) is unchanged. CI repro on origin/main HEAD; reproduced locally with stock checkout.

…s reached After key #1 is marked exhausted the retry still called the API with key #1 due to env-var bias in _get_cached_client / resolve_api_key_provider_credentials. Fix: peek the pool and pass the active entry's key as explicit_api_key. Secondary: api_key_hint in mark_exhausted_and_rotate pins the correct entry under concurrent CLI+gateway calls; _is_payment_error matches GoUsageLimitError; extract_api_error_context parses "Resets in Xhr Ymin".

…ookies Mission-control style deploys reverse-proxy the dashboard at a path prefix (e.g. mission-control.tilos.com/hermes/* -> :9119) and inject X-Forwarded-Prefix: /hermes on every request. The SPA mount already honoured this for asset URLs and the bootstrap __HERMES_BASE_PATH__, but the OAuth gate didn't: 1. The gate's Location: header to /login and the 401 envelope's login_url were built bare ("/login?next=..."). Under a /hermes prefix the browser follows that to mission-control.tilos.com/login which the proxy doesn't route to the dashboard. 2. _redirect_uri (the OAuth callback URL handed to the IDP) used request.url_for() which doesn't honour X-Forwarded-Prefix (Starlette/uvicorn only proxy_headers Host + Proto + For). The IDP redirects back to /auth/callback instead of /hermes/auth/ callback → 404 in the user's browser. 3. Cookies were set with Path=/ which leaks them to other apps on the same origin and won't be sent back on requests under the prefix in the first place. Fix threads the normalised prefix through every boundary: * New hermes_cli/dashboard_auth/prefix.py — single source of truth for X-Forwarded-Prefix parsing. web_server._normalise_prefix becomes a re-export so the SPA mount, the gate, and the cookies helper all agree. * middleware._unauth_response builds login_url = f"{prefix}/login". * routes._redirect_uri splices the prefix into the path component of the IDP-bound URL (with full validation of the header). * cookies.{set,clear}_{session,pkce}_cookie now take prefix="". Path attribute switches to /hermes when set; cookie name switches name variant (see below). Every caller passes the request's normalised prefix. Cookie hardening (Teknium's lesser-note #1 in the PR review): adopt the __Host- / __Secure- cookie name prefixes per draft-west-cookie- prefixes. The variant is selected from (use_https, prefix): * Loopback HTTP → bare "hermes_session_at" (both prefixes require Secure, incompatible with HTTP). * HTTPS, direct deploy (Path=/) → "__Host-hermes_session_at". Strongest spec: bound to exact origin, no Domain attribute, Secure required. * HTTPS, behind a proxy prefix (Path=/hermes) → "__Secure-hermes_session_at". __Host- forbids Path != "/"; the explicit Path=/hermes covers same-origin app isolation. Setter and reader BOTH consult the prefix because the cookie *name* changes — a reader that looked up the bare name when the setter wrote __Secure- would never find the value. The reader falls back across all three variants so a request whose shape changed mid-session (e.g. post-deploy from no-prefix to /hermes) still picks up the existing cookie until it expires. Test coverage: - tests/hermes_cli/test_dashboard_auth_prefix.py — new file. 11 tests pinning: • Location: /hermes/login on the gate's HTML redirect • 401 envelope login_url carries the prefix • Malformed X-Forwarded-Prefix is ignored (header-injection defence; the script-tag value is normalised to empty string) • _redirect_uri splices /hermes into the path (the property that prevents the IDP-returns-to-404 failure) • PKCE cookie uses Path=/hermes + __Secure- when proxied • Session cookies use __Host- when direct, __Secure- when proxied, bare on loopback HTTP • End-to-end round trip with hand-managed PKCE cookie carriage (TestClient can't simulate a Path=/hermes cookie automatically) - tests/hermes_cli/test_dashboard_auth_cookies.py — rewritten to pin each (use_https, prefix) shape produces its expected cookie name, plus reader-side coverage that __Host- and __Secure- variants are both recognised. - Existing tests across middleware / 401-reauth / etc. updated to match the new cookie names (substring contains instead of startswith). Mutation-tested: reverting _unauth_response to build the bare "/login" URL trips exactly the two tests that pin the prefix carriage, confirming the suite discriminates the regression.

mathias3 force-pushed the maciej/groq-cerebras-reasoning-content-fix branch from 8812ceb to b2c99af Compare May 18, 2026 10:28

fix(agent): strip reasoning_content for OpenAI-compatible providers (…

1bf8714

…Groq, Cerebras)

mathias3 force-pushed the maciej/groq-cerebras-reasoning-content-fix branch from b2c99af to 1bf8714 Compare May 18, 2026 10:33

mathias3 merged this pull request into main May 19, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: strip reasoning_content for OpenAI-compatible providers (Groq, Cerebras)#1

fix: strip reasoning_content for OpenAI-compatible providers (Groq, Cerebras)#1
mathias3 merged 1 commit into
mainfrom
maciej/groq-cerebras-reasoning-content-fix

mathias3 commented May 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

mathias3 commented May 18, 2026

Problem

Root Cause

Fix

Testing

Providers Affected

Fixes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant