Skip to content

fix: strip reasoning_content for OpenAI-compatible providers (Groq, Cerebras)#1

Merged
mathias3 merged 1 commit into
mainfrom
maciej/groq-cerebras-reasoning-content-fix
May 19, 2026
Merged

fix: strip reasoning_content for OpenAI-compatible providers (Groq, Cerebras)#1
mathias3 merged 1 commit into
mainfrom
maciej/groq-cerebras-reasoning-content-fix

Conversation

@mathias3

Copy link
Copy Markdown
Owner

Problem

OpenAI-compatible custom providers (Groq, Cerebras, NVIDIA NIM) reject assistant messages containing reasoning_content or reasoning fields with HTTP 400:

HTTP 400: 'reasoning_content' is unsupported

This occurs when:

  1. History contains reasoning_content from prior thinking-mode providers (DeepSeek, Kimi)
  2. The reasoning field is promoted to reasoning_content unconditionally

Root Cause

In copy_reasoning_content_for_api(), both existing reasoning_content copying and reasoning field promotion were unconditional, ignoring whether the active provider actually supports/requires thinking-mode fields.

Fix

Gate both operations on agent._needs_thinking_reasoning_pad():

  • Existing reasoning_content is only copied for thinking-mode providers (DeepSeek, Kimi)
  • reasoning field promotion only happens for thinking-mode providers
  • Custom providers (Groq, Cerebras, etc.) receive clean messages without unsupported fields

Testing

Added/updated regression tests in test_deepseek_reasoning_content_echo.py:

  • Groq/custom strips existing reasoning_content
  • Groq/custom does not promote reasoning to reasoning_content
  • Cerebras/custom strips both fields ✓
  • DeepSeek/Kimi behavior preserved ✓
pytest tests/run_agent/test_deepseek_reasoning_content_echo.py -q
# 44 passed

Providers Affected

  • Groq (groq/compound, llama-3.3-70b-versatile, etc.)
  • Cerebras (qwen-3-235b-a22b-instruct-2507, etc.)
  • NVIDIA NIM
  • Any custom OpenAI-compatible provider that doesn't support reasoning_content

Fixes

Fixes HTTP 400 errors when using Groq/Cerebras with conversation history from thinking-mode providers.

@mathias3 mathias3 force-pushed the maciej/groq-cerebras-reasoning-content-fix branch from 8812ceb to b2c99af Compare May 18, 2026 10:28
@mathias3 mathias3 force-pushed the maciej/groq-cerebras-reasoning-content-fix branch from b2c99af to 1bf8714 Compare May 18, 2026 10:33
@mathias3 mathias3 merged this pull request into main May 19, 2026
mathias3 pushed a commit that referenced this pull request May 25, 2026
Four findings from Copilot's review on PR NousResearch#22891, all in the AX
elements-array cap added by 22fa1ed:

1. The truncation note ("response truncated to N of M elements") was
   appended unconditionally — including in the som/vision multimodal
   path, whose response carries a screenshot rather than an `elements`
   array. The note described a payload field that wasn't present.
   Moved the note into the AX-text branch where the array actually
   appears.

2. `_format_elements(cap.elements)` ran on the full untrimmed list with
   its own `max_lines=40` cap, so a caller passing `max_elements=10`
   would see summary lines referencing `NousResearch#11..NousResearch#40` even though the JSON
   `elements` array only held #1..NousResearch#10. Format on `visible_elements`
   instead so the summary indices always exist in the response.

3. `_coerce_max_elements` enforced a lower bound but no upper bound,
   so `max_elements=10_000_000` silently disabled the safeguard and
   reintroduced the original context-blow-up. Added a hard cap
   (`_MAX_ALLOWED_MAX_ELEMENTS = 1000`) that clamps oversized values.

4. The schema string said "Default 100" but the property carried no
   `default` field, and claimed `max_elements` had no effect on som/
   vision while the image-missing fallback path can still return an
   elements array. Added `"default": 100`, `"maximum": 1000`, and
   clarified the fallback-path wording.

Each finding gets a regression test:

- test_capture_ax_clamps_oversized_max_elements_to_hard_cap
- test_capture_ax_summary_indices_match_returned_elements
- test_capture_multimodal_summary_omits_truncation_note
- test_schema_max_elements_documents_default_and_upper_bound

Verified with `pytest tests/tools/test_computer_use.py` (53 passed,
including the 5 new cases). Confirmed each new test fails on the
pre-fix code path before applying the production change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mathias3 pushed a commit that referenced this pull request May 28, 2026
…NousResearch#31416)

PR NousResearch#31416 (avoid persisting borrowed credential secrets) added
sanitize_borrowed_credential_payload, which strips access_token from
any auth.json pool entry whose (provider, source) isn't in the
_PERSISTABLE_PROVIDER_SOURCES allowlist.

(copilot, gh_cli) is borrowed (not in the allowlist), so the test
fixture's pre-seeded access_token now gets stripped at load_pool()
time, leaving the pool empty. resolve_target('1') then fails with
'No credential #1. Provider: copilot.'

Fix: align the test with the new contract. At runtime, copilot tokens
are hydrated by resolve_copilot_token() — mock that path so the pool
gets an entry the test can remove. The behavior under test
(suppression of gh_cli + env variants on remove) is unchanged.

CI repro on origin/main HEAD; reproduced locally with stock checkout.
mathias3 pushed a commit that referenced this pull request May 28, 2026
…s reached

After key #1 is marked exhausted the retry still called the API with key #1
due to env-var bias in _get_cached_client / resolve_api_key_provider_credentials.
Fix: peek the pool and pass the active entry's key as explicit_api_key.
Secondary: api_key_hint in mark_exhausted_and_rotate pins the correct entry
under concurrent CLI+gateway calls; _is_payment_error matches GoUsageLimitError;
extract_api_error_context parses "Resets in Xhr Ymin".
mathias3 pushed a commit that referenced this pull request May 28, 2026
…ookies

Mission-control style deploys reverse-proxy the dashboard at a path
prefix (e.g. mission-control.tilos.com/hermes/* -> :9119) and inject
X-Forwarded-Prefix: /hermes on every request. The SPA mount already
honoured this for asset URLs and the bootstrap __HERMES_BASE_PATH__,
but the OAuth gate didn't:

  1. The gate's Location: header to /login and the 401 envelope's
     login_url were built bare ("/login?next=..."). Under a /hermes
     prefix the browser follows that to mission-control.tilos.com/login
     which the proxy doesn't route to the dashboard.
  2. _redirect_uri (the OAuth callback URL handed to the IDP) used
     request.url_for() which doesn't honour X-Forwarded-Prefix
     (Starlette/uvicorn only proxy_headers Host + Proto + For). The
     IDP redirects back to /auth/callback instead of /hermes/auth/
     callback → 404 in the user's browser.
  3. Cookies were set with Path=/ which leaks them to other apps on
     the same origin and won't be sent back on requests under the
     prefix in the first place.

Fix threads the normalised prefix through every boundary:

  * New hermes_cli/dashboard_auth/prefix.py — single source of truth
    for X-Forwarded-Prefix parsing. web_server._normalise_prefix
    becomes a re-export so the SPA mount, the gate, and the cookies
    helper all agree.
  * middleware._unauth_response builds login_url = f"{prefix}/login".
  * routes._redirect_uri splices the prefix into the path component
    of the IDP-bound URL (with full validation of the header).
  * cookies.{set,clear}_{session,pkce}_cookie now take prefix="".
    Path attribute switches to /hermes when set; cookie name switches
    name variant (see below). Every caller passes the request's
    normalised prefix.

Cookie hardening (Teknium's lesser-note #1 in the PR review): adopt
the __Host- / __Secure- cookie name prefixes per draft-west-cookie-
prefixes. The variant is selected from (use_https, prefix):

  * Loopback HTTP → bare "hermes_session_at" (both prefixes require
    Secure, incompatible with HTTP).
  * HTTPS, direct deploy (Path=/) → "__Host-hermes_session_at".
    Strongest spec: bound to exact origin, no Domain attribute, Secure
    required.
  * HTTPS, behind a proxy prefix (Path=/hermes) →
    "__Secure-hermes_session_at". __Host- forbids Path != "/"; the
    explicit Path=/hermes covers same-origin app isolation.

Setter and reader BOTH consult the prefix because the cookie *name*
changes — a reader that looked up the bare name when the setter wrote
__Secure- would never find the value. The reader falls back across
all three variants so a request whose shape changed mid-session (e.g.
post-deploy from no-prefix to /hermes) still picks up the existing
cookie until it expires.

Test coverage:

  - tests/hermes_cli/test_dashboard_auth_prefix.py — new file. 11 tests
    pinning:
      • Location: /hermes/login on the gate's HTML redirect
      • 401 envelope login_url carries the prefix
      • Malformed X-Forwarded-Prefix is ignored (header-injection
        defence; the script-tag value is normalised to empty string)
      • _redirect_uri splices /hermes into the path (the property
        that prevents the IDP-returns-to-404 failure)
      • PKCE cookie uses Path=/hermes + __Secure- when proxied
      • Session cookies use __Host- when direct, __Secure- when
        proxied, bare on loopback HTTP
      • End-to-end round trip with hand-managed PKCE cookie carriage
        (TestClient can't simulate a Path=/hermes cookie automatically)
  - tests/hermes_cli/test_dashboard_auth_cookies.py — rewritten to pin
    each (use_https, prefix) shape produces its expected cookie name,
    plus reader-side coverage that __Host- and __Secure- variants are
    both recognised.
  - Existing tests across middleware / 401-reauth / etc. updated to
    match the new cookie names (substring contains instead of
    startswith).

Mutation-tested: reverting _unauth_response to build the bare
"/login" URL trips exactly the two tests that pin the prefix
carriage, confirming the suite discriminates the regression.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant