
fix(xai-oauth): strip service_tier and add gated slash-enum safety-net (#28490) #32443

Merged: teknium1 merged 2 commits into main from fix/28490-xai-service-tier on May 27, 2026

@teknium1

Contributor

Summary

xAI Responses API requests no longer 400 with "Argument not supported: service_tier" when /fast (Priority Processing) state lingers from a prior model switch, and the preflight gains a defense-in-depth slash-enum strip gated on Grok model names.

Salvages PR #28490 (@Nami4D) + tightens the slash-enum scope so we don't silently degrade tool-schema constraints on non-xAI providers.

Two fixes

1. service_tier strip in transports/codex.py:build_kwargs

xAI's /v1/responses rejects service_tier outright. On paper, resolve_fast_mode_overrides only emits service_tier for OpenAI fast-eligible models, so it should never reach a Grok request. In practice, two leak paths exist:

  • Stale self.service_tier on the agent that persists across model switches in the same session.
  • Direct agent.service_tier: priority in config.yaml plumbing through unchanged.

Strip defensively when is_xai_responses=True. Native Codex / GitHub Models / etc. keep their service_tier untouched.
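
A minimal sketch of the defensive strip described above. The function name build_request_kwargs and the shape of request_overrides are assumptions made for illustration; this is not the actual body of build_kwargs in transports/codex.py, only the is_xai_responses flag and the service_tier key come from the description.

```python
from typing import Any


def build_request_kwargs(
    request_overrides: dict[str, Any],
    *,
    is_xai_responses: bool,
) -> dict[str, Any]:
    """Hypothetical stand-in for the override-merging step of build_kwargs."""
    # Work on a shallow copy so the caller's overrides stay untouched.
    kwargs = dict(request_overrides)
    if is_xai_responses:
        # xAI's /v1/responses rejects this argument, so drop it if present.
        kwargs.pop("service_tier", None)
    return kwargs


# A stale "priority" tier left over from an earlier /fast session.
stale_overrides = {"service_tier": "priority", "temperature": 0.2}

xai_kwargs = build_request_kwargs(stale_overrides, is_xai_responses=True)
codex_kwargs = build_request_kwargs(stale_overrides, is_xai_responses=False)
```

Using pop with a default keeps the strip a no-op when service_tier was never set, so both leak paths are covered without special-casing either one.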

2. Slash-enum safety-net in codex_responses_adapter._preflight_codex_api_kwargs

xAI rejects tool schemas whose enum keyword carries slash-containing values (for example Qwen/Qwen3.5-0.8B or openai/gpt-oss-20b, which commonly come from MCP servers). The main agent loop (chat_completion_helpers.py:483-490) and the auxiliary client (auxiliary_client.py:710-719) both already sanitize, but any future code path that bypasses them would hit the 400 again with nothing upstream to catch it.

The preflight runs on every codex_responses request, so adding the strip there catches any future bypass. The strip is deliberately gated on the model name pattern (grok- / x-ai/grok-) because:

  • Native OpenAI Codex DOES accept slash-containing enums.
  • Stripping the enum keyword removes the constraint entirely; the model can then generate any string.
  • An un-gated strip would silently degrade tool-schema correctness on every codex_responses provider that isn't xAI.
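
The gating logic above can be sketched as follows. Everything here is an illustrative stand-in: is_grok_model, drop_slash_enums, and preflight_tools are hypothetical names, the prefix check is only one plausible reading of the "grok- / x-ai/grok-" pattern, and the model names in the usage lines are placeholders. The project's real strip_slash_enum may differ in detail.

```python
import copy
from typing import Any

# Prefixes taken from the pattern named in the PR description.
GROK_PREFIXES = ("grok-", "x-ai/grok-")


def is_grok_model(model: str) -> bool:
    """Assumed gate: treat the pattern as a simple prefix match."""
    return model.startswith(GROK_PREFIXES)


def drop_slash_enums(node: Any) -> None:
    """Delete, in place, every `enum` list that holds a slash-containing string."""
    if isinstance(node, dict):
        enum = node.get("enum")
        if isinstance(enum, list) and any(
            isinstance(value, str) and "/" in value for value in enum
        ):
            del node["enum"]  # removes the constraint entirely
        for child in node.values():
            drop_slash_enums(child)
    elif isinstance(node, list):
        for item in node:
            drop_slash_enums(item)


def preflight_tools(model: str, tools: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return tools unchanged for non-Grok models, sanitized copies for Grok."""
    if not is_grok_model(model):
        return tools
    sanitized = copy.deepcopy(tools)
    drop_slash_enums(sanitized)
    return sanitized


tools = [{
    "type": "function",
    "name": "load_model",
    "parameters": {
        "type": "object",
        "properties": {
            "model_id": {
                "type": "string",
                "enum": ["Qwen/Qwen3.5-0.8B", "openai/gpt-oss-20b"],
            },
        },
    },
}]

grok_tools = preflight_tools("grok-example", tools)
aggregator_tools = preflight_tools("x-ai/grok-example", tools)
other_tools = preflight_tools("gpt-example", tools)
```

Only the two Grok-named calls lose the enum; the third call returns the schema with its constraint intact, which is the behaviour the gate is meant to protect.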

Changes vs. @Nami4D's original

  1. Resolved a merge conflict in transports/codex.py (a timeout-forwarding block landed on main between the PR's branch point and now).
  2. Added the model-name gate around the slash-enum strip (the original PR stripped on all codex_responses requests).
  3. Six new regression tests covering both fixes and both directions (strip-applies-to-xAI + strip-doesn't-apply-elsewhere).

Validation

scripts/run_tests.sh tests/agent/transports/test_codex_transport.py
=== 51 tests passed, 0 failed in 1.1s ===
| Test class | Tests | Locks in |
| --- | --- | --- |
| TestCodexTransportXaiServiceTierStrip | 3 | xAI strips service_tier; native Codex + GitHub Models keep it |
| TestPreflightSlashEnumStrip | 3 | Grok + aggregator-prefixed Grok strip slash enums; non-Grok models preserve them |

Credit

@Nami4D diagnosed both bugs against api.x.ai with 23 request variants, identified the right intervention points, and shipped a working patch. Salvage just adds the model-name gate to the second fix (to avoid affecting non-xAI providers) and locks the contract in with tests.

Nami4D and others added 2 commits May 25, 2026 23:21
…r slash enums

xAI's /v1/responses endpoint rejects service_tier with HTTP 400
"Argument not supported: service_tier" when users activate /fast mode.

Also add a safety-net strip_slash_enum call in _preflight_codex_api_kwargs
to catch any tool schemas that might slip through the caller-level
sanitization. xAI's Responses API grammar compiler rejects enum values
containing forward slashes (e.g. HuggingFace model IDs like
"Qwen/Qwen3.5-0.8B") with the opaque "Invalid arguments passed to the
model" error.

Fixes the root cause of "Invalid arguments passed to the model" errors
reported by xAI OAuth (SuperGrok) users.
…tests (#28490)

Three additions on top of @Nami4D's original patch:

1. Gate the preflight slash-enum strip on the model name pattern
   (grok-* / x-ai/grok-*).  The original PR stripped slash-containing
   enum values from every codex_responses request, but native Codex
   (OpenAI) and GitHub Models DO accept slash enums — stripping them
   there would silently degrade tool-schema constraints.  xAI is the
   only Responses-API surface that rejects the shape.

2. Resolve the merge conflict in agent/transports/codex.py by
   preserving both the timeout-forwarding block that landed on main
   between the PR's branch point and now AND the new service_tier
   strip.  Behavioural intent of both is preserved.

3. Six new tests in tests/agent/transports/test_codex_transport.py
   covering:
   - TestCodexTransportXaiServiceTierStrip (3 tests): xAI strips
     service_tier from request_overrides; non-xAI codex_responses
     and GitHub Models both KEEP service_tier (regression guards
     so the strip stays xAI-only).
   - TestPreflightSlashEnumStrip (3 tests): Grok and aggregator-
     prefixed Grok model names both trigger the safety-net strip;
     non-Grok models preserve slash enums as a regression guard
     against the strip becoming too broad.

51/51 in tests/agent/transports/test_codex_transport.py.

Co-authored-by: Nami4D <hello@nami4d.tech>
@github-actions

Contributor

🔎 Lint report: fix/28490-xai-service-tier vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9355 on HEAD, 9355 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 4953 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@alt-glitch added labels on May 26, 2026: type/bug (Something isn't working); P3 (Low: cosmetic, nice to have); provider/xai (xAI (Grok)); comp/agent (Core agent loop, run_agent.py, prompt builder)
@alt-glitch

Collaborator

Salvages #28490 by @Nami4D — tightens the slash-enum strip scope to be gated on Grok model names only (avoiding silent degradation on non-xAI providers). Also adds defensive service_tier strip for xAI Responses API. Extends merged #28122.

@teknium1 merged commit b4eea18 into main on May 27, 2026 (26 checks passed)
@teknium1 deleted the fix/28490-xai-service-tier branch on May 27, 2026 12:25
teknium1 pushed a commit that referenced this pull request May 29, 2026
…27907)

The xAI tool-schema sanitizers (strip_slash_enum, strip_pattern_and_format)
mutate their input in place — that's their documented contract. The two
call sites (chat_completion_helpers.build_api_kwargs and the auxiliary
client) were passing agent.tools straight through, so the first xAI
request would permanently strip slash-containing enum constraints and
pattern/format keywords from the per-agent tool registry.

Effect: any subsequent non-xAI call from the same agent (auxiliary task
routed to Anthropic, OpenRouter fallback, mid-session model switch) saw
the already-stripped schema with no way for the user to notice from
their config.

Fix: deepcopy tools_for_api before sanitizing at both call sites.
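
A rough sketch of why the copy matters, assuming an in-place sanitizer. strip_in_place and build_tools_for_api are hypothetical stand-ins, not the project's strip_pattern_and_format or build_api_kwargs; the tool schema is invented for the example.

```python
import copy
from typing import Any


def strip_in_place(node: Any, keywords: tuple[str, ...] = ("pattern", "format")) -> None:
    """Stand-in in-place sanitizer: removes string-valued keywords recursively."""
    if isinstance(node, dict):
        for key in keywords:
            # Only string values are schema keywords here; a property that
            # happens to be named "format" maps to a dict and is left alone.
            if isinstance(node.get(key), str):
                del node[key]
        for child in node.values():
            strip_in_place(child, keywords)
    elif isinstance(node, list):
        for item in node:
            strip_in_place(item, keywords)


def build_tools_for_api(
    agent_tools: list[dict[str, Any]], *, is_xai: bool
) -> list[dict[str, Any]]:
    """Hypothetical call site: sanitize a deep copy, never the registry itself."""
    if not is_xai:
        return agent_tools
    tools_for_api = copy.deepcopy(agent_tools)
    strip_in_place(tools_for_api)
    return tools_for_api


agent_tools = [{
    "name": "fetch",
    "parameters": {
        "type": "object",
        "properties": {"url": {"type": "string", "format": "uri"}},
    },
}]
snapshot = copy.deepcopy(agent_tools)

first = build_tools_for_api(agent_tools, is_xai=True)
second = build_tools_for_api(agent_tools, is_xai=True)
later_non_xai = build_tools_for_api(agent_tools, is_xai=False)
```

Without the deepcopy, the first xAI call would delete format from agent_tools itself, and later_non_xai would silently receive the degraded schema.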

The slash-enum bug itself (xAI 400ing on enums with '/') was fixed
earlier by #32443 (Nami4D) — that PR landed the strip but used the
sanitizers directly without copying. This salvages #27907's correctness
contribution (the deepcopy) while skipping its redundant parallel
sanitizer (strip_xai_incompatible_enum_values is functionally
equivalent to the existing strip_slash_enum) and its preflight-
neutrality argument (we chose model-gated preflight in #32443).

3 new tests in tests/run_agent/test_run_agent_codex_responses.py:

- strips_slash_enum_from_outgoing_request — outgoing kwargs has no
  slash-containing enum values (functional contract preserved).
- does_not_mutate_agent_tools — headline #27907 regression. Snapshot
  agent.tools before build_api_kwargs, assert it survives intact
  after. Pre-fix this assertion would have caught the mutation.
- is_idempotent_across_repeated_calls — three xAI requests in a row
  each strip cleanly AND don't progressively erode the source schema.

344/344 across tests/agent/test_auxiliary_client.py,
tests/agent/transports/test_codex_transport.py,
tests/run_agent/test_run_agent_codex_responses.py, and
tests/tools/test_schema_sanitizer.py.

Co-authored-by: Gabor Barany <barany.gabor@gmail.com>
KKT-OPT pushed a commit to KKT-OPT/hermes-agent that referenced this pull request May 31, 2026