fix(tools): strip slash-containing enum values for xAI Responses by gbarany · Pull Request #27907 · NousResearch/hermes-agent

gbarany · 2026-05-18T09:45:17Z

What does this PR do?

Fixes a 100%-broken state for any user on the xAI provider (xai/xai-oauth) who also has the Brave Search MCP enabled. Brave's brave_llm_context and brave_place_search tools advertise an HTTP-header accept parameter as {"type": "string", "enum": ["application/json", "*/*"]}. xAI's /v1/responses schema validator rejects any string enum value containing /, replies with HTTP 200 plus a single SSE event: error frame whose message is the opaque "Invalid arguments passed to the model.", then closes the stream. Hermes retries three times and aborts the turn.

The fix is a new strip_xai_incompatible_enum_values() sanitizer in tools/schema_sanitizer.py that drops any enum constraint containing a slash-bearing string value. It's wired into agent/chat_completion_helpers.build_api_kwargs alongside the existing strip_pattern_and_format call (same xAI gate, same deep-copy-then-strip pattern). Other Responses-compatible backends are not touched.

Type of Change

Bug fix (non-breaking change that fixes an issue)
Tests (adding or improving test coverage)

Changes Made

tools/schema_sanitizer.py — new strip_xai_incompatible_enum_values(tools) helper. Walks both OpenAI-format and Responses-format tool entries. If any string enum value contains /, drops the entire enum constraint (rather than narrowing the list, which would silently change the model-visible value space). Logs malformed (non-list) enums at DEBUG. Emits a sibling DEBUG log when the strip fires.
agent/chat_completion_helpers.py — adds the new sanitizer to the existing xAI tool-schema transform pass, on a deep-copied tool list so the shared registry stays valid for non-xAI turns served by the same process.
tests/tools/test_schema_sanitizer.py — 13 new unit tests (OpenAI format, Responses format, drops whole enum on mixed values, nested schemas, no-op when clean, empty list, array items enum, anyOf union, mixed non-string members, all-stripped with sibling default, malformed enum, caller-deepcopy contract, returned list-reference identity).
tests/run_agent/test_run_agent_codex_responses.py — 3 integration tests verifying _build_api_kwargs strips for xAI agents, leaves slash enums untouched for non-xAI, and that preflight remains provider-neutral.
tests/agent/transports/test_codex_transport.py — renamed test class to TestCodexPreflightProviderNeutral, pinning that preflight does not apply provider policy and does not mutate caller-owned schemas.

How to Test

Reproduce against xAI directly without Hermes:

from openai import OpenAI
client = OpenAI(api_key=XAI_OAUTH_TOKEN, base_url="https://api.x.ai/v1")
gen = client.responses.create(
    model="grok-4.3",
    input=[{"role":"user","content":"hi"}],
    instructions="You are Hermes.",
    store=False, stream=True,
    tools=[{"type":"function","name":"x","description":"x","strict":False,
            "parameters":{"type":"object","properties":{
                "accept":{"type":"string","enum":["application/json","*/*"]}}}}],
)
next(iter(gen))  # first SSE frame is event: error, "Invalid arguments passed to the model."

Drop */* (or any value containing /) from the enum and the same call streams response.completed.

In Hermes itself: enable the Brave Search MCP and pick xai-oauth + grok-4.3 — every turn fails on main before this PR and succeeds after.

Confirmed empirically (any string enum value containing / triggers the rejection):

Enum value	xAI accepts?
`"application/json"`	rejected
`"/"`	rejected
`"text/plain"`	rejected
`"text/*"`	rejected
`"a/b"`	rejected
`"foo*bar"`	accepted
`"foo"`, `"bar"`	accepted

Test runs

pytest tests/tools/test_schema_sanitizer.py -v       # all pass (13 new)
pytest tests/run_agent/test_run_agent_codex_responses.py -k "preflight or xai or build_api_kwargs"
                                                      # all pass (3 new)
pytest tests/tools/test_schema_sanitizer.py \
       tests/run_agent/test_run_agent_codex_responses.py \
       tests/agent/transports/test_codex_transport.py
                                                      # all pass

Checklist

Code

I've read the Contributing Guide
My commit messages follow Conventional Commits (fix(tools):, refactor(tools):, test(tools):)
I searched for existing PRs to make sure this isn't a duplicate
My PR contains only changes related to this fix
I've run the suite on every touched module and all preflight/xAI tests
I've added tests for my changes
I've tested on my platform: macOS 15 (Darwin 25.5.0), Python 3.11, nousresearch/hermes-agent:main

Documentation & Housekeeping

I've updated relevant documentation — N/A (the new sanitizer has a thorough docstring; behavior change is internal)
I've updated cli-config.yaml.example — N/A (no new config keys)
I've updated CONTRIBUTING.md or AGENTS.md — N/A (no workflow/architecture change)
I've considered cross-platform impact — pure Python string manipulation, no platform-specific paths
I've updated tool descriptions/schemas — N/A (no tool added or modified)

Notes for reviewers

The xAI tool-schema policy now lives in one place (chat_completion_helpers.build_api_kwargs) alongside the existing strip_pattern_and_format call — same gate, same deep-copy, same try/except. Preflight remains provider-neutral validation.
On mixed enums (e.g. ["fast", "fast/precise", "precise"]), the entire enum constraint is dropped rather than narrowed to ["fast", "precise"]. Narrowing would silently change the accepted value space for that property in ways a user couldn't see from their schema — dropping the constraint keeps type: string typing, so free-form values still validate, but doesn't pretend the model never knew about the missing values.
Follow-up worth considering separately: surface xAI's event: error frame's message field in the user-visible error along with the offending tool name when available. Today the openai SDK collapses the frame into the generic _StreamErrorEvent summary so neither Hermes nor the user can tell which tool is bad.

Screenshots / Logs

Before (on nousresearch/hermes-agent:main, fresh hello):

API call failed (attempt N/3) error_type=_StreamErrorEvent
provider=xai-oauth base_url=https://api.x.ai/v1 model=grok-4.3
summary=Invalid arguments passed to the model.

Captured wire response (first SSE frame from xAI):

event: error
data: {"sequence_number":0,"type":"error","code":null,
       "message":"Invalid arguments passed to the model.","param":null}

After the patch — same request body succeeds, with the sanitizer logging:

schema_sanitizer: stripped 2 slash-containing enum value(s) from tool schemas (xAI Responses compatibility)

xAI's /v1/responses schema validator rejects any tool-parameter string enum value containing a / character (e.g. "application/json", "*/*", "text/plain"). It returns HTTP 200 with a single SSE "event: error" frame whose message is the opaque "Invalid arguments passed to the model.", then closes the stream, aborting every agent turn. The Brave Search MCP triggers this out of the box: its brave_llm_context and brave_place_search tools advertise an HTTP-header "accept" parameter with enum ["application/json", "*/*"]. Other Responses-compatible backends (OpenAI Codex) accept these enums, so the strip is gated by a new is_xai_responses parameter plumbed from the existing detection in conversation_loop and codex_runtime through the transport into _preflight_codex_api_kwargs. The new strip_xai_incompatible_enum_values helper lives next to strip_pattern_and_format in tools/schema_sanitizer.py and emits a parallel INFO log line when it fires.

… gating - 6 unit tests for the new sanitizer covering OpenAI and Responses tool formats, mixed safe+unsafe enums, nested schemas, and idempotence on clean input. - 2 integration tests on _preflight_codex_api_kwargs verifying the strip runs only when is_xai_responses=True (and is a no-op for openai-codex and other Responses-compatible backends).

Round of fixes from the multi-agent PR review on NousResearch#27907: * Deep-copy normalized_tools before stripping in the xAI branch of _preflight_codex_api_kwargs so the in-place mutation can't leak back through the aliased `parameters` reference into the caller's upstream tool registry (would have affected non-xAI turns served by the same Python process). * Wrap the strip call in try/except + logger.warning so a malformed schema can't crash the request mid-preflight — the request falls back to sending the original tools and xAI's real error surfaces instead of a Hermes-side traceback. * Extract `agent_uses_xai_responses(agent)` in agent.codex_responses_adapter so the predicate stays in sync across conversation_loop.py and codex_runtime.py. Future provider keys land in one place. * Drop the `getattr(agent, '_base_url_hostname', None)` softening — a missing attribute is a real initialization bug and silencing it would hide the very mis-route this PR is meant to fix. * Log non-list (malformed) `enum` values at DEBUG instead of silently ignoring them, and demote the stripped-count log from INFO to DEBUG (fires every turn on affected setups; not a notable event). * Tighten docstring: pin the mutation/return contract and replace the brittle Brave tool-name references with the underlying header-enum shape that triggers the rejection. Tests: * 7 new unit tests covering caller-deepcopy contract, list-reference return, array `items` with enum, anyOf union, mixed-type enum values, all-stripped enum with sibling `default` survival, and the malformed (non-list) enum path. * 4 new helper tests covering `agent_uses_xai_responses` (provider match, hostname-only match, negative providers, missing-attrs tolerance). * 3 transport-wrapper tests pinning that ResponsesApiTransport.preflight_kwargs forwards `is_xai_responses` and that the caller's tool registry is preserved across the call. All previously-passing tests on the touched modules still pass (563 of 577; the 14 unrelated failures in tests/agent/test_anthropic_adapter.py also fail on upstream main from this checkout — they read ambient Claude Code OAuth tokens from the host keychain).

Per the silent-failure-hunter review: getattr fallbacks would silently route a mis-initialized agent (one that skipped agent_init and so never got provider / _base_url_hostname set) as non-xAI, re-triggering the exact opaque-SSE-error bug this PR exists to prevent. Bare attribute access surfaces AttributeError loudly so the real init bug bubbles up. Matches the inline-expression pattern already used in agent/chat_completion_helpers.py:287. Test now asserts pytest.raises(AttributeError) on a Bare() agent stub instead of the previous silent-False behavior.

…boundary Move strip_xai_incompatible_enum_values out of the preflight path and into agent/chat_completion_helpers.build_api_kwargs alongside the existing strip_pattern_and_format call. Both xAI-specific tool-schema transforms now live in the same is_xai_responses block that already existed there, on a deep-copied tool list so the shared registry stays valid for non-xAI turns. Removes the plumbing introduced for the earlier review round: * agent_uses_xai_responses helper in agent/codex_responses_adapter.py * is_xai_responses kwarg on ResponsesApiTransport.preflight_kwargs * is_xai_responses kwarg on _preflight_codex_api_kwargs * derivation + pass-through in conversation_loop.py and codex_runtime.py Preflight is once again provider-neutral validation. xAI compatibility is now owned end-to-end by the request builder, which is the only place that also knows whether the call is bound for xAI. Strip semantics: if any string value in an enum contains "/", drop the entire enum constraint rather than narrowing the list. Preserving the non-slash subset would silently change the model-visible value space for that property; dropping the constraint keeps the type-string typing so free-form values still validate. Tests now exercise the behavior end-to-end through agent._build_api_kwargs instead of probing preflight. The transport-level test class is renamed to TestCodexPreflightProviderNeutral and pins that preflight does not apply provider policy.

…27907) The xAI tool-schema sanitizers (strip_slash_enum, strip_pattern_and_format) mutate their input in place — that's their documented contract. The two call sites (chat_completion_helpers.build_api_kwargs and the auxiliary client) were passing agent.tools straight through, so the first xAI request would permanently strip slash-containing enum constraints and pattern/format keywords from the per-agent tool registry. Effect: any subsequent non-xAI call from the same agent (auxiliary task routed to Anthropic, OpenRouter fallback, mid-session model switch) saw the already-stripped schema with no way for the user to notice from their config. Fix: deepcopy tools_for_api before sanitizing at both call sites. The slash-enum bug itself (xAI 400ing on enums with '/') was fixed earlier by #32443 (Nami4D) — that PR landed the strip but used the sanitizers directly without copying. This salvages #27907's correctness contribution (the deepcopy) while skipping its redundant parallel sanitizer (strip_xai_incompatible_enum_values is functionally equivalent to the existing strip_slash_enum) and its preflight- neutrality argument (we chose model-gated preflight in #32443). 3 new tests in tests/run_agent/test_run_agent_codex_responses.py: - strips_slash_enum_from_outgoing_request — outgoing kwargs has no slash-containing enum values (functional contract preserved). - does_not_mutate_agent_tools — headline #27907 regression. Snapshot agent.tools before build_api_kwargs, assert it survives intact after. Pre-fix this assertion would have caught the mutation. - is_idempotent_across_repeated_calls — three xAI requests in a row each strip cleanly AND don't progressively erode the source schema. 344/344 across tests/agent/test_auxiliary_client.py, tests/agent/transports/test_codex_transport.py, tests/run_agent/test_run_agent_codex_responses.py, and tests/tools/test_schema_sanitizer.py. Co-authored-by: Gabor Barany <barany.gabor@gmail.com>

teknium1 · 2026-05-29T06:30:20Z

Closing — the deepcopy correctness fix from your PR was salvaged into #34416 (#34416) with your authorship preserved (Co-authored-by: trailer on the merge commit 1386a7e4).

What landed:

_copy.deepcopy(tools_for_api) before the sanitizer calls in chat_completion_helpers.build_api_kwargs AND the symmetric site in auxiliary_client.py (which used list(tools) — only a shallow copy — and had the same mutation issue).
3 new tests, including the headline does_not_mutate_agent_tools regression that asserts agent.tools survives intact after a build_api_kwargs call.

What didn't land, and why:

strip_xai_incompatible_enum_values — the existing strip_slash_enum in tools/schema_sanitizer.py is functionally equivalent and was wired up at both call sites by PR fix(xai-oauth): strip service_tier and add gated slash-enum safety-net (#28490) #32443 (@Nami4D) two days before this salvage. We didn't need a parallel implementation.
The preflight-provider-neutrality refactor — PR fix(xai-oauth): strip service_tier and add gated slash-enum safety-net (#28490) #32443 took the opposite design path (model-gated preflight strip as a defense-in-depth safety net). Both designs have merits; we went with the model-gated preflight because it catches future bypass paths that forget to sanitize earlier.

Thank you for the careful empirical work mapping which xAI enum values get rejected (application/json rejected, text/* rejected, foo*bar accepted) and for identifying the mutation as a separate bug from the rejection itself. Without your PR the mutation bug would have stayed on main even post-#32443.

…ousResearch#27907) The xAI tool-schema sanitizers (strip_slash_enum, strip_pattern_and_format) mutate their input in place — that's their documented contract. The two call sites (chat_completion_helpers.build_api_kwargs and the auxiliary client) were passing agent.tools straight through, so the first xAI request would permanently strip slash-containing enum constraints and pattern/format keywords from the per-agent tool registry. Effect: any subsequent non-xAI call from the same agent (auxiliary task routed to Anthropic, OpenRouter fallback, mid-session model switch) saw the already-stripped schema with no way for the user to notice from their config. Fix: deepcopy tools_for_api before sanitizing at both call sites. The slash-enum bug itself (xAI 400ing on enums with '/') was fixed earlier by NousResearch#32443 (Nami4D) — that PR landed the strip but used the sanitizers directly without copying. This salvages NousResearch#27907's correctness contribution (the deepcopy) while skipping its redundant parallel sanitizer (strip_xai_incompatible_enum_values is functionally equivalent to the existing strip_slash_enum) and its preflight- neutrality argument (we chose model-gated preflight in NousResearch#32443). 3 new tests in tests/run_agent/test_run_agent_codex_responses.py: - strips_slash_enum_from_outgoing_request — outgoing kwargs has no slash-containing enum values (functional contract preserved). - does_not_mutate_agent_tools — headline NousResearch#27907 regression. Snapshot agent.tools before build_api_kwargs, assert it survives intact after. Pre-fix this assertion would have caught the mutation. - is_idempotent_across_repeated_calls — three xAI requests in a row each strip cleanly AND don't progressively erode the source schema. 344/344 across tests/agent/test_auxiliary_client.py, tests/agent/transports/test_codex_transport.py, tests/run_agent/test_run_agent_codex_responses.py, and tests/tools/test_schema_sanitizer.py. Co-authored-by: Gabor Barany <barany.gabor@gmail.com>

Gabor Barany added 2 commits May 18, 2026 10:40

alt-glitch added type/bug Something isn't working comp/agent Core agent loop, run_agent.py, prompt builder comp/tools Tool registry, model_tools, toolsets provider/xai xAI (Grok) P3 Low — cosmetic, nice to have labels May 18, 2026

Gabor Barany added 3 commits May 18, 2026 11:13

alt-glitch mentioned this pull request May 18, 2026

fix(xai-responses): strip enum values containing '/' from tool schemas #28021

Closed

3 tasks

teknium1 mentioned this pull request May 29, 2026

fix(xai-sanitize): deepcopy tools_for_api before in-place mutation (#27907) #34416

Merged

teknium1 closed this May 29, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(tools): strip slash-containing enum values for xAI Responses#27907

fix(tools): strip slash-containing enum values for xAI Responses#27907
gbarany wants to merge 5 commits into
NousResearch:mainfrom
gbarany:fix/xai-tool-schema-slash-enum

gbarany commented May 18, 2026 •

edited

Loading

Uh oh!

teknium1 commented May 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

gbarany commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Type of Change

Changes Made

How to Test

Test runs

Checklist

Code

Documentation & Housekeeping

Notes for reviewers

Screenshots / Logs

Uh oh!

teknium1 commented May 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

gbarany commented May 18, 2026 •

edited

Loading