Skip to content

fix(agent): preserve reasoning_content for Xiaomi MiMo thinking mode (#24443)#24603

Open
NilsR0711 wants to merge 2 commits into
NousResearch:mainfrom
NilsR0711:fix/mimo-reasoning-content-echo
Open

fix(agent): preserve reasoning_content for Xiaomi MiMo thinking mode (#24443)#24603
NilsR0711 wants to merge 2 commits into
NousResearch:mainfrom
NilsR0711:fix/mimo-reasoning-content-echo

Conversation

@NilsR0711

@NilsR0711 NilsR0711 commented May 12, 2026

Copy link
Copy Markdown

What

Enrol Xiaomi MiMo into the reasoning_content echo-back enforcement set on both the OpenAI-compatible path (run_agent.py) and the Anthropic-Messages path (agent/anthropic_adapter.py) so multi-turn MiMo conversations stop failing with HTTP 400.

Why

MiMo's OpenAI-compatible API requires reasoning_content on every prior assistant message in thinking mode (see MiMo docs). Without it, the next replay returns:

Error code: 400 - {"error": {"message": "Param Incorrect",
  "param": "The reasoning_content in the thinking mode must be passed back to the API."}}

Hermes already ships the echo-back pattern for DeepSeek V4 thinking and Kimi / Moonshot thinking on both protocol paths; MiMo was simply not in either enforcement gate. That double omission produced four failure modes:

  1. Tool-call turns without SDK-exposed reasoning_content — when MiMo streams reasoning via deltas only, the assembled ChatCompletionMessage has no top-level attribute, so the build-time pad never fired. Persisted history then lacked the field and the next request 400'd.
  2. Cross-provider history — sessions opened under another provider and then /model-switched to MiMo bypassed the tier-2 leak guard, so prior chain-of-thought either leaked or was dropped.
  3. Legacy reasoning_content="" sessions (pre-fix: use non-empty reasoning_content placeholder for DeepSeek V4 Pro thinking mode #17341) replayed the empty string verbatim instead of upgrading it to " ".
  4. Anthropic-Messages path — MiMo running through an Anthropic-Messages-compatible gateway (xiaomimimo.com, OpenRouter mimo- slots, private proxies) had no carve-out: unsigned thinking blocks synthesised from reasoning_content were stripped along with signed Anthropic blocks, and the next replay 400'd.

Fixes #24443.

How

run_agent.py (OpenAI-compatible path)

  • New _needs_mimo_tool_reasoning(), modelled on _needs_kimi_tool_reasoning(). Detection covers four signals:
    • provider == "xiaomi" (Hermes-internal id from hermes_cli/auth.py)
    • base_url host matches xiaomimimo.com
    • model starts with xiaomi/ (catalog form)
    • model starts with mimo- or contains /mimo- (bare + third-party catalog form)
  • _needs_thinking_reasoning_pad() ORs the new detector in. All four existing tiers of _copy_reasoning_content_for_api and the tool-call pad branch in _build_assistant_message activate for MiMo automatically.
  • Inline comments at the three pad sites updated to mention MiMo alongside DeepSeek / Kimi (refs MiMo reasoning models may fail in Hermes because reasoning_content is not preserved in chat history #24443).

agent/anthropic_adapter.py (Anthropic-Messages path)

  • New _model_name_is_xiaomi_mimo() covering three slot shapes: catalog xiaomi/, bare mimo-, and embedded /mimo- for namespaced third-party slots (e.g. openrouter/mimo-v2.5-pro). Intentionally narrow — does not accept mimo_ / xiaomi-mimo- (no published catalog uses those) so lookalikes are not misdetected.
  • New _is_xiaomi_mimo_anthropic_endpoint(), ORed into _preserve_unsigned_thinking alongside _is_kimi_family_endpoint and _is_deepseek_anthropic_endpoint. MiMo now uses the same strip-signed / keep-unsigned policy as Kimi /coding and DeepSeek /anthropic.
  • Comment block at the thinking-block synthesis site updated to mention all three strict-replay providers.

DeepSeek / Kimi / Anthropic-native / MiniMax behaviour is untouched — the three detectors are OR-combined and mutually exclusive in practice. No schema or state migration.

A known gap is documented in the new docstring: sessions routed through a proxy or institutional gateway with a custom hostname AND no Hermes provider id set will not auto-detect; users in that configuration should set provider: xiaomi explicitly.

How to test

pytest tests/run_agent/test_mimo_reasoning_content_echo.py \
       tests/run_agent/test_deepseek_reasoning_content_echo.py \
       tests/agent/test_mimo_anthropic_thinking.py \
       tests/agent/test_deepseek_anthropic_thinking.py \
       tests/agent/test_kimi_coding_anthropic_thinking.py -q

run_agent.py suite

38 hermetic tests in tests/run_agent/test_mimo_reasoning_content_echo.py mirroring the DeepSeek echo suite:

  • Detection (_needs_mimo_tool_reasoning): all four positive signals, case-insensitive provider id, third-party catalog form, plus negative guards for lookalike model names (MiniMax, Mistral, Microsoft Phi-4, mimowave substring).
  • Pad gate: _needs_thinking_reasoning_pad flips on under MiMo and remains off for unrelated providers; pre-existing DeepSeek / Kimi paths unaffected.
  • _copy_reasoning_content_for_api: all four tiers exercised under MiMo (tool-call pad, """ " upgrade, explicit content preserved, reasoning field promoted, cross-provider leak guard injects " ").
  • _build_assistant_message: reasoningreasoning_content backfill, model_extra preservation, tool-call pad with no raw reasoning_content, parametrised matrix covering attr-none / attr-absent / base-url / catalog / negative-non-mimo cases, streamed-reasoning promotion over pad, creation-time pad boundary for text-only turns.

agent/anthropic_adapter.py suite

37 hermetic tests in tests/agent/test_mimo_anthropic_thinking.py mirroring the DeepSeek/Kimi Anthropic-path suites:

  • Detection (_model_name_is_xiaomi_mimo): catalog / bare / /mimo- shapes; case-insensitive; whitespace-tolerant.
  • Negative guards: MiniMax, Mistral, Microsoft Phi-4, mimowave, phi-4-mimo-style, mimo_v2, xiaomi-mimo-v2, empty/non-string inputs — all must NOT match.
  • Endpoint detection (_is_xiaomi_mimo_anthropic_endpoint): host-only, model-only, neither-signal cases.
  • convert_messages_to_anthropic: unsigned thinking blocks preserved on tool-call replay (parametrised over four detection routes); preservation across non-latest assistant turns; signed Anthropic blocks stripped; cache_control stripped from thinking blocks; MiniMax regression guard (a too-liberal MiMo matcher would break MiniMax — explicit negative test).

Result: 101 tests pass across the new file and the pre-existing DeepSeek / Kimi / MiMo echo suites, 0 regressions. ruff check agent/anthropic_adapter.py tests/agent/test_mimo_anthropic_thinking.py run_agent.py tests/run_agent/test_mimo_reasoning_content_echo.py clean.

Platforms tested

  • macOS 15 / Python 3.12 (local)

Hermetic test suite — no platform-specific code paths.

…ousResearch#24443)

MiMo's OpenAI-compatible API requires reasoning_content on every prior
assistant message in thinking mode. Without it, the next replay fails
with HTTP 400 ("The reasoning_content in the thinking mode must be
passed back to the API.").

Hermes already implements the echo-back pattern for DeepSeek V4 and
Kimi / Moonshot thinking; MiMo was simply not enrolled, so three
failure modes coexisted:

  1. Tool-call turns where the SDK did not surface reasoning_content
     as a top-level attribute persisted without the field; the next
     request omitted it.
  2. Cross-provider history (session opened under another provider,
     then /model-switched to MiMo) was not run through the leak guard,
     so prior reasoning was either dropped or echoed unsafely.
  3. Legacy sessions with reasoning_content="" pinned at creation time
     replayed verbatim instead of being upgraded to " ".

Add _needs_mimo_tool_reasoning() and OR it into the existing
_needs_thinking_reasoning_pad() gate. Detection covers four signals:
provider == "xiaomi", xiaomimimo.com host, and the catalog/bare model
naming conventions (xiaomi/mimo-..., mimo-..., vendor/mimo-...).
All four existing tiers in _copy_reasoning_content_for_api and the
build-time pad branch in _build_assistant_message activate
automatically once the gate fires. DeepSeek / Kimi behaviour is
untouched.

Tests: 38 hermetic cases mirroring tests/run_agent/test_deepseek_
reasoning_content_echo.py — detection signals (positive + negative
lookalike guards for MiniMax / Mistral / Microsoft Phi-4), pad-gate
regression, _copy_reasoning_content_for_api tiers 1-4 under MiMo,
_build_assistant_message tool-call pad. 100 tests pass across the
new file and the pre-existing DeepSeek/Kimi echo suites; ruff clean.

Platforms: hermetic, no platform dependency.
@alt-glitch alt-glitch added type/bug Something isn't working comp/agent Core agent loop, run_agent.py, prompt builder provider/xiaomi Xiaomi MiLM P2 Medium — degraded but workaround exists duplicate This issue or pull request already exists labels May 12, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Duplicate of #24465 — all three PRs (#24465, #24603, #24605) enroll Xiaomi MiMo into reasoning_content echo-back enforcement via _needs_mimo_tool_reasoning(). #24465 is the earliest open PR. Fixes #24443.

… path (NousResearch#24443)

Extend the Xiaomi MiMo thinking-mode fix to agent/anthropic_adapter.py so
sessions using the Anthropic Messages protocol (xiaomimimo.com or
xiaomi/ / mimo- model slugs routed through Anthropic-compatible gateways)
also get reasoning_content round-tripped on replayed tool-call turns.

- _model_name_is_xiaomi_mimo: catalog (xiaomi/), bare (mimo-) and
  third-party namespaced (/mimo-) slot shapes. Narrow on purpose - does
  not accept mimo_ / xiaomi-mimo- (no published catalog uses those) so
  lookalikes (MiniMax, Mistral, Phi-4, mimowave) are not misdetected.
- _is_xiaomi_mimo_anthropic_endpoint: ORed into _preserve_unsigned_thinking
  alongside Kimi / DeepSeek.
- 37 new tests in tests/agent/test_mimo_anthropic_thinking.py covering
  positives, negative guards against lookalike families, unsigned-block
  preservation across non-latest assistant turns, signed-block stripping,
  cache_control stripping, and MiniMax regression guard.

Refs NousResearch#24443. Companion fix to run_agent.py (already on this branch).
@NilsR0711

Copy link
Copy Markdown
Author

@alt-glitch — acknowledging the duplicate triage (#24603 / #24605 / #24465 all enroll MiMo into _needs_mimo_tool_reasoning()). Happy to defer to #24465 as the earliest open PR.

Before close, I want to flag three additions in this branch that are not covered by either #24465 or #24605 and that I think are worth folding into the canonical fix before it merges. Posted the detail on #24465 (#24465 (comment)); summary:

  1. Anthropic-Messages path coverage (agent/anthropic_adapter.py). feat: support Xiaomi MiMo thinking-mode reasoning_content replay #24465 already touches this file, but fix(run_agent): enroll Xiaomi MiMo in reasoning_content echo-back enforcement #24605 does not. This branch adds it with explicit MiniMax-regression guards, since a too-liberal MiMo matcher on the Anthropic side would break MiniMax replays. Pushed in 8b3ca1a — 37 new hermetic tests in tests/agent/test_mimo_anthropic_thinking.py.
  2. Third-party namespaced catalog form "/mimo-" in model (catches vendor/sub/mimo-v3 reliably across both protocol paths).
  3. Negative guards as explicit tests against lookalikes (MiniMax, Mistral, Phi-4, mimowave, phi-4-mimo-style).
  4. Legacy reasoning_content="" upgrade path for pre-fix: use non-empty reasoning_content placeholder for DeepSeek V4 Pro thinking mode #17341 sessions.

If the maintainers prefer, I can:

Let me know which path works best. All test artifacts (101/101 green locally) are reproducible from this branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder duplicate This issue or pull request already exists P2 Medium — degraded but workaround exists provider/xiaomi Xiaomi MiLM type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

MiMo reasoning models may fail in Hermes because reasoning_content is not preserved in chat history

2 participants