feat: support Xiaomi MiMo thinking-mode reasoning_content replay #24465

Open
Peek-A-Booo wants to merge 4 commits into NousResearch:main from Peek-A-Booo:feat/xiaomi-mimo-thinking-replay

@Peek-A-Booo

Summary

Add Xiaomi MiMo thinking-mode support for reasoning_content round-trip in multi-turn agent conversations.

MiMo, like Kimi and DeepSeek, requires reasoning_content to be passed back on replayed assistant tool-call messages when thinking mode is enabled. Without this fix, multi-turn conversations with tool calls fail with HTTP 400: "The reasoning_content in the thinking mode must be passed back to the API."

Changes

  • agent/anthropic_adapter.py: Add _model_name_is_xiaomi_mimo() and _is_xiaomi_mimo_anthropic_endpoint() helpers; include MiMo in the _preserve_unsigned_thinking condition alongside Kimi/DeepSeek
  • run_agent.py: Add _needs_xiaomi_tool_reasoning() method with provider, model prefix, and base_url domain detection; include in _needs_thinking_reasoning_pad()
  • Tests: 26 new tests covering model name matching, endpoint detection, thinking preservation, and reasoning padding for MiMo

Detection Logic

Matches MiMo via three signals:

  1. Provider: xiaomi, mimo, xiaomi-mimo
  2. Model prefix: mimo-, mimo_, xiaomi-mimo-, xiaomi_mimo_ (strips namespace prefixes like openrouter/)
  3. Base URL domain: xiaomimimo.com, mimo.com
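
A minimal sketch of how the three signals above could be combined. The function name `looks_like_mimo` and the constants are hypothetical and are not the code in this PR (the real helpers are `_model_name_is_xiaomi_mimo()` and `_needs_xiaomi_tool_reasoning()`); the exact-or-subdomain host match is also an assumption.

```python
from urllib.parse import urlparse

# Hypothetical constants mirroring the three signals listed in the PR description.
MIMO_PROVIDERS = {"xiaomi", "mimo", "xiaomi-mimo"}
MIMO_MODEL_PREFIXES = ("mimo-", "mimo_", "xiaomi-mimo-", "xiaomi_mimo_")
MIMO_DOMAINS = ("xiaomimimo.com", "mimo.com")


def looks_like_mimo(provider: str = "", model: str = "", base_url: str = "") -> bool:
    """Return True if any one of the three signals identifies Xiaomi MiMo."""
    # Signal 1: explicit provider name.
    if provider.strip().lower() in MIMO_PROVIDERS:
        return True

    # Signal 2: model prefix, after stripping a namespace such as "openrouter/".
    tail = model.strip().lower().rsplit("/", 1)[-1]
    if tail.startswith(MIMO_MODEL_PREFIXES):
        return True

    # Signal 3: base URL host equals a known domain or is a subdomain of it.
    # Assumption: suffix matching on a dot boundary, so "notmimo.com" is rejected.
    host = (urlparse(base_url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in MIMO_DOMAINS)
```

Matching the host on a dot boundary rather than by plain substring is one way to keep unrelated domains that merely end in `mimo.com` out of the enforcement path.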

Test Plan

  • 26 new MiMo-specific tests pass
  • 45 combined regression tests (Kimi + DeepSeek + MiMo) pass
  • Live smoke test with mimo-v2.5-pro via CLI — no 400 errors
  • CI passes

Related

Refs: #13848, #15250, #16748, #17057, #17400

MiMo (like Kimi and DeepSeek) requires reasoning_content to be
round-tripped on assistant tool-call messages when thinking mode is
enabled. Without it, multi-turn agent conversations fail with HTTP 400.

Changes:
- agent/anthropic_adapter.py: Add _model_name_is_xiaomi_mimo() and
  _is_xiaomi_mimo_anthropic_endpoint() helpers; include MiMo in the
  _preserve_unsigned_thinking condition alongside Kimi/DeepSeek
- run_agent.py: Add _needs_xiaomi_tool_reasoning() method with provider,
  model prefix, and base_url domain detection; include in
  _needs_thinking_reasoning_pad()
- tests: 26 new tests covering model name matching, endpoint detection,
  thinking preservation, and reasoning padding for MiMo

Refs: NousResearch#13848, NousResearch#15250, NousResearch#16748, NousResearch#17057, NousResearch#17400
@alt-glitch added the type/feature (New feature or request), P3 (Low: cosmetic, nice to have), comp/agent (Core agent loop, run_agent.py, prompt builder), and provider/xiaomi (Xiaomi MiLM) labels on May 12, 2026
@NilsR0711

@Peek-A-Booo @alt-glitch — flagging three additions from #24603 (duplicate, will defer to this PR as the earliest) that may be worth folding in before merge. All are battle-tested with hermetic tests in #24603 and can be cherry-picked or rewritten standalone.

1. Third-party namespaced catalog form /mimo-

This PR's _model_name_is_xiaomi_mimo (and the run_agent.py counterpart) match mimo-* and xiaomi-mimo-* as bare/catalog prefixes after rsplit("/", 1). That covers openrouter/mimo-v2.5-pro (because the rsplit-tail starts with mimo-), but not deeper-namespaced slots like vendor/sub/mimo-v3 or nous-portal/mimo-v2 reliably across both paths (the run_agent.py detector here has only prefix matching, not rsplit). #24603 adds an explicit "/mimo-" in model check so any depth of namespacing works on both protocol paths.

2. Explicit negative guards against lookalike model families

Neither #24465 nor #24605 has tests asserting that the matcher does not trigger on MiniMax (minimax-text-01), Mistral (mistralai/mistral-medium), Microsoft Phi-4 (microsoft/phi-4), mimowave-7b, or phi-4-mimo-style. These are realistic adjacent model families that a future careless detector change could accidentally enroll into MiMo's enforcement path — which on the Anthropic side would break MiniMax replays outright (it rejects unsigned thinking blocks). #24603 pins these as explicit regression tests in both tests/run_agent/test_mimo_reasoning_content_echo.py and tests/agent/test_mimo_anthropic_thinking.py.
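
As a rough illustration of points 1 and 2 together, the sketch below pairs a `"/mimo-"` style substring clause with the lookalike names listed above. `model_name_is_mimo` is a hypothetical stand-in, not the matcher from #24603 or from this PR, and the prefix tuple is taken from this PR's description.

```python
# Hypothetical matcher: prefix match on the bare name, plus a substring
# clause so that any depth of namespacing ("vendor/sub/mimo-v3") is caught.
MIMO_PREFIXES = ("mimo-", "mimo_", "xiaomi-mimo-", "xiaomi_mimo_")


def model_name_is_mimo(model: str) -> bool:
    name = model.strip().lower()
    if name.startswith(MIMO_PREFIXES):
        return True
    # A prefix preceded by "/" means it starts a path segment, at any depth.
    return any("/" + prefix in name for prefix in MIMO_PREFIXES)


# Names that should match, including deep-namespaced forms.
POSITIVE = ["mimo-v2.5-pro", "openrouter/mimo-v2.5-pro", "vendor/sub/mimo-v3"]

# Lookalike families that must NOT be enrolled into MiMo enforcement.
NEGATIVE = [
    "minimax-text-01",
    "mistralai/mistral-medium",
    "microsoft/phi-4",
    "mimowave-7b",
    "phi-4-mimo-style",
]

for name in POSITIVE:
    assert model_name_is_mimo(name), name
for name in NEGATIVE:
    assert not model_name_is_mimo(name), name
```

Requiring the prefix to start a path segment is what keeps `phi-4-mimo-style` out: it contains `-mimo-` but never `/mimo-`.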

3. Legacy reasoning_content="" upgrade path

Sessions persisted before #17341 stored reasoning_content="" literally at creation time. On replay, MiMo (like DeepSeek V4 Pro) rejects the empty string with HTTP 400. #24603's _copy_reasoning_content_for_api upgrades "" to " " when the active provider enforces the echo and the source's literal reasoning_content is the empty string. Without this, users with pre-#17341 history hit 400 on their first MiMo turn even after this PR merges.
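
A minimal sketch of that upgrade path, under stated assumptions: `copy_reasoning_content` and its `enforce_echo` flag are hypothetical names for illustration, and the real `_copy_reasoning_content_for_api` in #24603 may have a different signature and handle more cases.

```python
def copy_reasoning_content(source: dict, enforce_echo: bool) -> dict:
    """Build the API-bound copy of a stored assistant message.

    Assumption: `enforce_echo` is True when the active provider (MiMo,
    DeepSeek, ...) rejects a missing or empty reasoning_content on replay.
    """
    # Copy everything except reasoning_content, which is handled below.
    api_msg = {k: v for k, v in source.items() if k != "reasoning_content"}

    reasoning = source.get("reasoning_content")
    if reasoning is None:
        # Nothing was stored; this sketch does not invent a value.
        return api_msg

    # Legacy sessions stored the literal empty string; upgrade it to a
    # single space so that strict providers accept the replayed turn.
    if reasoning == "" and enforce_echo:
        reasoning = " "

    api_msg["reasoning_content"] = reasoning
    return api_msg
```

The source message is left unmodified, so persisted history stays as written and the upgrade applies only to the outgoing request.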

Offer

If useful: the negative-guard tests and /mimo- clause are small, self-contained additions; happy to open a follow-up against this branch (or against the merge commit) so the canonical PR carries the full coverage. Tests in #24603 are hermetic and pass alongside the existing DeepSeek/Kimi suites (101/101 green locally).

Closing #24603 in favour of this PR once merged. CC @wesleysimplicio (#24605) in case the same additions are useful there.

@akaDRJ

akaDRJ commented May 13, 2026

Adding one live repro datapoint from #24726 before I close/defer it here.

I hit the same 400 in a real Hermes gateway session after the primary sub2api-openai provider failed over to Xiaomi MiMo:

Provider: xiaomi
Model: mimo-v2.5-pro
Endpoint: https://token-plan-cn.xiaomimimo.com/anthropic
Error: HTTP 400: Param Incorrect
Details: The reasoning_content in the thinking mode must be passed back to the API.

The request dump shape was slightly different from plain OpenAI tool_calls: the replayed assistant turns were already in Anthropic content-block form:

{"role":"assistant","content":[{"type":"tool_use","id":"call_...","name":"terminal","input":{...}}]}

In that dump there were 29 assistant tool-use turns, and none had reasoning_content/thinking replay data. A local change that made the failing dump pass:

  • detect xiaomimimo.com / mimo-* on the Anthropic adapter path
  • preserve unsigned thinking blocks for MiMo, like Kimi/DeepSeek
  • when a MiMo Anthropic replay assistant turn has tool_use blocks but no preserved thinking, prepend {"type":"thinking","thinking":" "}
  • do not send the top-level Anthropic thinking parameter to MiMo's /anthropic fallback endpoint for that replay shape

Offline verification against the failing dump after that change:

assistant_count 29
tool_use_missing_placeholder []
top_level_has_thinking False
first_assistant_blocks ['thinking', 'tool_use']
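
The placeholder step can be sketched roughly as follows. `pad_tool_use_turns` is a hypothetical helper written for illustration, not the code from #24726 or from this PR, and it assumes MiMo detection has already happened upstream.

```python
def pad_tool_use_turns(messages: list[dict]) -> list[dict]:
    """Prepend a single-space thinking block to assistant turns that carry
    tool_use blocks but no thinking block (Anthropic content-block form)."""
    padded = []
    for msg in messages:
        content = msg.get("content")
        if msg.get("role") == "assistant" and isinstance(content, list):
            types = [b.get("type") for b in content if isinstance(b, dict)]
            if "tool_use" in types and "thinking" not in types:
                placeholder = {"type": "thinking", "thinking": " "}
                # Build a new message so the stored history is not mutated.
                msg = {**msg, "content": [placeholder, *content]}
        padded.append(msg)
    return padded


replay = [
    {"role": "user", "content": "list files"},
    {
        "role": "assistant",
        "content": [
            {"type": "tool_use", "id": "call_1", "name": "terminal", "input": {}}
        ],
    },
]
fixed = pad_tool_use_turns(replay)
print([block["type"] for block in fixed[1]["content"]])
```

Turns that already carry a preserved thinking block pass through unchanged, so real reasoning is never overwritten by the placeholder.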

I have no need to keep #24726 competing with this PR. If this live replay shape is already covered here, great. If not, it may be worth adding as a small extra regression before merge.

- Add defence-in-depth '/mimo-' substring check for deep-namespaced
  model names (vendor/sub/mimo-v3) in both anthropic_adapter and
  run_agent detectors
- Add negative regression tests: MiniMax, Mistral, Microsoft Phi-4,
  phi-4-mimo-style, mimowave-7b must NOT trigger MiMo enforcement
- Add deep namespace tests: vendor/sub/mimo-v3, a/b/c/xiaomi-mimo-*
- reasoning_content='' upgrade was already covered (test exists)

Addresses review comments from NilsR0711 and akaDRJ.
… thinking blocks

When strict thinking-replay providers (MiMo, Kimi, DeepSeek) replay
assistant turns that are already in Anthropic content-block format with
tool_use blocks but NO thinking blocks at all, inject a single-space
placeholder thinking block so the upstream API does not reject with
HTTP 400.

This covers akaDRJ's live repro (NousResearch#24726): 29 assistant tool-use turns
in Anthropic content-block form, zero with thinking/reasoning data,
from a provider failover to MiMo's /anthropic endpoint.

Also: address NilsR0711 review feedback:
- Add defence-in-depth '/mimo-' substring check for deep-namespaced models
- Add negative regression tests (MiniMax, Mistral, Phi-4, mimowave, etc.)
- Add deep namespace tests (vendor/sub/mimo-v3)
- Test Anthropic tool_use without thinking (new) + non-MiMo negative guard