Skip to content

fix(run_agent): enroll Xiaomi MiMo in reasoning_content echo-back enforcement#24605

Closed
wesleysimplicio wants to merge 1 commit into
NousResearch:mainfrom
wesleysimplicio:fix/cx15-issue-24443-mimo-reasoning-content-echo
Closed

fix(run_agent): enroll Xiaomi MiMo in reasoning_content echo-back enforcement#24605
wesleysimplicio wants to merge 1 commit into
NousResearch:mainfrom
wesleysimplicio:fix/cx15-issue-24443-mimo-reasoning-content-echo

Conversation

@wesleysimplicio

Copy link
Copy Markdown
Contributor

Problem

MiMo reasoning/thinking models fail in multi-turn Hermes conversations with HTTP 400:

Error code: 400 - {
  "error": {
    "message": "Param Incorrect",
    "type": "upstream_error",
    "param": "The reasoning_content in the thinking mode must be passed back to the API.",
    "code": "400"
  }
}

MiMo's OpenAI-compatible API requires reasoning_content to be echoed on every assistant message when continuing a conversation in thinking mode. Hermes already implements this for DeepSeek (#15250) and Kimi (#17400) but MiMo was not enrolled in the enforcement set.

Root cause

_needs_thinking_reasoning_pad() gates all three enforcement branches:

  1. _build_assistant_message — pins reasoning_content at creation time for tool-call turns
  2. _copy_reasoning_content_for_api tier 1 — upgrades stale """ " on replay
  3. _copy_reasoning_content_for_api tier 4 — unconditional " " pad for all assistant turns

The method only OR'd in _needs_deepseek_tool_reasoning() and _needs_kimi_tool_reasoning(). MiMo was missing, so all three branches were skipped for provider="xiaomi" / xiaomimimo.com / mimo-* sessions.

Fix

Add _needs_mimo_tool_reasoning() detecting three signals (mirrors the pattern for Kimi):

  • provider == "xiaomi" (case-insensitive)
  • base_url host matches xiaomimimo.com
  • model name starts with "xiaomi/" or "mimo-" (covers catalog and bare forms)

OR it into _needs_thinking_reasoning_pad(). All existing downstream branches activate for MiMo automatically — no other provider behaviour changes.

Tests

tests/run_agent/test_mimo_reasoning_content_echo.py — 30 hermetic tests, modelled on the existing DeepSeek suite:

  • TestNeedsMimoToolReasoning — all three detection signals, case-insensitive, negative/substring cases
  • TestNeedsThinkingReasoningPadIncludesMimo — MiMo activates pad; DeepSeek/Kimi still active; unrelated provider off
  • TestCopyReasoningContentForApiMimo — poisoned history → " ", real content preserved verbatim, stale "" upgraded, cross-provider leak guard, base URL match, non-MiMo untouched
  • TestBuildAssistantMessageMimo — parametrized over all three detection signals; streamed reasoning promoted over pad; real reasoning_content preserved

All 30 new tests pass. Existing test_deepseek_reasoning_content_echo.py (36 tests) unaffected.

Visual

flowchart TD
    A["_needs_thinking_reasoning_pad()"] --> B["_needs_deepseek_tool_reasoning()"]
    A --> C["_needs_kimi_tool_reasoning()"]
    A --> D["_needs_mimo_tool_reasoning() ← NEW"]
    D --> E{"provider == 'xiaomi'?"}
    D --> F{"base_url host = xiaomimimo.com?"}
    D --> G{"model starts with 'xiaomi/' or 'mimo-'?"}
    E -- yes --> H[enforce echo-back]
    F -- yes --> H
    G -- yes --> H
Loading

Closes #24443

…orcement

MiMo's thinking mode requires reasoning_content to be echoed on every
assistant message in multi-turn conversations, identical to DeepSeek (NousResearch#15250)
and Kimi (NousResearch#17400). Without enrollment, Hermes drops the field from persisted
history and the next request fails with HTTP 400 ("The reasoning_content in
the thinking mode must be passed back to the API").

Add _needs_mimo_tool_reasoning() detecting three signals: provider="xiaomi",
base_url host matching xiaomimimo.com, or model prefix "xiaomi/" / "mimo-".
OR it into _needs_thinking_reasoning_pad() so all downstream branches
(_build_assistant_message pad, stale "" → " " upgrade, cross-provider leak
guard) activate for MiMo automatically. No existing provider behaviour changes.

Closes NousResearch#24443

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 12, 2026 23:26

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes multi-turn failures when using Xiaomi MiMo “thinking/reasoning” models by ensuring reasoning_content is always echoed back on assistant turns during conversation continuation (matching the existing enforcement behavior for DeepSeek and Kimi/Moonshot).

Changes:

  • Add MiMo detection (_needs_mimo_tool_reasoning) and include it in _needs_thinking_reasoning_pad() gating.
  • Extend the existing reasoning-content replay padding behavior to apply for MiMo automatically via the shared gate.
  • Add a new regression test suite covering MiMo detection, padding, and assistant message construction.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
run_agent.py Adds MiMo provider/model/base_url detection and enrolls it into the existing thinking-mode reasoning_content echo-back enforcement gate.
tests/run_agent/test_mimo_reasoning_content_echo.py Adds regression tests validating MiMo detection and ensuring replay-safe reasoning_content behavior across tool-call and non-tool turns.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread run_agent.py
Comment on lines +10099 to +10100
thinking all reject replays of assistant tool-call messages that omit
``reasoning_content`` (refs #15250, #17400, #24443).


class TestCopyReasoningContentForApiMimo:
"""_copy_reasoning_content_for_api applies all four tiers for MiMo."""
@alt-glitch alt-glitch added type/bug Something isn't working comp/agent Core agent loop, run_agent.py, prompt builder provider/xiaomi Xiaomi MiLM P2 Medium — degraded but workaround exists duplicate This issue or pull request already exists labels May 12, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Duplicate of #24465 — all three PRs (#24465, #24603, #24605) enroll Xiaomi MiMo into reasoning_content echo-back enforcement via _needs_mimo_tool_reasoning(). #24465 is the earliest open PR. Fixes #24443.

@wesleysimplicio

Copy link
Copy Markdown
Contributor Author

Closing as duplicate/superseded by #24662 to keep one canonical PR per fix topic and avoid review split.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder duplicate This issue or pull request already exists P2 Medium — degraded but workaround exists provider/xiaomi Xiaomi MiLM type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

MiMo reasoning models may fail in Hermes because reasoning_content is not preserved in chat history

3 participants