fix(gateway,xiaomi): reasoning_content echo-back for MiMo + prevent infinite retry #24784

Open
wanxinhao wants to merge 1 commit into NousResearch:main from wanxinhao:fix/xiaomi-mimo-reasoning-content

@wanxinhao

Summary

Xiaomi MiMo's Anthropic-compatible /anthropic endpoint requires reasoning_content on every assistant message when thinking mode is enabled. Omitting it causes HTTP 400:

The reasoning_content in the thinking mode must be passed back to the API.

This PR adds three fixes:

1. run_agent.py — Xiaomi reasoning pad detection

Added _needs_xiaomi_tool_reasoning() matching provider xiaomi/mimo/xiaomi-mimo or base_url containing xiaomimimo.com. Included in _needs_thinking_reasoning_pad() so the padding logic applies to MiMo tool-call replays (same pattern as DeepSeek #15250 and Kimi #17400).
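The detection described above might look roughly like the following. This is a minimal sketch, not the PR's actual code: the constant names, the standalone function signature, and the `pad_reasoning_content` helper (including its empty-string placeholder and the choice to pad every assistant turn) are assumptions made for illustration. Only the provider names and the `xiaomimimo.com` marker come from the description.

```python
from typing import Optional

# Values taken from the PR description; how they are stored is an assumption.
XIAOMI_PROVIDERS = {"xiaomi", "mimo", "xiaomi-mimo"}
XIAOMI_HOST_MARKER = "xiaomimimo.com"


def needs_xiaomi_tool_reasoning(provider: Optional[str], base_url: Optional[str]) -> bool:
    """Return True when the target looks like a Xiaomi MiMo endpoint."""
    if provider and provider.strip().lower() in XIAOMI_PROVIDERS:
        return True
    return bool(base_url) and XIAOMI_HOST_MARKER in base_url.lower()


def pad_reasoning_content(messages: list, placeholder: str = "") -> list:
    """Hypothetical padding step: return a copy of `messages` in which every
    assistant turn carries a reasoning_content key. The real padding logic
    may be narrower (for example, limited to tool-call replays)."""
    padded = []
    for msg in messages:
        if msg.get("role") == "assistant" and "reasoning_content" not in msg:
            msg = {**msg, "reasoning_content": placeholder}
        padded.append(msg)
    return padded
```

Matching on either the provider name or the base URL means a custom provider alias pointed at the MiMo host would still be detected.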

2. agent/anthropic_adapter.py — Anthropic adapter compatibility

  • Added _is_xiaomi_anthropic_endpoint() function
  • Added it to _preserve_unsigned_thinking whitelist so unsigned thinking blocks survive signature stripping on third-party endpoints
  • Fixed reasoning_content → thinking block insertion to always insert even when _already_has_thinking is True, because signed blocks from reasoning_details will be stripped by the third-party endpoint code below
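To illustrate why the insertion has to happen even when a thinking block is already present, here is a small sketch. The function names and the block-list representation are hypothetical and do not mirror the adapter's real internals; the only assumption about the wire format is that thinking blocks are dicts with `type`, `thinking`, and an optional `signature` key.

```python
def insert_thinking_block(content_blocks: list, reasoning_content: str) -> list:
    """Always prepend an unsigned thinking block built from reasoning_content,
    regardless of whether a (signed) thinking block is already present."""
    if not reasoning_content:
        return list(content_blocks)
    unsigned = {"type": "thinking", "thinking": reasoning_content}
    return [unsigned, *content_blocks]


def strip_signed_thinking(content_blocks: list) -> list:
    """Stand-in for the third-party-endpoint path: drop thinking blocks that
    carry a signature, keep unsigned ones (the whitelist behaviour)."""
    return [
        block
        for block in content_blocks
        if not (block.get("type") == "thinking" and "signature" in block)
    ]


# An assistant turn whose only thinking block is signed.
turn = [
    {"type": "thinking", "thinking": "plan", "signature": "abc"},
    {"type": "text", "text": "done"},
]

# If insertion were skipped because a thinking block "already exists",
# stripping would leave the turn with no thinking block at all.
skipped = strip_signed_thinking(turn)
fixed = strip_signed_thinking(insert_thinking_block(turn, "plan"))
```

In this sketch `skipped` ends up with only the text block, which is the situation that triggers the 400, while `fixed` keeps the unsigned thinking block.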

3. gateway/run.py — Prevent infinite retry loop

Added is_reasoning_echo_failure detection matching "reasoning_content", "thinking mode", "must be passed back" in error messages. When triggered:

  • Auto-resets the session (like compression_exhausted)
  • Skips transcript persistence (don't grow a broken session)
  • Logs a warning instead of "Transient agent failure"

Without this, the gateway enters an infinite retry loop on the same stale session history that always produces the same 400 error. Observed: a session retried for 2+ hours hitting the same error every time.
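The classification could be sketched as below. The three marker strings come from the description; treating a match on any one of them as sufficient, the `FailureDecision` type, and the fallback branch for other errors are assumptions for illustration rather than the actual gateway/run.py code.

```python
from dataclasses import dataclass

# Marker strings from the PR description; matching on ANY of them is an assumption.
REASONING_ECHO_MARKERS = ("reasoning_content", "thinking mode", "must be passed back")


def is_reasoning_echo_failure(error_message: str) -> bool:
    """Return True when the error text looks like a reasoning_content echo 400."""
    text = (error_message or "").lower()
    return any(marker in text for marker in REASONING_ECHO_MARKERS)


@dataclass(frozen=True)
class FailureDecision:
    """Hypothetical summary of what the gateway should do next."""
    retry: bool
    reset_session: bool
    persist_transcript: bool


def classify_failure(error_message: str) -> FailureDecision:
    if is_reasoning_echo_failure(error_message):
        # Non-recoverable: the stale history will fail the same way every time.
        return FailureDecision(retry=False, reset_session=True, persist_transcript=False)
    # Placeholder for the existing transient-failure path.
    return FailureDecision(retry=True, reset_session=False, persist_transcript=True)
```

Substring matching is deliberately loose here; a stricter variant could also require an HTTP 400 status before resetting the session.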

Testing

# Verify Xiaomi detection
hermes chat -q "echo hello" -m mimo-v2.5-pro --provider xiaomi --yolo

# Verify gateway doesn't infinite retry on stale sessions
# (requires a session that previously hit the reasoning_content 400)

Related

@alt-glitch added the labels type/bug (Something isn't working), comp/agent (Core agent loop, run_agent.py, prompt builder), comp/gateway (Gateway runner, session dispatch, delivery), provider/xiaomi (Xiaomi MiLM), and P2 (Medium: degraded but workaround exists) on May 13, 2026
@alt-glitch

Collaborator

Partial overlap with #24465 (earliest open PR for Xiaomi MiMo reasoning_content replay). The reasoning_content echo-back portion of this PR (run_agent.py + anthropic_adapter.py) competes with #24465, #24603, #24737. However, this PR additionally fixes a gateway infinite retry loop on stale reasoning_content 400 errors (gateway/run.py) which is net-new.

Related: #24443 (tracking issue), #24401 (cross-provider thinking block strip failure)

When MiMo reasoning_content echo-back returns HTTP 400, gateway/run.py
treated it as a transient error and retried indefinitely. This fix
classifies reasoning_content-related 400s as non-recoverable errors
and auto-resets the stale session instead of looping.

Related: NousResearch#24443 (tracking issue), NousResearch#24401
@wanxinhao

Author

Update: Narrowed scope to gateway-only fix

Based on triage feedback from @alt-glitch, I've stripped the reasoning_content echo-back changes from this PR (run_agent.py + agent/anthropic_adapter.py) since they overlap with #24465, #24603, and #24737.

This PR now contains only the gateway/run.py fix: preventing infinite retry loops when reasoning_content 400 errors hit stale sessions.

The core reasoning_content replay fix is covered by #24603 (most complete implementation with 101 tests). This PR is complementary — it handles the gateway-level fallout that #24603 doesn't address.

Suggested merge order: #24603 first, then this PR.
