fix(xiaomi): MiMo reasoning models fail multi-turn tool calls (#81419) #81589
|
Codex review: needs maintainer review before merge.

Summary

Reproducibility: yes. The PR body and linked issue provide a concrete MiMo custom-provider multi-turn tool-call path with pre-fix 400 logs, and source inspection shows current main lacks the Xiaomi plugin/fallback wrapper that would add reasoning_content.

Real behavior proof

Next step before merge

Security

Review details

Best possible solution: Land this focused compatibility fix after maintainer review and required checks; keep the separate reasoning-only visible-output behavior in #60304 independent if it still applies.

Do we have a high-confidence way to reproduce the issue? Yes. The PR body and linked issue provide a concrete MiMo custom-provider multi-turn tool-call path with pre-fix 400 logs, and source inspection shows current main lacks the Xiaomi plugin/fallback wrapper that would add reasoning_content.

Is this the best way to solve the issue? Yes. Keeping Xiaomi-owned behavior in the Xiaomi plugin while adding a narrow unowned-proxy fallback matches the existing DeepSeek V4 architecture and avoids a user-facing config knob.

What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against 2156b204eaac. |
|
Per CONTRIBUTING.md ("Do not submit test or CI-config fixes for failures already red on main CI"), flagging rather than fixing. The failing rule is function requireToolChoicePayload(payload: unknown): unknown | undefined {
That file is not touched by this PR. The line was introduced by upstream commit The same
Reproduced locally against upstream/main alone (with this PR's commits excluded): Local oxlint on this branch's changed files: 0 warnings, 0 errors. |
64df7f1 to 8c85b61
|
Lint failure resolved, all CI checks passed, ready for maintainer review. -JHD |
181076e to 99d0059
99d0059 to 8b32a47
…st endpoint classes
…mat and requiresReasoningContentOnAssistantMessages
… to convertMessages
…ModelRef, and replay hooks
…g_content injection tests
…sistantMessages (detection-only field)
… xiaomi-native to native+nonstandard lists, test host resolution
…d proxy providers
8b32a47 to adad451
|
Thanks @jimdawdy-hub. I rebased this on current What changed:
Verification:
Live behavior note: I did not re-run the Xiaomi endpoint locally here; I relied on the PR's supplied live proof showing the pre-fix second-turn |
|
Built main from source at 3:30 PM CDT to confirm the fix. Logs:

MiMo Reasoning Tool-Call Fix Verification Log

Test 1:
Test 2:
Pre-fix behavior (from logs at 15:06-15:11 CDT):
Post-fix behavior (after gateway restart with PR #81589):

Conclusion: PR #81589 resolves #81419. MiMo reasoning models can now complete multi-turn tool-call conversations on custom providers (xiaomi-orbit) without |
Fix: Xiaomi MiMo reasoning models fail on multi-turn tool calls
Closes #81419
Summary
Xiaomi MiMo reasoning models (mimo-v2-pro, mimo-v2-omni, mimo-v2.5, mimo-v2.5-pro) return 400 Param Incorrect ("provider rejected the request schema or tool payload") on any turn that replays a prior assistant message, for example the second turn of a tool-call conversation. MiMo uses the same OpenAI-compatible reasoning wire format as DeepSeek V4 and requires reasoning_content on every assistant message in the request payload. OpenClaw was never adding that field for MiMo.

The fix mirrors the existing DeepSeek V4 plumbing:

- xiaomi-native is now a first-class endpoint class (src/agents/provider-attribution.ts) and recognized in isKnownNativeEndpoint. Manifest-driven host resolution maps the bundled host plus the three regional token-plan-*.xiaomimimo.com hosts (used by self-hosted/custom MiMo proxies) to that class.
- src/agents/openai-completions-compat.ts now sets thinkingFormat: "deepseek" and requiresReasoningContentOnAssistantMessages: true for xiaomi-native endpoints, alongside the existing DeepSeek path.
- The xiaomi plugin (extensions/xiaomi/) registers wrapStreamFn, resolveThinkingProfile, isModernModelRef, and buildProviderReplayFamilyHooks({ family: "openai-compatible", dropReasoningFromHistory: false }), the same shape as the DeepSeek plugin. Two new local helpers (thinking.ts, stream.ts) reuse the shared createDeepSeekV4OpenAICompatibleThinkingWrapper SDK helper rather than copying its logic.
- src/agents/pi-embedded-runner/extra-params.ts: when MiMo reasoning models are reached through a custom/proxy provider (e.g. a user-configured xiaomi-orbit provider pointing at token-plan-ams.xiaomimimo.com), the bundled xiaomi plugin's wrapStreamFn does not match. Core already has the same fallback pattern for DeepSeek V4 in this file; this PR adds the parallel MiMo matcher so the shared wrapper fires there too.

What we found
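As a concrete reference for the findings that follow, the two moving parts summarized above (a matcher keyed on the model ID rather than the provider name, and a payload pass that adds reasoning_content to assistant messages) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in this PR: the names ChatMessage, ChatPayload, MIMO_REASONING_MODEL_IDS, isMimoReasoningModel, and ensureAssistantReasoningContent are hypothetical, the ID list is copied from the Summary, and stripping a "provider/" prefix from the model ID is an assumption made only for the example.

```typescript
// Hypothetical sketch: these types and helpers are NOT OpenClaw's or pi-ai's real API.
type ChatMessage = {
  role: "system" | "user" | "assistant" | "tool";
  content?: string | null;
  reasoning_content?: string;
  [key: string]: unknown;
};

type ChatPayload = {
  model: string;
  messages: ChatMessage[];
  [key: string]: unknown;
};

// Reasoning model IDs as listed in the Summary of this PR.
const MIMO_REASONING_MODEL_IDS = new Set([
  "mimo-v2-pro",
  "mimo-v2-omni",
  "mimo-v2.5",
  "mimo-v2.5-pro",
]);

// Match on the model ID alone, so a custom provider name such as
// "xiaomi-orbit" does not prevent the wrapper from applying.
function isMimoReasoningModel(modelId: string): boolean {
  // Assumption for this sketch: an optional "provider/" prefix is ignored.
  const bareId = modelId.trim().toLowerCase().split("/").pop() ?? "";
  return MIMO_REASONING_MODEL_IDS.has(bareId);
}

// Return a payload in which every assistant message carries a string
// reasoning_content field; existing values are preserved, and other
// roles are left untouched. The input payload is not mutated.
function ensureAssistantReasoningContent(payload: ChatPayload): ChatPayload {
  if (!isMimoReasoningModel(payload.model)) return payload;
  return {
    ...payload,
    messages: payload.messages.map((message) =>
      message.role === "assistant" && typeof message.reasoning_content !== "string"
        ? { ...message, reasoning_content: "" }
        : message,
    ),
  };
}

// Usage: a second-turn payload that replays an assistant tool call.
const secondTurn = ensureAssistantReasoningContent({
  model: "mimo-v2.5-pro",
  messages: [
    { role: "user", content: "Give me an overview of Esperanto." },
    { role: "assistant", content: null, tool_calls: [] },
    { role: "tool", content: "search results" },
  ],
});
console.log(JSON.stringify(secondTurn.messages[1]));
```

Keying the check on the model ID is what lets the same pass cover both the bundled provider and an unowned proxy; returning a copy instead of mutating keeps the original history intact for replay.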
Original bug
A user reported FailoverError: LLM request failed: provider rejected the request schema or tool payload. (400 Param Incorrect) when running mimo-v2.5-pro on a custom xiaomi-orbit provider pointed at https://token-plan-ams.xiaomimimo.com/v1. The failure consistently fired on the second model turn, i.e. the one that includes a replayed assistant message + tool result.

Root cause: pi-ai's openai-completions message converter does not inject reasoning_content: "" on assistant messages for MiMo because OpenClaw never marked MiMo as a DeepSeek-style reasoning provider. The DeepSeek bundled plugin handles this via a wrapStreamFn that calls ensureDeepSeekV4AssistantReasoningContent on every outgoing payload. MiMo had no equivalent.

Code review findings (addressed in this PR)
A pre-PR self-review surfaced four issues, all fixed before this PR was opened:
- mimo-v2.5-flash, mimo-v2.5, and mimo-v2.5-pro were added to the bundled catalog without confirming those IDs exist on Xiaomi's official api.xiaomimimo.com. The bundled manifest in this PR only ships the existing officially-documented models (mimo-v2-flash, mimo-v2-pro, mimo-v2-omni). The reasoning detection set in extensions/xiaomi/thinking.ts still recognizes the v2.5 IDs, so users who configure those via a custom provider (the bug-report case) still get correct reasoning treatment.
- xiaomi-native added to isKnownNativeEndpoint in src/agents/provider-attribution.ts. Matches the DeepSeek/xAI/zAI pattern.
- xiaomi-native added to isNonStandard in src/agents/openai-completions-compat.ts. Functionally consistent with the DeepSeek precedent.
- Tests added in src/agents/provider-attribution.test.ts to lock in the *.xiaomimimo.com → xiaomi-native mapping.

Live verification
After the first round of fixes (commits 1-7) the live test still failed: pi-ai does not actually read requiresReasoningContentOnAssistantMessages (the field appears only in pi-ai's static models.generated.js, not in any logic), and the bundled plugin's wrapStreamFn only fires when model.provider === "xiaomi", not for the user's xiaomi-orbit custom provider. The core fallback in applyPostPluginStreamWrappers only matched DeepSeek V4 model IDs.

Commit 10 (64df7f12dd) added the MiMo matcher to the core fallback so the shared DeepSeek-style wrapper fires for MiMo reasoning IDs on any provider, including custom proxy providers. Live test then passed.

Tests

- src/agents/openai-completions-compat.test.ts (5 new xiaomi cases)
- extensions/xiaomi/index.test.ts (auth, catalog, replay policy, thinking profile, isModernModelRef, reasoning_content injection across thinking-on / thinking-off / cross-provider replay)
- src/agents/provider-attribution.test.ts
- pnpm build clean
- pnpm check:changed clean (lint, typecheck, import cycles)

Real Behavior Proof
Behavior addressed: Xiaomi MiMo reasoning models (mimo-v2.5-pro on a xiaomi-orbit custom provider pointed at https://token-plan-ams.xiaomimimo.com/v1) returning 400 Param Incorrect on any multi-turn / tool-call conversation. Reproduced before the fix; verified fixed after the fix.

Real environment tested: Linux desktop, OpenClaw gateway built from this branch (fix/mimo-reasoning-content), run via pnpm dev gateway --force on the same port (18789) the user normally uses, exercised through the live Telegram channel against a real MiMo backend with a real API key.

Exact steps or command run after this patch: Stopped the systemd-installed openclaw gateway, then started a dev gateway from this branch on the same port the user normally uses. Then, through the user's normal Telegram chat client, sent a message that selects xiaomi-orbit / mimo-v2.5-pro (reasoning enabled), prompts a web_search tool call, and requires a follow-up model turn after the tool result. Exact commands:

Evidence after fix (relevant log lines, gateway log /tmp/openclaw/openclaw-2026-05-13.log):

Pre-fix evidence (same gateway, same user, same provider, before the wrapper was wired):

Observed result after fix: The MiMo agent received the prompt, emitted a tool call (web_search for "Esperanto language overview history"), received the tool result, then continued the conversation and emitted a second tool call (web_search for "Esperanto language history overview facts"). That second tool call requires the model to receive the prior assistant message + tool result and reason about a new request, which is exactly the multi-turn scenario the bug broke. No FailoverError, no 400 Param Incorrect, no provider rejected the request schema in the log after the fix went live.

The web_search failed: Ollama web search authentication failed messages are an unrelated user-side configuration issue (Ollama needs ollama signin); they show that the model did successfully emit valid tool-use payloads to the gateway and the gateway invoked the tool plugin.

What was not tested:

- The bundled xiaomi provider against api.xiaomimimo.com directly. The user currently has only a xiaomi-orbit custom provider configured. The bundled-provider path is exercised by the new unit tests in extensions/xiaomi/index.test.ts (auth, catalog, thinking profile, reasoning_content injection), but no live API hit. The core fallback wrapper test (pi-embedded-runner-extraparams.test.ts style) is logically covered by the existing DeepSeek "unowned proxy" precedent.
- xiaomi.live.test.ts for TTS is unmodified by this PR.

AI assistance
This PR was developed with Claude Sonnet 4.6 acting as a coding agent over a single working session, driving edits, builds, and tests through Claude Code. After the initial implementation, a self-review pass (also via Claude Sonnet 4.6) identified the four issues listed under "Code review findings" above; all four were fixed before this PR was opened. The live verification step caught a remaining gap (the compat-flag plumbing alone does not fix the bug because pi-ai does not consume that field) which was then resolved by adding the core fallback wrapper.
The human author (Jim Dawdy) reviewed the plan and the final diffs, supplied live API credentials and gateway access for real-environment proof, ran the Telegram chat that produced the post-fix evidence above, and is the named contributor for this change. All blame falls on human shoulders.