
fix(xiaomi): MiMo reasoning models fail multi-turn tool calls (#81419) #81589

Merged
steipete merged 11 commits into openclaw:main from jimdawdy-hub:fix/mimo-reasoning-content
May 15, 2026



@jimdawdy-hub jimdawdy-hub commented May 14, 2026

Contributor

Fix: Xiaomi MiMo reasoning models fail on multi-turn tool calls

Closes #81419

Summary

Xiaomi MiMo reasoning models (mimo-v2-pro, mimo-v2-omni, mimo-v2.5, mimo-v2.5-pro) return 400 Param Incorrect ("provider rejected the request schema or tool payload") on any turn that replays a prior assistant message — for example the second turn of a tool-call conversation. MiMo uses the same OpenAI-compatible reasoning wire format as DeepSeek V4 and requires reasoning_content on every assistant message in the request payload. OpenClaw was never adding that field for MiMo.

The fix mirrors the existing DeepSeek V4 plumbing:

  1. xiaomi-native is now a first-class endpoint class (src/agents/provider-attribution.ts) and recognized in isKnownNativeEndpoint. Manifest-driven host resolution maps the bundled host plus the three regional token-plan-*.xiaomimimo.com hosts (used by self-hosted/custom MiMo proxies) to that class.
  2. OpenAI-completions compat detection (src/agents/openai-completions-compat.ts) now sets thinkingFormat: "deepseek" and requiresReasoningContentOnAssistantMessages: true for xiaomi-native endpoints, alongside the existing DeepSeek path.
  3. The bundled xiaomi plugin (extensions/xiaomi/) registers wrapStreamFn, resolveThinkingProfile, isModernModelRef, and buildProviderReplayFamilyHooks({ family: "openai-compatible", dropReasoningFromHistory: false }) — same shape as the DeepSeek plugin. Two new local helpers (thinking.ts, stream.ts) reuse the shared createDeepSeekV4OpenAICompatibleThinkingWrapper SDK helper rather than copying its logic.
  4. Core fallback wrapper for "unowned" proxies (src/agents/pi-embedded-runner/extra-params.ts). When MiMo reasoning models are reached through a custom/proxy provider (e.g. a user-configured xiaomi-orbit provider pointing at token-plan-ams.xiaomimimo.com), the bundled xiaomi plugin's wrapStreamFn does not match. Core already has the same fallback pattern for DeepSeek V4 in this file; this PR adds the parallel MiMo matcher so the shared wrapper fires there too.
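Items 1 and 2 can be pictured roughly as follows. This is a minimal sketch, not the actual OpenClaw code: `hostOf`, `resolveEndpointClass`, `compatForEndpoint`, and the host pattern are hypothetical, and the pattern assumes the bundled host is `api.xiaomimimo.com` and that regional hosts follow the `token-plan-<region>.xiaomimimo.com` shape described above.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not OpenClaw's real API.
type EndpointClass = "xiaomi-native" | "unknown";

interface CompatFlags {
  thinkingFormat?: "deepseek";
  requiresReasoningContentOnAssistantMessages?: boolean;
}

// Assumption: bundled host is api.xiaomimimo.com; regional hosts look like
// token-plan-<region>.xiaomimimo.com (only the "ams" region appears in this PR text).
const XIAOMI_HOST_PATTERN = /^(api|token-plan-[a-z0-9]+)\.xiaomimimo\.com$/;

// Extract the lowercase hostname from an http(s) base URL, or undefined if it does not parse.
function hostOf(baseUrl: string): string | undefined {
  const match = /^https?:\/\/([^\/:?#]+)/i.exec(baseUrl);
  return match?.[1]?.toLowerCase();
}

// Map a provider base URL to an endpoint class.
function resolveEndpointClass(baseUrl: string): EndpointClass {
  const host = hostOf(baseUrl);
  return host !== undefined && XIAOMI_HOST_PATTERN.test(host) ? "xiaomi-native" : "unknown";
}

// Derive the DeepSeek-style compat flags for a given endpoint class.
function compatForEndpoint(endpointClass: EndpointClass): CompatFlags {
  if (endpointClass === "xiaomi-native") {
    return {
      thinkingFormat: "deepseek",
      requiresReasoningContentOnAssistantMessages: true,
    };
  }
  return {};
}

const orbitCompat = compatForEndpoint(
  resolveEndpointClass("https://token-plan-ams.xiaomimimo.com/v1"),
);
console.log(orbitCompat);
```

Matching on the parsed hostname rather than on a substring of the URL keeps a path such as `https://example.com/token-plan-ams.xiaomimimo.com` from being classified as Xiaomi-native.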

What we found

Original bug

User reported FailoverError: LLM request failed: provider rejected the request schema or tool payload. (400 Param Incorrect) when running mimo-v2.5-pro on a custom xiaomi-orbit provider pointed at https://token-plan-ams.xiaomimimo.com/v1. The failure consistently fired on the second model turn — i.e., the one that includes a replayed assistant message + tool result.

Root cause: pi-ai's openai-completions message converter does not inject reasoning_content: "" on assistant messages for MiMo because OpenClaw never marked MiMo as a DeepSeek-style reasoning provider. The DeepSeek bundled plugin handles this via a wrapStreamFn that calls ensureDeepSeekV4AssistantReasoningContent on every outgoing payload. MiMo had no equivalent.
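The wrapper pattern described here can be sketched as a higher-order function that patches the outgoing payload before handing it to the real stream function. Everything below is illustrative: `withReasoningBackfill`, `RequestPayload`, and the simplified `StreamFn` type are hypothetical stand-ins, not the signatures of `wrapStreamFn` or `ensureDeepSeekV4AssistantReasoningContent`.

```typescript
// Hypothetical, simplified shapes; the real payload and stream types are richer.
interface ChatMessage {
  role: string;
  content: string | null;
  reasoning_content?: string;
}

interface RequestPayload {
  model: string;
  messages: ChatMessage[];
}

type StreamFn<T> = (payload: RequestPayload) => T;

// Wrap a stream function so every assistant message in the outgoing payload
// carries a reasoning_content field (blank when none was recorded).
function withReasoningBackfill<T>(inner: StreamFn<T>): StreamFn<T> {
  return (payload) =>
    inner({
      ...payload,
      messages: payload.messages.map((message) =>
        message.role === "assistant" && message.reasoning_content === undefined
          ? { ...message, reasoning_content: "" }
          : message,
      ),
    });
}

// Usage: a fake stream function that records what it was sent.
const sentPayloads: RequestPayload[] = [];
const wrappedStream = withReasoningBackfill((payload: RequestPayload) => {
  sentPayloads.push(payload);
  return "ok";
});

const streamResult = wrappedStream({
  model: "mimo-v2.5-pro",
  messages: [
    { role: "user", content: "Give me an overview of Esperanto." },
    { role: "assistant", content: null }, // replayed turn without reasoning_content
    { role: "tool", content: "search results" },
  ],
});
console.log(streamResult, sentPayloads[0]?.messages[1]);
```

The wrapper copies messages instead of mutating them, so the caller's conversation history is left untouched.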

Code review findings (addressed in this PR)

A pre-PR self-review surfaced four issues, all fixed before this PR was opened:

  • Speculative model entries removed. An earlier draft added mimo-v2.5-flash, mimo-v2.5, and mimo-v2.5-pro to the bundled catalog without confirming those IDs exist on Xiaomi's official api.xiaomimimo.com. The bundled manifest in this PR only ships the existing officially-documented models (mimo-v2-flash, mimo-v2-pro, mimo-v2-omni). The reasoning detection set in extensions/xiaomi/thinking.ts still recognizes the v2.5 IDs, so users who configure those via a custom provider (the bug-report case) still get correct reasoning treatment.
  • xiaomi-native added to isKnownNativeEndpoint in src/agents/provider-attribution.ts. Matches the DeepSeek/xAI/zAI pattern.
  • xiaomi-native added to isNonStandard in src/agents/openai-completions-compat.ts. Functionally consistent with the DeepSeek precedent.
  • Host-resolution coverage test added in src/agents/provider-attribution.test.ts to lock in the *.xiaomimimo.com → xiaomi-native mapping.

Live verification

After the first round of fixes (commits 1-7) the live test still failed: pi-ai does not actually read requiresReasoningContentOnAssistantMessages (the field appears only in pi-ai's static models.generated.js, not in any logic), and the bundled plugin's wrapStreamFn only fires when model.provider === "xiaomi" — not for the user's xiaomi-orbit custom provider. The core fallback in applyPostPluginStreamWrappers only matched DeepSeek V4 model IDs.

Commit 10 (64df7f12dd) added the MiMo matcher to the core fallback so the shared DeepSeek-style wrapper fires for MiMo reasoning IDs on any provider, including custom proxy providers. Live test then passed.
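The gap and the fallback can be illustrated with a small routing sketch. It is hypothetical throughout: `pickReasoningWrapper` and `ModelRef` are invented for illustration, and the ID set assumes the reasoning models are exactly the four IDs named in the summary.

```typescript
// Hypothetical sketch of which layer applies the shared wrapper; not OpenClaw's code.
interface ModelRef {
  provider: string;
  id: string;
}

type WrapperSource = "xiaomi-plugin" | "core-fallback" | "none";

// Assumption: the reasoning set is the four IDs listed in the PR summary.
const MIMO_REASONING_IDS: ReadonlySet<string> = new Set([
  "mimo-v2-pro",
  "mimo-v2-omni",
  "mimo-v2.5",
  "mimo-v2.5-pro",
]);

function pickReasoningWrapper(model: ModelRef): WrapperSource {
  if (!MIMO_REASONING_IDS.has(model.id.toLowerCase())) {
    return "none"; // not a MiMo reasoning model, no wrapper needed
  }
  // The plugin-owned hook only matches the bundled provider id.
  if (model.provider === "xiaomi") {
    return "xiaomi-plugin";
  }
  // Any other provider (for example a custom proxy) relies on the core matcher.
  return "core-fallback";
}

console.log(pickReasoningWrapper({ provider: "xiaomi-orbit", id: "mimo-v2.5-pro" }));
```

Before the core matcher existed, the last branch would effectively have been "none" for a custom provider, which is why the live test kept failing after the plugin-only changes.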

Tests

  • 13/13 pass in src/agents/openai-completions-compat.test.ts (5 new xiaomi cases)
  • 8/8 pass in new extensions/xiaomi/index.test.ts (auth, catalog, replay policy, thinking profile, isModernModelRef, reasoning_content injection across thinking-on / thinking-off / cross-provider replay)
  • New host-resolution assertions in src/agents/provider-attribution.test.ts
  • pnpm build clean
  • pnpm check:changed clean (lint, typecheck, import cycles)

Real Behavior Proof

Behavior addressed: Xiaomi MiMo reasoning models (mimo-v2.5-pro on a xiaomi-orbit custom provider pointed at https://token-plan-ams.xiaomimimo.com/v1) returning 400 Param Incorrect on any multi-turn / tool-call conversation. Reproduced before the fix; verified fixed after the fix.

Real environment tested: Linux desktop, OpenClaw gateway built from this branch (fix/mimo-reasoning-content), run via pnpm dev gateway --force on the same port (18789) the user normally uses, exercised through the live Telegram channel against a real MiMo backend with a real API key.

Exact steps or command run after this patch: Stopped the systemd-installed openclaw gateway, then started a dev gateway from this branch on the same port the user normally uses. Then, through the user's normal Telegram chat client, sent a message that selects xiaomi-orbit / mimo-v2.5-pro (reasoning enabled), prompts a web_search tool call, and requires a follow-up model turn after the tool result. Exact commands:

systemctl --user stop openclaw-gateway
OPENCLAW_GATEWAY_PORT=18789 OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS=1 pnpm dev gateway --force

Evidence after fix (relevant log lines, gateway log /tmp/openclaw/openclaw-2026-05-13.log):

2026-05-13T20:04:02.327-05:00 [gateway] log file: /tmp/openclaw/openclaw-2026-05-13.log
2026-05-13T20:04:02.560-05:00 [gateway] ready
2026-05-13T20:04:25.330-05:00 [ws] webchat connected
2026-05-13T20:04:56.414-05:00 [diagnostic] phase=channels.telegram.start-account
  work=[active=agent:main:main(processing/model_call,q=1,age=12s last=model_call:started) ...]
2026-05-13T20:05:15.546-05:00 [tools] web_search failed: Ollama web search authentication failed.
  raw_params={"query":"Esperanto language overview history","count":5,"freshness":"year"}
2026-05-13T20:05:52.507-05:00 [tools] web_search failed: Ollama web search authentication failed.
  raw_params={"query":"Esperanto language history overview facts","count":5,"language":"en"}

Pre-fix evidence (same gateway, same user, same provider, before the wrapper was wired):

2026-05-13T19:55:04.005-05:00 embedded_run_agent_end isError=true
  error="LLM request failed: provider rejected the request schema or tool payload."
  failoverReason=format model=mimo-v2.5-pro provider=xiaomi-orbit
  rawErrorPreview="400 Param Incorrect" providerRuntimeFailureKind=schema
2026-05-13T19:55:04.090-05:00 model_fallback_decision decision=candidate_failed
  requestedProvider=xiaomi-orbit requestedModel=mimo-v2.5-pro
  attempt=1 reason=format status=400 errorPreview="400 Param Incorrect"

Observed result after fix: The MiMo agent received the prompt, emitted a tool call (web_search for "Esperanto language overview history"), received the tool result, then continued the conversation and emitted a second tool call (web_search for "Esperanto language history overview facts"). That second tool call requires the model to receive the prior assistant message plus the tool result and reason about a new request, which is exactly the multi-turn scenario the bug broke. After the fix went live, the log contains no FailoverError, no 400 Param Incorrect, and no "provider rejected the request schema" message.

The web_search failed: Ollama web search authentication failed messages are an unrelated user-side configuration issue (Ollama needs ollama signin) — they show that the model did successfully emit valid tool-use payloads to the gateway and the gateway invoked the tool plugin.

What was not tested:

  • Bundled xiaomi provider against api.xiaomimimo.com directly. The user only currently has a xiaomi-orbit custom provider configured. The bundled-provider path is exercised by the new unit tests in extensions/xiaomi/index.test.ts (auth, catalog, thinking profile, reasoning_content injection), but no live API hit. The core fallback wrapper test (pi-embedded-runner-extraparams.test.ts style) is logically covered by the existing DeepSeek "unowned proxy" precedent.
  • TTS path unchanged and not re-verified live (the bundled xiaomi.live.test.ts for TTS is unmodified by this PR).
  • DeepSeek behavior: not retested live in this session; protected by existing tests and unchanged code paths.

AI assistance

This PR was developed with Claude Sonnet 4.6 acting as a coding agent over a single working session, driving edits, builds, and tests through Claude Code. After the initial implementation, a self-review pass (also via Claude Sonnet 4.6) identified the four issues listed under "Code review findings" above; all four were fixed before this PR was opened. The live verification step caught a remaining gap (the compat-flag plumbing alone does not fix the bug because pi-ai does not consume that field) which was then resolved by adding the core fallback wrapper.

The human author (Jim Dawdy) reviewed the plan and the final diffs, supplied live API credentials and gateway access for real-environment proof, ran the Telegram chat that produced the post-fix evidence above, and is the named contributor for this change. All blame falls on human shoulders.


clawsweeper Bot commented May 14, 2026

Contributor

Codex review: needs maintainer review before merge.

Summary
Adds Xiaomi MiMo DeepSeek-style reasoning replay handling through endpoint attribution, OpenAI-completions compat defaults, Xiaomi plugin thinking/replay hooks, a custom-provider fallback wrapper, and regression tests.

Reproducibility: yes. The PR body and linked issue provide a concrete MiMo custom-provider multi-turn tool-call path with pre-fix 400 logs, and source inspection shows current main lacks the Xiaomi plugin/fallback wrapper that would add reasoning_content.

Real behavior proof
Sufficient (logs): The PR body includes after-fix live gateway logs from a real MiMo backend showing multi-turn tool-call continuation, plus pre-fix 400 logs for the same scenario.

Next step before merge
This active contributor PR already contains the focused implementation and sufficient proof; the remaining action is maintainer review, required checks, and landing judgment rather than a repair job.

Security
Cleared: The diff changes provider classification, plugin runtime wrappers, manifest endpoint metadata, transport compatibility, and tests without adding dependencies, workflows, secret handling, downloads, or publishing surfaces.

Review details

Best possible solution:

Land this focused compatibility fix after maintainer review and required checks; keep the separate reasoning-only visible-output behavior in #60304 independent if it still applies.

Do we have a high-confidence way to reproduce the issue?

Yes. The PR body and linked issue provide a concrete MiMo custom-provider multi-turn tool-call path with pre-fix 400 logs, and source inspection shows current main lacks the Xiaomi plugin/fallback wrapper that would add reasoning_content.

Is this the best way to solve the issue?

Yes. Keeping Xiaomi-owned behavior in the Xiaomi plugin while adding a narrow unowned-proxy fallback matches the existing DeepSeek V4 architecture and avoids a user-facing config knob.

What I checked:

  • Current main Xiaomi plugin gap: Current main registers Xiaomi auth, catalog, usage, and speech behavior, but has no Xiaomi wrapStreamFn, thinking profile, modern-model hook, or OpenAI-compatible replay hooks for MiMo reasoning replay. (extensions/xiaomi/index.ts:29, 2156b204eaac)
  • Current main fallback only covers DeepSeek V4: Current main applies the shared reasoning_content wrapper only when the model matches DeepSeek V4 OpenAI-compatible IDs; MiMo IDs are not covered before this PR. (src/agents/pi-embedded-runner/extra-params.ts:698, 2156b204eaac)
  • Shared wrapper matches the requested payload repair: The existing shared wrapper backfills blank reasoning_content on replayed assistant messages when thinking is enabled and strips it when thinking is disabled, which is the same repair this PR reuses for MiMo. (src/plugin-sdk/provider-stream-shared.ts:184, 2156b204eaac)
  • PR wires Xiaomi plugin ownership and proxy fallback: The PR diff adds Xiaomi replay/thinking hooks, MiMo thinking helpers, xiaomi-native endpoint metadata, OpenAI-completions compat propagation, and a MiMo matcher in applyPostPluginStreamWrappers for custom/proxy providers. (extensions/xiaomi/index.ts:32, 181076e079aa)
  • Dependency contract check: The pi-ai 0.74.0 package's convertMessages implementation reads compat.requiresReasoningContentOnAssistantMessages and adds blank reasoning_content for reasoning models when absent, so propagating that compat flag is a real contract path, not a dead field. (package.json:1755, 2156b204eaac)
  • Upstream Xiaomi contract check: Xiaomi's public MiMo docs state that in thinking-mode multi-turn agent conversations with tool calls, prior assistant reasoning_content must be preserved or the API can return 400; their token-plan docs list the same regional hosts added by the PR.
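The thinking-on / thinking-off behavior attributed to the shared wrapper above can be sketched as a single normalization pass. This is an illustrative reading of that description, not the code in src/plugin-sdk/provider-stream-shared.ts; `normalizeReasoningForReplay` and `ReplayMessage` are hypothetical names.

```typescript
// Hypothetical, simplified message shape.
interface ReplayMessage {
  role: string;
  content: string | null;
  reasoning_content?: string;
}

// Thinking enabled: every assistant message gets reasoning_content (blank if missing).
// Thinking disabled: reasoning_content is stripped from assistant messages.
function normalizeReasoningForReplay(
  messages: ReplayMessage[],
  thinkingEnabled: boolean,
): ReplayMessage[] {
  return messages.map((message) => {
    if (message.role !== "assistant") {
      return message;
    }
    if (thinkingEnabled) {
      return { ...message, reasoning_content: message.reasoning_content ?? "" };
    }
    const stripped: ReplayMessage = { ...message };
    delete stripped.reasoning_content;
    return stripped;
  });
}

const replayHistory: ReplayMessage[] = [
  { role: "user", content: "hello" },
  { role: "assistant", content: "hi", reasoning_content: "greeting the user" },
  { role: "assistant", content: null },
];

const withThinking = normalizeReasoningForReplay(replayHistory, true);
const withoutThinking = normalizeReasoningForReplay(replayHistory, false);
console.log(withThinking, withoutThinking);
```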

Likely related people:

  • steipete: Recent GitHub history shows work on the Xiaomi plugin, provider/plugin SDK boundaries, DeepSeek V4 reasoning_content backfill, and the unowned proxy fallback pattern this PR mirrors. (role: feature-history owner and recent area contributor; confidence: high; commits: ec8dbc459558, 62997f7fcec1, b97cb15b078f; files: extensions/xiaomi/index.ts, extensions/xiaomi/openclaw.plugin.json, src/plugin-sdk/provider-stream-shared.ts)
  • vincentkoc: Recent provider-attribution commits touch the routing metadata path that this PR extends with the xiaomi-native endpoint class. (role: provider attribution contributor; confidence: medium; commits: f1340be05150, 4fbc490fcaee; files: src/agents/provider-attribution.ts)
  • sallyom: Recent merged work adjusted DeepSeek V4 reasoning compatibility behavior that is the precedent for the MiMo wrapper and thinking-level handling. (role: adjacent DeepSeek reasoning compatibility contributor; confidence: medium; commits: 02ac7dc5a62e; files: src/plugin-sdk/provider-stream-shared.ts, src/agents/pi-embedded-runner/extra-params.ts)

Remaining risk / open question:

  • I did not run tests in this read-only review; verification relies on source inspection, dependency/API contract checks, supplied live logs, and the contributor's reported CI/test results.
  • The PR body notes the bundled xiaomi provider against api.xiaomimimo.com was covered by unit tests but not re-verified live.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 2156b204eaac.

@openclaw-barnacle Bot added the proof: supplied label (External PR includes structured after-fix real behavior proof.) and removed the triage: needs-real-behavior-proof label (Candidate: external PR needs after-fix proof from a real setup.) May 14, 2026
@clawsweeper Bot added the proof: sufficient label (ClawSweeper judged the real behavior proof convincing.) May 14, 2026
@jimdawdy-hub

Contributor Author

check-lint failure is pre-existing on main, unrelated to this PR.

Per CONTRIBUTING.md ("Do not submit test or CI-config fixes for failures already red on main CI"), flagging rather than fixing.

The failing rule is typescript(no-redundant-type-constituents) at src/agents/models.profiles.live.test.ts:548:

function requireToolChoicePayload(payload: unknown): unknown | undefined {

unknown | undefined collapses to unknown, so the union is redundant.
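A small type-level check makes the collapse concrete. The `IsSame` helper and `requireObjectPayload` below are hypothetical illustrations; the second function shows one possible non-redundant return type and is not the fix that was applied upstream.

```typescript
// Type-level equality check (illustrative helper, not from the repository).
type IsSame<A, B> = [A] extends [B] ? ([B] extends [A] ? true : false) : false;

// Compiles only because `unknown | undefined` is the same type as `unknown`.
const unionCollapses: IsSame<unknown | undefined, unknown> = true;

// One way to avoid the redundant union: return a narrower type, so that
// `| undefined` actually adds information for callers.
function requireObjectPayload(payload: unknown): Record<string, unknown> | undefined {
  if (!payload || typeof payload !== "object" || Array.isArray(payload)) {
    return undefined;
  }
  return payload as Record<string, unknown>;
}

console.log(unionCollapses, requireObjectPayload({ tool_choice: "auto" }));
```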

That file is not touched by this PR. The line was introduced by upstream commit f3361dc928 test(agents): surface live OpenAI replay auth failures (last commit before opening this PR; visible in git log --oneline upstream/main -- src/agents/models.profiles.live.test.ts).

The same check-lint job is failing on the latest main CI run with the identical error.

Reproduced locally against upstream/main alone (with this PR's commits excluded):

git checkout upstream/main
node scripts/run-oxlint.mjs --tsconfig config/tsconfig/oxlint.core.json src ui packages
x typescript(no-redundant-type-constituents): 'unknown' overrides all other types in this union type.
   ,-[src/agents/models.profiles.live.test.ts:548:54]
547 |
548 | function requireToolChoicePayload(payload: unknown): unknown | undefined {
   :                                                      ^^^^^^^
549 |   if (!payload || typeof payload !== "object" || Array.isArray(payload)) {
   `----
Found 0 warnings and 1 error.

Local oxlint on this branch's changed files: 0 warnings, 0 errors.

@openclaw-barnacle Bot removed the proof: sufficient label May 14, 2026
@clawsweeper Bot added the proof: sufficient label May 14, 2026
@jimdawdy-hub force-pushed the fix/mimo-reasoning-content branch from 64df7f1 to 8c85b61 May 14, 2026 13:20
@openclaw-barnacle Bot removed the proof: sufficient label May 14, 2026
@clawsweeper Bot added the proof: sufficient label May 14, 2026
@jimdawdy-hub

Contributor Author

Lint failure resolved, all CI checks passed, ready for maintainer review. -JHD

@openclaw-barnacle Bot removed the proof: sufficient label May 14, 2026
@clawsweeper Bot added the proof: sufficient label May 14, 2026
@openclaw-barnacle Bot removed the proof: sufficient label May 15, 2026
@clawsweeper Bot added the proof: sufficient label May 15, 2026
@steipete force-pushed the fix/mimo-reasoning-content branch from 181076e to 99d0059 May 15, 2026 11:19
@openclaw-barnacle Bot removed the proof: sufficient label May 15, 2026
@steipete force-pushed the fix/mimo-reasoning-content branch from 99d0059 to 8b32a47 May 15, 2026 11:22
@steipete force-pushed the fix/mimo-reasoning-content branch from 8b32a47 to adad451 May 15, 2026 11:28
@steipete

Contributor

Thanks @jimdawdy-hub. I rebased this on current main after the #82101 OpenRouter/OpenAI-completions sanitizer landed and added one maintainer fixup before landing.

What changed:

Verification:

  • OPENCLAW_VITEST_MAX_WORKERS=1 node scripts/run-vitest.mjs src/agents/openai-transport-stream.test.ts src/agents/openai-completions-compat.test.ts extensions/xiaomi/index.test.ts src/agents/provider-attribution.test.ts src/agents/pi-embedded-runner-extraparams.test.ts — 7 files, 587 tests passed.
  • pnpm exec oxfmt --check CHANGELOG.md extensions/xiaomi/index.test.ts extensions/xiaomi/index.ts extensions/xiaomi/openclaw.plugin.json extensions/xiaomi/stream.ts extensions/xiaomi/thinking.ts src/agents/openai-completions-compat.test.ts src/agents/openai-completions-compat.ts src/agents/openai-transport-stream.ts src/agents/openai-transport-stream.test.ts src/agents/pi-embedded-runner/extra-params.ts src/agents/provider-attribution.test.ts src/agents/provider-attribution.ts — passed.
  • git diff --check — passed.
  • codex-review — clean, no accepted/actionable findings.
  • GitHub check suites on head adad4518f7bd6a217111c3ddf9689f95737d9756 — all 12 suites completed successfully.

Live behavior note: I did not re-run the Xiaomi endpoint locally here; I relied on the PR's supplied live proof showing the pre-fix second-turn 400 Param Incorrect and the post-fix second tool-call success.

@steipete steipete merged commit 2d83395 into openclaw:main May 15, 2026
113 checks passed
@jimdawdy-hub

Contributor Author

Built main from source at 3:30 PM CDT to confirm the fix. Logs:
Both tool calls completed and I'm still here — this is the multi-turn scenario that was failing before. Let me compile the confirmation log.


MiMo Reasoning Tool-Call Fix Verification Log

Date: 2026-05-15 15:41 CDT
Commit: 333f65fc8a (fix: tighten release tooling checks)
PR: #81589 (fix: preserve reasoning_content replay for MiMo proxies)
Gateway restart: 15:38:57 CDT
Session model: xiaomi-orbit/mimo-v2.5-pro (reasoning=medium)

Test 1: web_search tool call

Tool: web_search
Query: "OpenClaw AI agent framework 2026"
Result: Error (Ollama auth — unrelated to MiMo)
Multi-turn survival: ✅ Model received tool result, continued conversation
FailoverError / 400 Param Incorrect: ❌ NOT PRESENT (fix confirmed)

Test 2: exec tool call

Tool: exec
Command: echo "MiMo tool-call test 2: exec succeeded at $(date)"
Output: "MiMo tool-call test 2: exec succeeded at Fri May 15 03:41:57 PM CDT 2026"
Multi-turn survival: ✅ Model received tool result, continued conversation
FailoverError / 400 Param Incorrect: ❌ NOT PRESENT (fix confirmed)

Pre-fix behavior (from logs at 15:06-15:11 CDT):

2026-05-15T20:06:55.042Z  error  embedded_run_agent_end isError=true
  error="LLM request failed: provider rejected the request schema or tool payload."
  failoverReason=format model=mimo-v2.5-pro provider=xiaomi-orbit
  rawErrorPreview="400 Param Incorrect" providerRuntimeFailureKind=schema

Post-fix behavior (after gateway restart with PR #81589):

2026-05-15T15:41 CDT  model=xiaomi-orbit/mimo-v2.5-pro
  web_search tool call → tool invoked → model continued ✅
  exec tool call → tool invoked → model continued ✅
  No FailoverError, no 400 Param Incorrect

Conclusion: PR #81589 resolves #81419. MiMo reasoning models can now complete multi-turn tool-call conversations on custom providers (xiaomi-orbit) without 400 Param Incorrect schema rejections.


Labels

  • agents (Agent runtime and tooling)
  • extensions: xiaomi
  • proof: supplied (External PR includes structured after-fix real behavior proof.)
  • size: L


Development

Successfully merging this pull request may close these issues.

[Bug] Xiaomi MiMo reasoning_content passthrough missing for multi-turn tool calls

2 participants