opencode-go/kimi-k2.6 (and other openai-completions models): reasoning field leaks through passthrough replay policy, rejected as "Extra inputs are not permitted" on multi-turn #81988

@haumanto

Description

Summary

OpenClaw's opencode-go provider plugin replays the reasoning field on assistant messages back to openai-completions models that reject unknown fields. Multi-turn tool-using runs against opencode-go/kimi-k2.6 fail on the second turn with:

400 Error from provider: 9 request validation errors:
Extra inputs are not permitted, field: 'messages[15].reasoning',
value: 'The user is sending another bootstrap request. I already completed...'

This is distinct from #70392 / #71491 (which are about reasoning_content — with underscore — being required but missing on Moonshot's direct API). Here the issue is the wrong field name (reasoning — no underscore) being present and rejected via the opencode-go proxy path. The two paths use different replay hooks.

Environment

  • OpenClaw 2026.5.12 (commit f066dd2), installed via npm install -g
  • Provider plugin: opencode-go (proxies Moonshot's openai-completions API)
  • Model: opencode-go/kimi-k2.6, thinkingDefault: medium
  • Agent has Docker sandbox (sandbox.mode: all) and exercises tool calls

Reproduction

Configure a coder agent with model.primary: opencode-go/kimi-k2.6, thinkingDefault: medium, then run a two-turn task:

openclaw agent --agent coder --local --json --message \
  'In your sandbox at /workspace, create probe/hello.txt with contents "READY". Use the Write tool. Reply with exactly: DONE.'

The agent issues a Write tool call (turn 1, model produces thinking), receives the tool result, then attempts to generate the final assistant text (turn 2). The turn-2 request replays turn 1's history including the assistant thinking. The provider rejects with the 400 quoted above; the watchdog falls back to a different model and the task completes via the fallback.

Single-turn probes against the same primary (Reply with exactly: PONG, no tool call) succeed first try — the bug only manifests when an assistant turn with thinking is replayed.

Root cause (from dist/ inspection)

/usr/lib/node_modules/openclaw/dist/extensions/opencode-go/index.js (line ~86) installs PASSTHROUGH_GEMINI_REPLAY_HOOKS:

...PASSTHROUGH_GEMINI_REPLAY_HOOKS,

Those hooks resolve to the policy builder buildPassthroughGeminiSanitizingReplayPolicy in /usr/lib/node_modules/openclaw/dist/provider-model-shared-D-slKnZa.js (lines 109–119):

function buildPassthroughGeminiSanitizingReplayPolicy(modelId) {
  return {
    applyAssistantFirstOrderingFix: false,
    validateGeminiTurns: false,
    validateAnthropicTurns: false,
    ...normalizeLowercaseStringOrEmpty(modelId).includes("gemini") ? {
      sanitizeThoughtSignatures: { ... }
    } : {}
  };
  // No dropReasoningFromHistory
}

Compare this with the unowned-provider fallback, buildUnownedProviderTransportReplayFallback in dist/tool-result-middleware-ecm3p4oK.js, which does set dropReasoningFromHistory: true when modelApi === "openai-completions". The passthrough policy used by opencode-go does not. The result is an asymmetry: explicit-plugin providers that proxy openai-completions models leak reasoning on replay, while identical models behind the unowned-fallback path have it stripped.

OpenClaw stores the assistant thinking as { "type": "thinking", "thinking": "...", "thinkingSignature": "reasoning" }. On replay, a block with thinkingSignature: "reasoning" is serialized as a top-level reasoning field on the outgoing assistant message, which the openai-completions endpoint rejects as an unknown field.
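To make the leak concrete, here is a minimal sketch of the suspected serialization path. Everything in it is an assumption for illustration: serializeAssistantMessage, findExtraFields, and the ALLOWED_FIELDS list are hypothetical names, not OpenClaw or provider internals. Only the stored block shape is taken from this report.

```javascript
// Hypothetical sketch: these names are illustrative, not OpenClaw internals.

// Stored assistant turn, using the block shape quoted in this report.
const storedAssistantTurn = {
  role: "assistant",
  content: [
    {
      type: "thinking",
      thinking: "The user is sending another bootstrap request.",
      thinkingSignature: "reasoning",
    },
    { type: "text", text: "DONE" },
  ],
};

// Assumed serializer: the thinkingSignature value names the top-level field
// that carries the thinking text on the outgoing message.
function serializeAssistantMessage(turn) {
  const message = { role: turn.role, content: "" };
  for (const block of turn.content) {
    if (block.type === "text") message.content += block.text;
    if (block.type === "thinking" && block.thinkingSignature) {
      message[block.thinkingSignature] = block.thinking;
    }
  }
  return message;
}

// Assumed stand-in for the provider's strict schema check: any key outside
// the allow-list is reported by its path, as in the 400 response.
const ALLOWED_FIELDS = new Set(["role", "content", "tool_calls", "tool_call_id", "name"]);

function findExtraFields(messages) {
  const errors = [];
  messages.forEach((message, index) => {
    for (const key of Object.keys(message)) {
      if (!ALLOWED_FIELDS.has(key)) errors.push(`messages[${index}].${key}`);
    }
  });
  return errors;
}

const outgoing = [serializeAssistantMessage(storedAssistantTurn)];
console.log(findExtraFields(outgoing)); // reports the path messages[0].reasoning
```

If the real serializer behaves like this, any stored thinking block that survives the replay policy becomes an extra field on the wire, which matches the path format in the provider's error.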

Why single-turn doesn't repro

A single-turn request has no earlier assistant turn with thinking to replay, so no reasoning field enters the outgoing history and no error occurs.

Suggested fix

A one-line change in buildPassthroughGeminiSanitizingReplayPolicy would fix this: include dropReasoningFromHistory: true when the underlying model's API is openai-completions. The pattern is already established in buildUnownedProviderTransportReplayFallback. This keeps the Gemini-specific signature sanitization intact while preventing the reasoning leak on openai-completions paths.
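A rough sketch of what that change could look like follows. It is not the shipped code: the modelApi parameter is an assumption (the builder quoted in this report receives only modelId), applyReplayPolicy is a hypothetical consumer of the flag, and the empty sanitizeThoughtSignatures object is a placeholder for options that are elided in this report.

```javascript
// Hypothetical sketch: modelApi and applyReplayPolicy are assumptions for
// illustration; the shipped builder only receives modelId.
function buildPassthroughReplayPolicySketch(modelId, modelApi) {
  const isGemini = String(modelId ?? "").toLowerCase().includes("gemini");
  return {
    applyAssistantFirstOrderingFix: false,
    validateGeminiTurns: false,
    validateAnthropicTurns: false,
    // Placeholder: the real sanitization options are elided in this report.
    ...(isGemini ? { sanitizeThoughtSignatures: {} } : {}),
    // Proposed addition, mirroring the unowned-provider fallback.
    ...(modelApi === "openai-completions" ? { dropReasoningFromHistory: true } : {}),
  };
}

// Assumed consumer of the flag: drop thinking blocks from replayed
// assistant turns and leave every other turn untouched.
function applyReplayPolicy(history, policy) {
  if (!policy.dropReasoningFromHistory) return history;
  return history.map((turn) =>
    turn.role === "assistant" && Array.isArray(turn.content)
      ? { ...turn, content: turn.content.filter((block) => block.type !== "thinking") }
      : turn
  );
}

const history = [
  { role: "user", content: [{ type: "text", text: "Create probe/hello.txt" }] },
  {
    role: "assistant",
    content: [
      { type: "thinking", thinking: "...", thinkingSignature: "reasoning" },
      { type: "text", text: "DONE" },
    ],
  },
];

const policy = buildPassthroughReplayPolicySketch("kimi-k2.6", "openai-completions");
const replayed = applyReplayPolicy(history, policy);
console.log(replayed[1].content.map((block) => block.type));
```

Gating on the model API rather than the model ID leaves Gemini passthrough models on their current behavior, which is the same split the unowned-provider fallback already makes.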

Workaround

The 2026.5.12 LLM idle watchdog (#76877) catches the failure and rotates to a configured fallback model, so tasks complete. The cost is about 13 s per multi-turn call on the affected primary, plus misleading model fallback decision log entries that report fallbackStepFromFailureReason: model_not_found, even though the model exists and it is the request body that is invalid.

Labels

P2: Normal backlog priority with limited blast radius.
