Skip to content

Kimi Code thinking/tool replay fails: missing reasoning_content in assistant tool-call replay #91558

@liuhaiyang14

Description

@liuhaiyang14

Summary

When using Kimi Code through an OpenAI-compatible provider in OpenClaw, a normal single-turn response can succeed, but enabling thinking for an agent turn that uses tools fails during tool-call replay.

The provider rejects the follow-up request with:

400 thinking is enabled but reasoning_content is missing in assistant tool call message at index 2

This looks like the OpenAI-compatible replay path is not preserving or reconstructing the assistant reasoning_content required by Kimi Code when thinking is enabled and tool calls are replayed.

Environment

  • OpenClaw version: 2026.6.1
  • Gateway version: 2026.6.1
  • Provider id: moonshot-code
  • Model id: kimi-for-coding
  • API style: openai-completions
  • Base URL: https://api.kimi.com/coding/v1
  • Current workaround: keep Kimi Code thinking disabled and use it as a late fallback.

Current working model config workaround:

{
  "id": "kimi-for-coding",
  "name": "Kimi Code (for coding)",
  "api": "openai-completions",
  "reasoning": false,
  "compat": {
    "supportsReasoningEffort": false,
    "maxTokensField": "max_tokens"
  }
}

Agent model params workaround:

{
  "max_tokens": 65536,
  "thinking": false
}

Reproduction

  1. Configure Kimi Code as an OpenAI-compatible provider with reasoning enabled:
{
  "models": {
    "providers": {
      "moonshot-code": {
        "baseUrl": "https://api.kimi.com/coding/v1",
        "apiKey": "${KIMI_API_KEY}",
        "api": "openai-completions",
        "models": [
          {
            "id": "kimi-for-coding",
            "name": "Kimi Code (for coding)",
            "input": ["text", "image"],
            "reasoning": true,
            "contextWindow": 131072,
            "maxTokens": 65536,
            "compat": {
              "supportsReasoningEffort": true,
              "maxTokensField": "max_tokens"
            },
            "api": "openai-completions"
          }
        ]
      }
    }
  }
}
  1. Run a simple thinking-enabled turn:
openclaw agent \
  --model moonshot-code/kimi-for-coding \
  --thinking high \
  --session-key agent:codex:kimi-thinking-simple \
  --message 'Test Kimi Code thinking high. Reply only KIMI_THINK_SIMPLE_OK_9.' \
  --json

This succeeds.

  1. Run a thinking-enabled tool turn:
openclaw agent \
  --model moonshot-code/kimi-for-coding \
  --thinking high \
  --session-key agent:codex:kimi-thinking-tool \
  --message 'You must call exec to run: printf KIMI_TOOL_OK. Then reply only FINAL_KIMI_TOOL_OK.' \
  --json

Actual result

The run fails with:

GatewayClientRequestError: FailoverError: LLM request failed: provider rejected the request schema or tool payload.

Gateway logs show the provider error:

400 thinking is enabled but reasoning_content is missing in assistant tool call message at index 2

Relevant diagnostic fields:

{
  "provider": "moonshot-code",
  "model": "kimi-for-coding",
  "failoverReason": "format",
  "providerRuntimeFailureKind": "schema",
  "rawErrorPreview": "400 thinking is enabled but reasoning_content is missing in assistant tool call message at index 2"
}

Expected result

OpenClaw should either:

  1. Preserve and replay Kimi Code reasoning_content correctly for assistant messages that contain tool calls, or
  2. Automatically disable thinking for Kimi Code tool-using sessions when the provider requires reasoning_content replay that OpenClaw cannot satisfy, or
  3. Expose a provider compat flag similar to existing reasoning replay controls so Kimi Code can be configured safely.

Workaround

Keep Kimi Code configured with thinking disabled and use it only as a fallback:

{
  "reasoning": false,
  "compat": {
    "supportsReasoningEffort": false,
    "maxTokensField": "max_tokens"
  }
}

Agent params:

{
  "max_tokens": 65536,
  "thinking": false
}

This avoids the replay failure, but prevents thinking-enabled Kimi Code tool use.

Additional notes

I tested Qwen3.7 Plus with OpenClaw compat.thinkingFormat: "qwen" under the same style of tool-call replay test. Qwen completed the tool turn and a follow-up turn successfully. So this appears to be specific to Kimi Code / Moonshot-style reasoning_content replay behavior rather than all OpenAI-compatible thinking providers.

Metadata

Metadata

Assignees

No one assigned

    Labels

    P2Normal backlog priority with limited blast radius.impact:auth-providerAuth, provider routing, model choice, or SecretRef resolution may break.impact:session-stateSession, memory, transcript, context, or agent state can drift or corrupt.

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions