fix(moonshot): preserve native Kimi tool_call IDs in openai-completions replay by Cierra0506 · Pull Request #70030 · openclaw/openclaw

Cierra0506 · 2026-04-22T06:42:50Z

Summary

Problem: Moonshot's bundled OpenAI-compatible replay policy strict-sanitizes tool_call IDs down to [a-zA-Z0-9], which rewrites Kimi K2.6's native IDs (functions.<name>:<index>, e.g., functions.read:0 → functionsread0). Kimi's serving layer then fails to match the mangled IDs back to the original tool definitions in multi-turn history.
Why it matters: All multi-turn agentic flows through Kimi K2.6 break after 2–3 tool-calling rounds, with finish_reason: "stop" returned instead of "tool_calls" ~80% of the time.
What changed: Added a sanitizeToolCallIds opt-out to the shared openai-compatible replay family helper (buildOpenAICompatibleReplayPolicy + buildProviderReplayFamilyHooks), and wired the Moonshot plugin to opt out. Default behavior for all other openai-compatible providers is unchanged.
What did NOT change (scope boundary): Not touching the generic ToolCallIdMode enum or sanitizeToolCallId() implementation. Not touching the kimi-coding plugin (its policy is minimal and runs on anthropic-messages, not openai-completions; the 2026-04-10 comment about it being "also affected" is not reproducible from the code and deserves its own repro before scope expansion). No @mariozechner/pi-ai patches.

Change Type (select all)

Bug fix
Refactor required for the fix

Scope (select all touched areas)

API / contracts
Integrations

(Additive opt-out on buildProviderReplayFamilyHooks / buildOpenAICompatibleReplayPolicy; Moonshot provider plugin wiring.)

Linked Issue/PR

Closes Strict tool_call ID sanitization breaks Kimi K2.5 multi-turn tool calling #62319
This PR fixes a bug or regression

Root Cause

Root cause: Moonshot's plugin entry spread ...OPENAI_COMPATIBLE_REPLAY_HOOKS, which unconditionally returns { sanitizeToolCallIds: true, toolCallIdMode: "strict" }. Kimi K2.6 returns IDs containing . and :, which are not in [a-zA-Z0-9], so strict sanitization drops them. The mangled ID is then sent back in conversation history; Kimi's serving layer can't match it, and emits text instead of a structured tool call.
Missing detection / guardrail: No family-level opt-out from ID sanitization existed for openai-compatible transports — the only per-provider dial was to re-implement the whole policy locally (as mistral does with "strict9"). That friction pushed Moonshot toward copying the default.
Contributing context (if known): OpenAI's own call_<uuid> and the generic alphanumeric tests pass through sanitization losslessly, so the bug only surfaces for providers whose native IDs contain non-alphanumeric structure. Kimi is the first such case in the bundled set.

Regression Test Plan

Coverage level that should have caught this:
- Unit test
Target test or file:
- src/plugins/provider-replay-helpers.test.ts — two new cases for buildOpenAICompatibleReplayPolicy(api, { sanitizeToolCallIds: false }) on both openai-completions and openai-responses.
- src/plugin-sdk/provider-model-shared.test.ts — new case for buildProviderReplayFamilyHooks({ family: "openai-compatible", sanitizeToolCallIds: false }).
- extensions/moonshot/index.test.ts — flipped the existing plugin-boundary assertion: Moonshot's policy must not carry sanitizeToolCallIds or toolCallIdMode on openai-completions, while keeping applyAssistantFirstOrderingFix/validateGeminiTurns/validateAnthropicTurns.
Scenario the test should lock in: with the Moonshot plugin registered against openai-completions, the resolved policy passes through tool_call IDs unchanged so native functions.<name>:<index> IDs survive round-trip.
Why this is the smallest reliable guardrail: the bug sits on a pure helper → family hook → plugin entry chain. Unit-level assertions at each of the three seams catch regressions without requiring full transcript-replay integration runs; the existing sanitize-session-history harness already owns the cross-cutting ID-rewriting contract.
Existing test that already covers this (if any): src/agents/pi-embedded-runner.openai-tool-id-preservation.test.ts owns the downstream replay behavior but is keyed on toolCallIdMode: "strict". With Moonshot opting out, those paths simply don't apply — no contradiction.

User-visible / Behavior Changes

Moonshot / Kimi K2.6 multi-turn tool calling over openai-completions now survives more than ~2 rounds.
No behavioral change for any other bundled or third-party provider on the openai-compatible family; the new option defaults to true (existing behavior) and must be explicitly opted out.

Diagram

Before (strict sanitization, openai-compatible family):
  Kimi returns: functions.read:0
    -> strict sanitize [^a-zA-Z0-9] -> functionsread0
    -> sent back as tool_call_id in next turn
    -> Kimi serving layer fails to match, emits finish_reason: "stop"

After (Moonshot opts out of sanitization):
  Kimi returns: functions.read:0
    -> pass-through (no sanitize step)
    -> sent back as tool_call_id in next turn
    -> Kimi serving layer matches; finish_reason: "tool_calls"

Security Impact (required)

New permissions/capabilities? No
Secrets/tokens handling changed? No
New/changed network calls? No
Command/tool execution surface changed? No
Data access scope changed? No
If any Yes, explain risk + mitigation: N/A.

Repro + Verification

Environment

OS: macOS (Darwin 25.3.0, Apple Silicon)
Runtime/container: local Node 22 pnpm workspace checkout
Model/provider: Moonshot provider plugin with Kimi K2.6 on the openai-completions transport (issue reporter's CanopyWave endpoint; bug is endpoint-independent, governed by OpenClaw's own replay policy)
Integration/channel (if any): N/A (replay policy is transport-layer)
Relevant config (redacted): N/A

Steps

Register the Moonshot plugin (bundled).
Resolve its replay policy for { modelApi: "openai-completions", modelId: "kimi-k2.6" }.
Confirm the resolved policy carries applyAssistantFirstOrderingFix / validateGeminiTurns / validateAnthropicTurns but does not carry sanitizeToolCallIds or toolCallIdMode.
Confirm other openai-compatible providers (e.g., xai on openai-completions) continue to carry sanitizeToolCallIds: true and toolCallIdMode: "strict".

Expected

Moonshot policy passes native tool_call IDs through untouched.
All other openai-compatible provider defaults unchanged.

Actual

Matches expected. Three test suites lock this in at pure-helper, family-hook, and plugin-entry layers.

Evidence

Failing test/log before + passing after

Before the fix (on the new tests):

FAIL  extensions/moonshot/index.test.ts > moonshot provider plugin > owns replay policy for OpenAI-compatible Moonshot transports without mangling native Kimi tool_call IDs
AssertionError: expected { sanitizeToolCallIds: true, …(4) } to not have property "sanitizeToolCallIds"

After the fix:

Test Files  2 passed (2)
Tests       15 passed (15)   (unit-fast)

Test Files  1 passed (1)
Tests       2 passed (2)     (extension-providers)

Issue reporter's independent streaming-API measurements (pre-merge):

ID format	Pass rate (5 trials, 66K system prompt)
`functionsread0` (current strict)	1/5 (20%)
`functions.read:0` (native, as this PR preserves)	5/5 (100%)
`call_abc123def456` (OpenAI style)	5/5 (100%)

Human Verification (required)

What I personally verified (not just CI), and how:

Verified scenarios:
- Unit: buildOpenAICompatibleReplayPolicy("openai-completions", { sanitizeToolCallIds: false }) omits both sanitizeToolCallIds and toolCallIdMode and keeps the openai-completions-shaped fields.
- Unit: same for "openai-responses" (which has the non-openai-completions shape).
- Family hook: buildProviderReplayFamilyHooks({ family: "openai-compatible", sanitizeToolCallIds: false }) threads through to the helper.
- Plugin entry: Moonshot plugin resolves a policy that matches the opted-out shape.
- Ran pnpm build — bundled plugin dist emits cleanly, no [INEFFECTIVE_DYNAMIC_IMPORT] warnings.
- Ran pnpm plugin-sdk:api:check — no baseline drift (the added optional field lives on a module-internal type).
- Ran pnpm check:changed — typecheck (core + core tests + extensions + extension tests), lint (core + extensions), import cycles, webhook/pairing guards all green.
Edge cases checked:
- Default behavior unchanged when no option passed (existing tests locked in).
- Opt-out works independently of the openai-completions vs openai-responses branch (both covered by unit tests).
- Other bundled openai-compatible providers (xAI, etc.) untouched because they continue to spread OPENAI_COMPATIBLE_REPLAY_HOOKS without the opt-out.
What I did not verify:
- Live round-trip against a real Kimi K2.6 endpoint from this checkout. The issue reporter's pre-fix measurements (patched locally) already cover the empirical side; this PR is the structural delivery of that fix.
- kimi-coding plugin behavior. The 2026-04-10 comment flagging it as "also affected" is not consistent with its code (api: "anthropic-messages", minimal KIMI_REPLAY_POLICY, default sanitizeToolCallIds: false merge). Left for a separate issue/PR with a fresh repro.
Unrelated CI signal: pnpm check:changed reports 2 failures in src/agents/tools/web-fetch.provider-fallback.test.ts (SSRF guard rejects DNS that resolves to a private IP in my local network environment). These fail identically on a stashed clean upstream/main — not introduced by this PR. All guards relevant to this diff are green.

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

Backward compatible? Yes (sanitizeToolCallIds defaults to true; OPENAI_COMPATIBLE_REPLAY_HOOKS unchanged; only Moonshot opts out).
Config/env changes? No
Migration needed? No
If yes, exact upgrade steps: N/A.

Risks and Mitigations

Risk: A future bundled openai-compatible provider appears whose ID format does need sanitization and accidentally copies the Moonshot pattern.
- Mitigation: The opt-out is explicit (sanitizeToolCallIds: false); defaults still sanitize. The inline comment at extensions/moonshot/index.ts documents the rationale so future readers understand the opt-out is Kimi-specific.
Risk: A third-party Moonshot-compatible gateway returns tool_call IDs that a different endpoint would reject.
- Mitigation: OpenAI and Moonshot API docs both accept arbitrary string IDs on tool_call_id; Kimi's native format is already valid wire-level and works across all three formats tested in the issue.

greptile-apps · 2026-04-22T06:44:06Z

Greptile Summary

This PR adds a sanitizeToolCallIds opt-out to buildOpenAICompatibleReplayPolicy and buildProviderReplayFamilyHooks, then wires Moonshot to opt out, so that Kimi K2+'s native functions.<name>:<index> tool_call ID format is passed through untouched in multi-turn replay. The fix is additive and fully backward-compatible: all other openai-compatible providers continue to use the default sanitizeToolCallIds: true behavior via the unchanged OPENAI_COMPATIBLE_REPLAY_HOOKS constant. Implementation is clean, tests cover all three seams (helper → family hook → plugin entry), and the inline comment in extensions/moonshot/index.ts clearly documents the rationale.

Confidence Score: 5/5

Safe to merge — additive opt-out with correct default, well-tested at three seams, no behavioral change for any other provider.

The change is minimal and strictly additive. The ?? true fallback in buildOpenAICompatibleReplayPolicy preserves the existing behavior for all callers that don't pass the new option. Moonshot is the only bundled provider opting out, and the three-layer unit test coverage locks in the contract. No security, data-loss, or runtime-error risk identified.

No files require special attention.

_{Reviews (1): Last reviewed commit: "fix(moonshot): preserve native Kimi tool..." | Re-trigger Greptile}

…ns replay

steipete · 2026-04-23T00:53:24Z

Landed via rebase merge. Thanks @LeoDu0314.

Validation:

Local full gate: pnpm build && pnpm check && pnpm check:test-types && pnpm test
Local follow-up for updated base: pnpm test:contracts:plugins
CI: fresh rebased branch green before merge
Live review smoke: moonshot/kimi-k2.6 via .profile passed direct tool-replay sanity (reasoning_content retained; native Kimi tool_call IDs replayed)

Landed SHAs:

c4dea58 fix(moonshot): preserve native Kimi tool_call IDs in openai-completions replay
23a4489 fix(xai): declare websocket runtime dependency

steipete · 2026-04-23T00:54:27Z

Follow-up issue closure:

Strict tool_call ID sanitization breaks Kimi K2.5 multi-turn tool calling #62319 was auto-closed by this PR and now has the landed SHA + validation note.
[Bug] Moonshot K2.6 multi-turn tool calls fail: reasoning_content stripped during replay #70392 was closed after live moonshot/kimi-k2.6 replay verification on current main; the probe retained reasoning_content across the assistant tool-call replay and produced the next structured tool call without the reported 400.

openclaw-barnacle Bot added extensions: moonshot size: S labels Apr 22, 2026

steipete force-pushed the fix/moonshot-preserve-native-tool-call-ids branch from 2a96c52 to d4fffd4 Compare April 23, 2026 00:41

Cierra0506 and others added 2 commits April 23, 2026 01:44

fix(moonshot): preserve native Kimi tool_call IDs in openai-completio…

4063441

…ns replay

fix: declare xai realtime ws dependency (openclaw#70030)

36037fc

steipete force-pushed the fix/moonshot-preserve-native-tool-call-ids branch from d4fffd4 to 36037fc Compare April 23, 2026 00:50

steipete merged commit c4dea58 into openclaw:main Apr 23, 2026
73 checks passed

This was referenced Apr 23, 2026

Strict tool_call ID sanitization breaks Kimi K2.5 multi-turn tool calling #62319

Closed

[Bug] Moonshot K2.6 multi-turn tool calls fail: reasoning_content stripped during replay #70392

Closed

perlowja mentioned this pull request Apr 23, 2026

[Bug] Agent tool-use regressed for all providers between v2026.4.20 and main — 0/6 models invoke tools on a cold-start prompt; multiple model families converge on fabricated "Exec is blocked on this session" phrase #70456

Closed

y9c mentioned this pull request Apr 24, 2026

Bug: Kimi Code model enters infinite tool call loop #71273

Open

RoseKongPS mentioned this pull request Apr 25, 2026

Kimi K2.6 reasoning_content 400 regression in long conversations after LCM compaction (follow-up #70392) #71491

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(moonshot): preserve native Kimi tool_call IDs in openai-completions replay#70030

fix(moonshot): preserve native Kimi tool_call IDs in openai-completions replay#70030
steipete merged 2 commits into
openclaw:mainfrom
Cierra0506:fix/moonshot-preserve-native-tool-call-ids

Cierra0506 commented Apr 22, 2026

Uh oh!

greptile-apps Bot commented Apr 22, 2026

Uh oh!

Uh oh!

steipete commented Apr 23, 2026

Uh oh!

steipete commented Apr 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants