fix(anthropic): broaden Kimi thinking-suppression to custom endpoints by teknium1 · Pull Request #17455 · NousResearch/hermes-agent

teknium1 · 2026-04-29T13:26:34Z

Summary

Kimi-compatible custom endpoints + api_mode: anthropic_messages + thinking no longer fail with HTTP 400 after a tool call. The gate that drops Anthropic's thinking kwarg for Kimi was matched on https://api.kimi.com/coding only; users with a private gateway fronting Kimi (or an official Moonshot host) fell through to the generic third-party path, which strips thinking blocks AND still sends thinking={enabled,...}. Upstream then rejects the replay with reasoning_content is missing in assistant tool call message at index N.

Changes

agent/anthropic_adapter.py: new _is_kimi_family_endpoint(base_url, model) covering api.kimi.com/coding* URLs, host matches on api.kimi.com / moonshot.ai / moonshot.cn, and Kimi/Moonshot family model names (kimi-, moonshot-, k1., k2., k25, k2.5). Strips vendor prefix so moonshotai/kimi-k2.5 is recognised the same as kimi-k2.5.
Both gate sites now use the broader helper — thinking-kwarg suppression in build_anthropic_kwargs, and unsigned-thinking preservation in convert_messages_to_anthropic.
convert_messages_to_anthropic grows an optional model param so the custom-endpoint Kimi branch gets the signal it needs.
build_anthropic_client UA-header check stays URL-only — claude-code/0.1.0 is an official-Kimi-only contract.
tests/agent/test_kimi_coding_anthropic_thinking.py: custom-endpoint × {kimi-2.6, kimi-k2.5, moonshot-v1-*, kimi_thinking, vendor-prefixed} matrix; negative MiniMax test; replay test confirming unsigned reasoning_content→thinking survives the third-party strip on custom Kimi hosts. Retargeted the stale test_kimi_root_endpoint_unaffected whose "we should never see it" premise didn't survive user-configurable api_mode.

Validation

	Before	After
custom host + kimi-2.6 + thinking	400 on turn 2	thinking dropped, replay OK
api.moonshot.ai/anthropic + moonshot-v1-*	thinking sent (would 400)	thinking dropped
MiniMax on custom /anthropic	thinking sent	thinking sent (unchanged)
api.kimi.com/coding (regression)	thinking dropped	thinking dropped (unchanged)

tests/agent/test_kimi_coding_anthropic_thinking.py: 17/17 passing (7 new)
tests/agent/ -k "anthropic or kimi or moonshot or thinking": 349/349 passing
tests/run_agent/ -k "kimi or moonshot or reasoning or thinking": 144 passed, 1 skipped
E2E: reporter's exact config (base_url=http://custom-endpoint.example.com, model=kimi-2.6, reasoning_effort=medium) now produces outbound kwargs with thinking absent.

Root cause (one line)

_is_kimi_coding_endpoint was hostname-only and hardcoded to Kimi's /coding URL; any other host speaking Anthropic Messages to a Kimi-family model hit the generic third-party path, which is incompatible with Kimi's reasoning_content semantics.

The guard that drops Anthropic's `thinking` kwarg for Kimi endpoints was matched on `https://api.kimi.com/coding` only. Users configuring a custom Kimi-compatible gateway (or an official Moonshot host) with `api_mode: anthropic_messages` fall through to the generic third-party path, which strips thinking blocks AND still sends `thinking={enabled,...}` → upstream rejects with HTTP 400 "reasoning_content is missing in assistant tool call message at index N" on the next request after a tool call. Replace `_is_kimi_coding_endpoint` callers (history replay + thinking kwarg gate) with `_is_kimi_family_endpoint(base_url, model)` that also matches the `api.kimi.com` / `moonshot.ai` / `moonshot.cn` hosts and Kimi/Moonshot family model names (`kimi-`, `moonshot-`, `k1.`, `k2.`, …) for custom / proxied endpoints. Keeps the UA-header check in `build_anthropic_client` URL-only — the `claude-code/0.1.0` header is an official-Kimi contract. Plumbs optional `model` through `convert_messages_to_anthropic` so the unsigned reasoning_content→thinking block synthesised for Kimi's history validation survives the third-party signature-stripping pass on custom hosts too. Closes #17057.

…NousResearch#17455) The guard that drops Anthropic's `thinking` kwarg for Kimi endpoints was matched on `https://api.kimi.com/coding` only. Users configuring a custom Kimi-compatible gateway (or an official Moonshot host) with `api_mode: anthropic_messages` fall through to the generic third-party path, which strips thinking blocks AND still sends `thinking={enabled,...}` → upstream rejects with HTTP 400 "reasoning_content is missing in assistant tool call message at index N" on the next request after a tool call. Replace `_is_kimi_coding_endpoint` callers (history replay + thinking kwarg gate) with `_is_kimi_family_endpoint(base_url, model)` that also matches the `api.kimi.com` / `moonshot.ai` / `moonshot.cn` hosts and Kimi/Moonshot family model names (`kimi-`, `moonshot-`, `k1.`, `k2.`, …) for custom / proxied endpoints. Keeps the UA-header check in `build_anthropic_client` URL-only — the `claude-code/0.1.0` header is an official-Kimi contract. Plumbs optional `model` through `convert_messages_to_anthropic` so the unsigned reasoning_content→thinking block synthesised for Kimi's history validation survives the third-party signature-stripping pass on custom hosts too. Closes NousResearch#17057.

teknium1 merged commit 83c288d into main Apr 29, 2026
11 of 12 checks passed

teknium1 deleted the hermes/hermes-46931fe7 branch April 29, 2026 13:35

teknium1 mentioned this pull request Apr 29, 2026

Custom Kimi-compatible endpoint with api_mode=anthropic_messages fails after tool call when thinking is enabled #17057

Closed

alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/agent Core agent loop, run_agent.py, prompt builder provider/kimi Kimi / Moonshot labels Apr 29, 2026

briandevans mentioned this pull request Apr 29, 2026

fix(anthropic): send thinking=disabled explicitly on third-party endpoints (#15700) #15712

Closed

6 tasks

Leihb mentioned this pull request Apr 30, 2026

fix(anthropic_adapter): preserve Kimi thinking blocks and strip signatures to fix intermittent 400s #17210

Closed

This was referenced Jun 4, 2026

[Bug]: step-3.7-flash gets Anthropic thinking + temperature=1 forced in anthropic_messages mode, breaking tool use #39124

Open

fix(anthropic): suppress thinking + forced temperature for step-3.7-flash #39131

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(anthropic): broaden Kimi thinking-suppression to custom endpoints#17455

fix(anthropic): broaden Kimi thinking-suppression to custom endpoints#17455
teknium1 merged 1 commit into
mainfrom
hermes/hermes-46931fe7

teknium1 commented Apr 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

teknium1 commented Apr 29, 2026

Summary

Changes

Validation

Root cause (one line)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants