Skip to content

fix(anthropic): broaden Kimi thinking-suppression to custom endpoints#17455

Merged
teknium1 merged 1 commit into
mainfrom
hermes/hermes-46931fe7
Apr 29, 2026
Merged

fix(anthropic): broaden Kimi thinking-suppression to custom endpoints#17455
teknium1 merged 1 commit into
mainfrom
hermes/hermes-46931fe7

Conversation

@teknium1

Copy link
Copy Markdown
Contributor

Closes #17057.

Summary

Kimi-compatible custom endpoints + api_mode: anthropic_messages + thinking no longer fail with HTTP 400 after a tool call. The gate that drops Anthropic's thinking kwarg for Kimi was matched on https://api.kimi.com/coding only; users with a private gateway fronting Kimi (or an official Moonshot host) fell through to the generic third-party path, which strips thinking blocks AND still sends thinking={enabled,...}. Upstream then rejects the replay with reasoning_content is missing in assistant tool call message at index N.

Changes

  • agent/anthropic_adapter.py: new _is_kimi_family_endpoint(base_url, model) covering api.kimi.com/coding* URLs, host matches on api.kimi.com / moonshot.ai / moonshot.cn, and Kimi/Moonshot family model names (kimi-, moonshot-, k1., k2., k25, k2.5). Strips vendor prefix so moonshotai/kimi-k2.5 is recognised the same as kimi-k2.5.
  • Both gate sites now use the broader helper — thinking-kwarg suppression in build_anthropic_kwargs, and unsigned-thinking preservation in convert_messages_to_anthropic.
  • convert_messages_to_anthropic grows an optional model param so the custom-endpoint Kimi branch gets the signal it needs.
  • build_anthropic_client UA-header check stays URL-only — claude-code/0.1.0 is an official-Kimi-only contract.
  • tests/agent/test_kimi_coding_anthropic_thinking.py: custom-endpoint × {kimi-2.6, kimi-k2.5, moonshot-v1-*, kimi_thinking, vendor-prefixed} matrix; negative MiniMax test; replay test confirming unsigned reasoning_content→thinking survives the third-party strip on custom Kimi hosts. Retargeted the stale test_kimi_root_endpoint_unaffected whose "we should never see it" premise didn't survive user-configurable api_mode.

Validation

Before After
custom host + kimi-2.6 + thinking 400 on turn 2 thinking dropped, replay OK
api.moonshot.ai/anthropic + moonshot-v1-* thinking sent (would 400) thinking dropped
MiniMax on custom /anthropic thinking sent thinking sent (unchanged)
api.kimi.com/coding (regression) thinking dropped thinking dropped (unchanged)
  • tests/agent/test_kimi_coding_anthropic_thinking.py: 17/17 passing (7 new)
  • tests/agent/ -k "anthropic or kimi or moonshot or thinking": 349/349 passing
  • tests/run_agent/ -k "kimi or moonshot or reasoning or thinking": 144 passed, 1 skipped
  • E2E: reporter's exact config (base_url=http://custom-endpoint.example.com, model=kimi-2.6, reasoning_effort=medium) now produces outbound kwargs with thinking absent.

Root cause (one line)

_is_kimi_coding_endpoint was hostname-only and hardcoded to Kimi's /coding URL; any other host speaking Anthropic Messages to a Kimi-family model hit the generic third-party path, which is incompatible with Kimi's reasoning_content semantics.

The guard that drops Anthropic's `thinking` kwarg for Kimi endpoints was
matched on `https://api.kimi.com/coding` only.  Users configuring a
custom Kimi-compatible gateway (or an official Moonshot host) with
`api_mode: anthropic_messages` fall through to the generic third-party
path, which strips thinking blocks AND still sends
`thinking={enabled,...}` → upstream rejects with HTTP 400
"reasoning_content is missing in assistant tool call message at index N"
on the next request after a tool call.

Replace `_is_kimi_coding_endpoint` callers (history replay + thinking
kwarg gate) with `_is_kimi_family_endpoint(base_url, model)` that also
matches the `api.kimi.com` / `moonshot.ai` / `moonshot.cn` hosts and
Kimi/Moonshot family model names (`kimi-`, `moonshot-`, `k1.`, `k2.`,
…) for custom / proxied endpoints.  Keeps the UA-header check in
`build_anthropic_client` URL-only — the `claude-code/0.1.0` header is
an official-Kimi contract.

Plumbs optional `model` through `convert_messages_to_anthropic` so
the unsigned reasoning_content→thinking block synthesised for Kimi's
history validation survives the third-party signature-stripping pass
on custom hosts too.

Closes #17057.
@teknium1 teknium1 merged commit 83c288d into main Apr 29, 2026
11 of 12 checks passed
@teknium1 teknium1 deleted the hermes/hermes-46931fe7 branch April 29, 2026 13:35
@alt-glitch alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/agent Core agent loop, run_agent.py, prompt builder provider/kimi Kimi / Moonshot labels Apr 29, 2026
donald131 pushed a commit to donald131/hermes-agent that referenced this pull request May 2, 2026
…NousResearch#17455)

The guard that drops Anthropic's `thinking` kwarg for Kimi endpoints was
matched on `https://api.kimi.com/coding` only.  Users configuring a
custom Kimi-compatible gateway (or an official Moonshot host) with
`api_mode: anthropic_messages` fall through to the generic third-party
path, which strips thinking blocks AND still sends
`thinking={enabled,...}` → upstream rejects with HTTP 400
"reasoning_content is missing in assistant tool call message at index N"
on the next request after a tool call.

Replace `_is_kimi_coding_endpoint` callers (history replay + thinking
kwarg gate) with `_is_kimi_family_endpoint(base_url, model)` that also
matches the `api.kimi.com` / `moonshot.ai` / `moonshot.cn` hosts and
Kimi/Moonshot family model names (`kimi-`, `moonshot-`, `k1.`, `k2.`,
…) for custom / proxied endpoints.  Keeps the UA-header check in
`build_anthropic_client` URL-only — the `claude-code/0.1.0` header is
an official-Kimi contract.

Plumbs optional `model` through `convert_messages_to_anthropic` so
the unsigned reasoning_content→thinking block synthesised for Kimi's
history validation survives the third-party signature-stripping pass
on custom hosts too.

Closes NousResearch#17057.
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…NousResearch#17455)

The guard that drops Anthropic's `thinking` kwarg for Kimi endpoints was
matched on `https://api.kimi.com/coding` only.  Users configuring a
custom Kimi-compatible gateway (or an official Moonshot host) with
`api_mode: anthropic_messages` fall through to the generic third-party
path, which strips thinking blocks AND still sends
`thinking={enabled,...}` → upstream rejects with HTTP 400
"reasoning_content is missing in assistant tool call message at index N"
on the next request after a tool call.

Replace `_is_kimi_coding_endpoint` callers (history replay + thinking
kwarg gate) with `_is_kimi_family_endpoint(base_url, model)` that also
matches the `api.kimi.com` / `moonshot.ai` / `moonshot.cn` hosts and
Kimi/Moonshot family model names (`kimi-`, `moonshot-`, `k1.`, `k2.`,
…) for custom / proxied endpoints.  Keeps the UA-header check in
`build_anthropic_client` URL-only — the `claude-code/0.1.0` header is
an official-Kimi contract.

Plumbs optional `model` through `convert_messages_to_anthropic` so
the unsigned reasoning_content→thinking block synthesised for Kimi's
history validation survives the third-party signature-stripping pass
on custom hosts too.

Closes NousResearch#17057.
jsboige pushed a commit to jsboige/hermes-agent that referenced this pull request May 14, 2026
…NousResearch#17455)

The guard that drops Anthropic's `thinking` kwarg for Kimi endpoints was
matched on `https://api.kimi.com/coding` only.  Users configuring a
custom Kimi-compatible gateway (or an official Moonshot host) with
`api_mode: anthropic_messages` fall through to the generic third-party
path, which strips thinking blocks AND still sends
`thinking={enabled,...}` → upstream rejects with HTTP 400
"reasoning_content is missing in assistant tool call message at index N"
on the next request after a tool call.

Replace `_is_kimi_coding_endpoint` callers (history replay + thinking
kwarg gate) with `_is_kimi_family_endpoint(base_url, model)` that also
matches the `api.kimi.com` / `moonshot.ai` / `moonshot.cn` hosts and
Kimi/Moonshot family model names (`kimi-`, `moonshot-`, `k1.`, `k2.`,
…) for custom / proxied endpoints.  Keeps the UA-header check in
`build_anthropic_client` URL-only — the `claude-code/0.1.0` header is
an official-Kimi contract.

Plumbs optional `model` through `convert_messages_to_anthropic` so
the unsigned reasoning_content→thinking block synthesised for Kimi's
history validation survives the third-party signature-stripping pass
on custom hosts too.

Closes NousResearch#17057.
dannyJ848 pushed a commit to dannyJ848/hermes-agent that referenced this pull request May 17, 2026
…NousResearch#17455)

The guard that drops Anthropic's `thinking` kwarg for Kimi endpoints was
matched on `https://api.kimi.com/coding` only.  Users configuring a
custom Kimi-compatible gateway (or an official Moonshot host) with
`api_mode: anthropic_messages` fall through to the generic third-party
path, which strips thinking blocks AND still sends
`thinking={enabled,...}` → upstream rejects with HTTP 400
"reasoning_content is missing in assistant tool call message at index N"
on the next request after a tool call.

Replace `_is_kimi_coding_endpoint` callers (history replay + thinking
kwarg gate) with `_is_kimi_family_endpoint(base_url, model)` that also
matches the `api.kimi.com` / `moonshot.ai` / `moonshot.cn` hosts and
Kimi/Moonshot family model names (`kimi-`, `moonshot-`, `k1.`, `k2.`,
…) for custom / proxied endpoints.  Keeps the UA-header check in
`build_anthropic_client` URL-only — the `claude-code/0.1.0` header is
an official-Kimi contract.

Plumbs optional `model` through `convert_messages_to_anthropic` so
the unsigned reasoning_content→thinking block synthesised for Kimi's
history validation survives the third-party signature-stripping pass
on custom hosts too.

Closes NousResearch#17057.
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…NousResearch#17455)

The guard that drops Anthropic's `thinking` kwarg for Kimi endpoints was
matched on `https://api.kimi.com/coding` only.  Users configuring a
custom Kimi-compatible gateway (or an official Moonshot host) with
`api_mode: anthropic_messages` fall through to the generic third-party
path, which strips thinking blocks AND still sends
`thinking={enabled,...}` → upstream rejects with HTTP 400
"reasoning_content is missing in assistant tool call message at index N"
on the next request after a tool call.

Replace `_is_kimi_coding_endpoint` callers (history replay + thinking
kwarg gate) with `_is_kimi_family_endpoint(base_url, model)` that also
matches the `api.kimi.com` / `moonshot.ai` / `moonshot.cn` hosts and
Kimi/Moonshot family model names (`kimi-`, `moonshot-`, `k1.`, `k2.`,
…) for custom / proxied endpoints.  Keeps the UA-header check in
`build_anthropic_client` URL-only — the `claude-code/0.1.0` header is
an official-Kimi contract.

Plumbs optional `model` through `convert_messages_to_anthropic` so
the unsigned reasoning_content→thinking block synthesised for Kimi's
history validation survives the third-party signature-stripping pass
on custom hosts too.

Closes NousResearch#17057.
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…NousResearch#17455)

The guard that drops Anthropic's `thinking` kwarg for Kimi endpoints was
matched on `https://api.kimi.com/coding` only.  Users configuring a
custom Kimi-compatible gateway (or an official Moonshot host) with
`api_mode: anthropic_messages` fall through to the generic third-party
path, which strips thinking blocks AND still sends
`thinking={enabled,...}` → upstream rejects with HTTP 400
"reasoning_content is missing in assistant tool call message at index N"
on the next request after a tool call.

Replace `_is_kimi_coding_endpoint` callers (history replay + thinking
kwarg gate) with `_is_kimi_family_endpoint(base_url, model)` that also
matches the `api.kimi.com` / `moonshot.ai` / `moonshot.cn` hosts and
Kimi/Moonshot family model names (`kimi-`, `moonshot-`, `k1.`, `k2.`,
…) for custom / proxied endpoints.  Keeps the UA-header check in
`build_anthropic_client` URL-only — the `claude-code/0.1.0` header is
an official-Kimi contract.

Plumbs optional `model` through `convert_messages_to_anthropic` so
the unsigned reasoning_content→thinking block synthesised for Kimi's
history validation survives the third-party signature-stripping pass
on custom hosts too.

Closes NousResearch#17057.
Seven74AI pushed a commit to Seven74AI/hermes-agent that referenced this pull request Jun 13, 2026
…NousResearch#17455)

The guard that drops Anthropic's `thinking` kwarg for Kimi endpoints was
matched on `https://api.kimi.com/coding` only.  Users configuring a
custom Kimi-compatible gateway (or an official Moonshot host) with
`api_mode: anthropic_messages` fall through to the generic third-party
path, which strips thinking blocks AND still sends
`thinking={enabled,...}` → upstream rejects with HTTP 400
"reasoning_content is missing in assistant tool call message at index N"
on the next request after a tool call.

Replace `_is_kimi_coding_endpoint` callers (history replay + thinking
kwarg gate) with `_is_kimi_family_endpoint(base_url, model)` that also
matches the `api.kimi.com` / `moonshot.ai` / `moonshot.cn` hosts and
Kimi/Moonshot family model names (`kimi-`, `moonshot-`, `k1.`, `k2.`,
…) for custom / proxied endpoints.  Keeps the UA-header check in
`build_anthropic_client` URL-only — the `claude-code/0.1.0` header is
an official-Kimi contract.

Plumbs optional `model` through `convert_messages_to_anthropic` so
the unsigned reasoning_content→thinking block synthesised for Kimi's
history validation survives the third-party signature-stripping pass
on custom hosts too.

Closes NousResearch#17057.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P2 Medium — degraded but workaround exists provider/kimi Kimi / Moonshot type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Custom Kimi-compatible endpoint with api_mode=anthropic_messages fails after tool call when thinking is enabled

2 participants