
server : preserve anthropic thinking blocks in conversion #20120

Merged
ngxson merged 2 commits into ggml-org:master from
T0mSIlver:fix/20090-anthropic-thinking-conversion
Mar 6, 2026

Conversation

@T0mSIlver
Contributor

@T0mSIlver T0mSIlver commented Mar 5, 2026

Fix Anthropic /v1/messages conversion to preserve assistant thinking blocks as reasoning_content when converting to internal OpenAI-compatible chat messages.

Fixes #20090.
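
The behavior being fixed can be sketched in Python (the server implementation is C++; this is an illustrative model of what `convert_anthropic_to_oai` should do, not the actual patch):

```python
def convert_anthropic_to_oai(messages):
    """Convert Anthropic-style messages to OpenAI-compatible chat messages,
    carrying 'thinking' blocks over as reasoning_content (illustrative sketch)."""
    out = []
    for msg in messages:
        content = msg["content"]
        if isinstance(content, str):
            # plain-string content needs no block handling
            out.append({"role": msg["role"], "content": content})
            continue
        text_parts = []
        reasoning_parts = []
        for block in content:
            if block["type"] == "text":
                text_parts.append(block["text"])
            elif block["type"] == "thinking":
                # before the fix, these blocks were silently dropped
                reasoning_parts.append(block["thinking"])
        oai_msg = {"role": msg["role"], "content": "".join(text_parts)}
        if reasoning_parts:
            oai_msg["reasoning_content"] = "".join(reasoning_parts)
        out.append(oai_msg)
    return out

msgs = [{"role": "assistant", "content": [
    {"type": "thinking", "thinking": "Let me check the docs."},
    {"type": "text", "text": "Here is the answer."},
]}]
converted = convert_anthropic_to_oai(msgs)
```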

AI usage disclosure:

  • AI was used in an assistive role for code review suggestions, small implementation adjustments, and command execution support.
  • I manually reviewed the final patch and validation results.

@T0mSIlver T0mSIlver changed the title server : preserve anthropic thinking blocks in conversion (#20090) server : preserve anthropic thinking blocks in conversion Mar 5, 2026
@T0mSIlver T0mSIlver force-pushed the fix/20090-anthropic-thinking-conversion branch from e01f9e3 to 09b3429 Compare March 5, 2026 00:13
Collaborator

@ngxson ngxson left a comment

please add a test case with /apply-template to prove that this works

@T0mSIlver T0mSIlver force-pushed the fix/20090-anthropic-thinking-conversion branch from 2650560 to 330df34 Compare March 5, 2026 13:46
@T0mSIlver
Contributor Author

Added two test cases:

  • test_anthropic_thinking_history_in_count_tokens : sends Anthropic-format messages with interleaved thinking + tool use through /v1/messages/count_tokens and verifies thinking blocks increase the token count (i.e. they're not silently dropped by convert_anthropic_to_oai).
  • test_anthropic_thinking_history_in_template : uses /apply-template with the Qwen3 template to verify the converted reasoning_content renders inside <think> tags in the prompt.

Both use the Qwen3 template which natively handles reasoning_content.
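
The template-side check leans on Qwen3's ChatML-style rendering. A minimal local sketch of that behavior (the real template is Jinja embedded in the model file; this simplified renderer is an assumption for illustration):

```python
def render_qwen3_like(messages):
    """Rough model of how a Qwen3-style chat template renders messages:
    ChatML <|im_start|>/<|im_end|> framing, with assistant reasoning_content
    wrapped in <think>...</think> before the visible text."""
    parts = []
    for m in messages:
        body = m["content"]
        if m["role"] == "assistant" and m.get("reasoning_content"):
            body = f"<think>\n{m['reasoning_content']}\n</think>\n\n{body}"
        parts.append(f"<|im_start|>{m['role']}\n{body}<|im_end|>\n")
    return "".join(parts)

prompt = render_qwen3_like([{
    "role": "assistant",
    "content": "Final answer.",
    "reasoning_content": "Step-by-step reasoning.",
}])
```

If the conversion dropped the thinking blocks, no reasoning_content would reach the template and the `<think>` section would be absent from the rendered prompt, which is exactly what the test detects.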

@T0mSIlver T0mSIlver requested a review from ngxson March 5, 2026 13:48
@github-actions github-actions bot added the python python script changes label Mar 5, 2026
@florianbrede-ayet

@T0mSIlver just wanted to open a PR for the same issue when I came across yours. I've been struggling with the same problems with qwen35 models. The core fix is basically identical to mine; I compiled it against ROCm and it solves the issue with Claude Code.

On a side note, it also fixes the autoparser branch, where after some turns you'd previously get 500 {"error":{"code":500,"message":"Failed to parse input at pos 67: ","type":"server_error"}} (or failures at other low positions within malformed thinking blocks).

@ngxson ngxson merged commit e68f2fb into ggml-org:master Mar 6, 2026
73 of 81 checks passed

Labels

examples python python script changes server

Development

Successfully merging this pull request may close these issues.

Eval bug: Anthropic Messages API drops thinking content blocks during conversion

3 participants