Add reasoning capability detection and automatic polyfills#20
Draft
- Add supports_thinking flag to detect reasoning_content field support
- Add supports_disable_thinking, supports_reasoning_only, supports_reasoning_with_content flags
- Add reasoning_requires_tools flag for templates that only reason with tools
- Add tests for Qwen3-235B-A22B-Thinking-2507 and GLM-4.6
- Add model IDs: DeepSeek-V3.1, granite-3.3-2b-instruct, GLM-4.7

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
…cture

ThinkingPattern detection & polyfills:
- Add polyfill logic to transform reasoning_content to template's native format
- Support for THOUGHT_FIELD (MiniCPM3), THINKING_FIELD (GPT-OSS), TOOL_PLAN_FIELD (Command-R7B)
- Add CONTENT_BLOCK patterns (Ministral/Apertus) with improved detection
- Improved content block detection: reject stringified output by checking for structural markers
- Add supports_clear_thinking detection for templates like GLM-4.7

Test infrastructure:
- Add test metadata (_test_metadata) to context JSON files for template-independent validation
- Add expected_strings/forbidden_strings checks to test-supported-template.cpp
- Support conditional checks: expected_strings_if_supports_thinking, _system_role, _tool_calls, _tool_responses
- Add ThinkingPattern capability tests to test-capabilities.cpp

New reasoning test contexts:
- reasoning_only.json - basic reasoning content
- reasoning_multi_turn.json - multi-turn conversation with reasoning
- reasoning_position_based.json - position-based visibility
- reasoning_clear_thinking.json - clear_thinking flag behavior
- reasoning_with_tools.json - reasoning with tool calls
- reasoning_disabled.json - enable_thinking=false

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
0c9ae97 to c12caa0
Add the missing collapse_blank_lines function and regex include that was lost during the rebase conflict resolution. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The template is already in MODEL_IDS and gets downloaded to build/tests/ during cmake configure. No need to commit it separately. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
API renames for consistency:
- ThinkingPattern → ReasoningFormat
- REASONING_CONTENT_FIELD → REASONING_CONTENT
- thinking_pattern → reasoning_format
- supports_thinking → supports_reasoning
- supports_clear_thinking → supports_reasoning_visibility

New behavior detection probes (computed via template rendering):
- supports_reasoning_without_content: Can emit reasoning with empty content
- supports_reasoning_with_content: Can emit both reasoning and content
- respects_enable_reasoning: Template honors enable_thinking=false

Added tool_plan_reasoning.json test context for TOOL_PLAN_FIELD format.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The name directly matches the input flag (clear_thinking).
… tojson separators

- Rename `requires_typed_content` to `requires_typed_content_blocks` for clarity
- Rename ReasoningFormat enum values:
  - REASONING_CONTENT → REASONING_CONTENT_FIELD
  - CONTENT_BLOCK_THINKING → THINKING_CONTENT_BLOCK
  - CONTENT_BLOCK_THOUGHTS → THOUGHTS_CONTENT_BLOCK
- Add `tojson(separators=...)` support (used by Kimi K2 template)
- Add Kimi K2 (moonshotai/Kimi-K2-Instruct) to test suite
- Add capabilities tests for reasoning_requires_tools behavior
- Add stringification checks to test contexts

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The tools directory is optional and may not exist in all environments. Check for its existence before adding it as a subdirectory. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Command-R7B is excluded from MODEL_IDS on Windows due to known issues (google#40). The test-capabilities test for ToolPlanField_CommandR7B should also be skipped on Windows. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
7556a25 to 19141bd
Comprehensive documentation for:
- Capability detection (tools, reasoning, content formats)
- ReasoningFormat enum and detection priority
- Automatic polyfill system
- Usage examples in C++
- Integration guidance for llama.cpp

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
5026655 to 0d8d3f0
More consistent with the flag name (enable_thinking) and the naming pattern of other capability flags (supports_*). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
b36f5a1 to 3bf064d
Summary
This PR implements comprehensive reasoning capability detection and polyfills for chat templates. It enables automatic transformation of the canonical `reasoning_content` field to each template's native format, allowing applications to use a unified input format across all models.

Documentation: See docs/CAPABILITIES_AND_POLYFILLS.md for a detailed usage guide.
Key Features
1. ReasoningFormat Detection
Automatically detects which reasoning format a template supports:
| Format | Native representation |
| --- | --- |
| REASONING_CONTENT_FIELD | message.reasoning_content |
| THOUGHT_FIELD | message.thought |
| THINKING_FIELD | message.thinking |
| TOOL_PLAN_FIELD | message.tool_plan |
| THINKING_CONTENT_BLOCK | content[].type == "thinking" |
| THOUGHTS_CONTENT_BLOCK | content[].type == "thoughts" |

2. Automatic Polyfills
When input uses the canonical `reasoning_content` field but the template expects a different format, the library automatically transforms it.

3. New Capability Flags
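As a rough orientation, the reasoning-related flags named in this PR's commit messages could be grouped as in the sketch below. The flag names are taken from those commits; the `ReasoningCaps` struct and the `reasoning_will_render` helper are hypothetical and only illustrate how a caller might combine the flags, not how the library defines or uses them.

```cpp
// Hypothetical grouping of flags mentioned in the commits (not the library's struct).
struct ReasoningCaps {
    bool supports_reasoning = false;                 // template renders reasoning at all
    bool supports_reasoning_without_content = false; // reasoning with empty content
    bool supports_reasoning_with_content = false;    // reasoning together with content
    bool reasoning_requires_tools = false;           // template only reasons alongside tool calls
};

// Illustrative helper: should a message's reasoning be expected in the rendered prompt?
bool reasoning_will_render(const ReasoningCaps& caps, bool has_content, bool has_tool_calls) {
    if (!caps.supports_reasoning) return false;
    if (caps.reasoning_requires_tools && !has_tool_calls) return false;
    return has_content ? caps.supports_reasoning_with_content
                       : caps.supports_reasoning_without_content;
}
```

A caller could use a check like this to decide whether to pass reasoning through or fold it into the visible content before rendering.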
4. Polyfill Control
Individual polyfills can be enabled/disabled:
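A minimal sketch of what a switchable reasoning polyfill might look like, assuming messages are flat string maps rather than the JSON values the library actually works with. The format names and native field names come from the ReasoningFormat list in this description; `PolyfillOptions`, `polyfill_reasoning`, `native_field`, and `apply_reasoning_polyfill` are hypothetical names used only for illustration.

```cpp
#include <map>
#include <string>

// Simplified message model (assumption): field name -> string value.
using Message = std::map<std::string, std::string>;

// Field-based formats only; the content-block formats are omitted here.
enum class ReasoningFormat { NONE, REASONING_CONTENT_FIELD, THOUGHT_FIELD, THINKING_FIELD, TOOL_PLAN_FIELD };

// Hypothetical options struct with one switch per polyfill.
struct PolyfillOptions {
    bool polyfill_reasoning = true;
};

// Native field name for each field-based format.
std::string native_field(ReasoningFormat fmt) {
    switch (fmt) {
        case ReasoningFormat::REASONING_CONTENT_FIELD: return "reasoning_content";
        case ReasoningFormat::THOUGHT_FIELD:           return "thought";
        case ReasoningFormat::THINKING_FIELD:          return "thinking";
        case ReasoningFormat::TOOL_PLAN_FIELD:         return "tool_plan";
        case ReasoningFormat::NONE:                    break;
    }
    return "";
}

// Move canonical reasoning_content into the template's native field, if enabled.
Message apply_reasoning_polyfill(Message msg, ReasoningFormat fmt, const PolyfillOptions& opts) {
    const std::string target = native_field(fmt);
    auto it = msg.find("reasoning_content");
    if (!opts.polyfill_reasoning || it == msg.end() ||
        target.empty() || target == "reasoning_content") {
        return msg;  // nothing to do: disabled, no reasoning, or already native
    }
    std::string value = it->second;
    msg.erase(it);
    msg[target] = value;
    return msg;
}
```

With the switch off, the message is returned unchanged, which lets a caller handle the conversion itself.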
5. tojson Enhancements
Added `tojson(separators=(',', ':'))` support for compact JSON output (used by Kimi K2 template).

Models That Benefit
- MiniCPM3 (THOUGHT_FIELD): reasoning_content → thought
- Command-R7B (TOOL_PLAN_FIELD): reasoning_content → tool_plan

Test Infrastructure
Template-Independent Validation
Added `_test_metadata` in context JSON files for assertions that work across all templates:

    {
      "_test_metadata": {
        "expected_strings": ["always present"],
        "expected_strings_if_supports_reasoning": ["reasoning text"],
        "expected_strings_if_supports_tool_calls": ["tool name"],
        "forbidden_strings": ["[object Object]"]
      }
    }

New Test Contexts
- reasoning_only.json - Basic reasoning content
- reasoning_multi_turn.json - Multi-turn conversation with reasoning
- reasoning_position_based.json - Position-based visibility (Kimi K2)
- reasoning_clear_thinking.json - clear_thinking=false behavior
- reasoning_with_tools.json - Reasoning combined with tool calls
- reasoning_disabled.json - enable_thinking=false
- tool_plan_reasoning.json - TOOL_PLAN_FIELD pattern

llama.cpp Integration
This enables llama.cpp to:
- Transform `reasoning_content` to native formats

See integration: sync-minja-reasoning branch
Test Results
All 880+ tests pass across Ubuntu, Windows, and macOS.
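The `_test_metadata` assertions described under Test Infrastructure amount to substring checks on the rendered prompt, gated by detected capabilities. The sketch below shows one way such a check could work; `TestMetadata` and `check_rendered` are hypothetical names, and the actual logic in test-supported-template.cpp may differ.

```cpp
#include <string>
#include <vector>

// Hypothetical in-memory form of a context file's _test_metadata block.
struct TestMetadata {
    std::vector<std::string> expected_strings;
    std::vector<std::string> expected_strings_if_supports_reasoning;
    std::vector<std::string> forbidden_strings;
};

// Returns human-readable failures; an empty result means the output passed.
std::vector<std::string> check_rendered(const std::string& rendered,
                                        const TestMetadata& meta,
                                        bool supports_reasoning) {
    std::vector<std::string> failures;
    auto contains = [&](const std::string& s) {
        return rendered.find(s) != std::string::npos;
    };
    for (const auto& s : meta.expected_strings)
        if (!contains(s)) failures.push_back("missing: " + s);
    if (supports_reasoning)  // conditional expectations apply only when the template can reason
        for (const auto& s : meta.expected_strings_if_supports_reasoning)
            if (!contains(s)) failures.push_back("missing (reasoning): " + s);
    for (const auto& s : meta.forbidden_strings)
        if (contains(s)) failures.push_back("forbidden: " + s);
    return failures;
}
```

Keeping the expectations conditional on capabilities is what lets a single context file be reused across templates that do and do not render reasoning.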
Closes #19