Add reasoning capability detection and automatic polyfills by ochafik · Pull Request #20 · ochafik/minja

ochafik · 2025-12-23T12:54:06Z

Summary

This PR implements comprehensive reasoning capability detection and polyfills for chat templates. It enables automatic transformation of the canonical reasoning_content field to each template's native format, allowing applications to use a unified input format across all models.

Documentation: See docs/CAPABILITIES_AND_POLYFILLS.md for detailed usage guide.

Key Features

1. ReasoningFormat Detection

Automatically detects which reasoning format a template supports:

| Format | Field/Structure | Example Models |
| --- | --- | --- |
| `REASONING_CONTENT_FIELD` | `message.reasoning_content` | Qwen3, GLM-4.6/4.7 |
| `THOUGHT_FIELD` | `message.thought` | MiniCPM3 |
| `THINKING_FIELD` | `message.thinking` | GPT-OSS-120B |
| `TOOL_PLAN_FIELD` | `message.tool_plan` | Command-R7B (requires tool_calls) |
| `THINKING_CONTENT_BLOCK` | `content[].type == "thinking"` | Ministral, DeepSeek-R1 |
| `THOUGHTS_CONTENT_BLOCK` | `content[].type == "thoughts"` | Apertus, Kimi K2 |
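
For readers who want the table in code form, here is a minimal sketch of the format enum and a lookup of where each format keeps its reasoning text. The enum value names come from the table; the `enum class` scoping, the `NONE` ordering, and the helper `describe_reasoning_location` are assumptions made for illustration and may not match the actual declarations in minja.

```cpp
#include <string>

// Hypothetical mirror of the formats listed in the table above. The real
// declaration in minja may differ in scoping, order, or underlying type.
enum class ReasoningFormat {
    NONE,
    REASONING_CONTENT_FIELD,
    THOUGHT_FIELD,
    THINKING_FIELD,
    TOOL_PLAN_FIELD,
    THINKING_CONTENT_BLOCK,
    THOUGHTS_CONTENT_BLOCK,
};

// Illustrative helper (not part of the PR): returns the location where a
// template with the given format expects to find the reasoning text.
inline std::string describe_reasoning_location(ReasoningFormat format) {
    switch (format) {
        case ReasoningFormat::REASONING_CONTENT_FIELD: return "message.reasoning_content";
        case ReasoningFormat::THOUGHT_FIELD:           return "message.thought";
        case ReasoningFormat::THINKING_FIELD:          return "message.thinking";
        case ReasoningFormat::TOOL_PLAN_FIELD:         return "message.tool_plan";
        case ReasoningFormat::THINKING_CONTENT_BLOCK:  return "content[].type == \"thinking\"";
        case ReasoningFormat::THOUGHTS_CONTENT_BLOCK:  return "content[].type == \"thoughts\"";
        case ReasoningFormat::NONE:                    return "";
    }
    return "";  // Unreachable for valid enum values.
}
```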

2. Automatic Polyfills

When input uses the canonical reasoning_content field but the template expects a different format, the library automatically transforms it:

reasoning_content  →  thought                    (MiniCPM3)
reasoning_content  →  thinking                   (GPT-OSS-120B)
reasoning_content  →  tool_plan                  (Command-R7B, with tool_calls)
reasoning_content  →  content[{type:"thinking"}] (Ministral, DeepSeek-R1)
reasoning_content  →  content[{type:"thoughts"}] (Kimi K2, Apertus)
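
The mapping above can be sketched on a deliberately simplified message model. Everything here is an assumption for illustration: the `Message` and `ContentBlock` structs, the `text` payload key, the ordering of blocks, and the choice to leave the message untouched when `TOOL_PLAN_FIELD` is requested without tool calls. The library itself operates on JSON messages and may behave differently in these details.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical mirror of the formats described in this PR.
enum class ReasoningFormat {
    NONE, REASONING_CONTENT_FIELD, THOUGHT_FIELD, THINKING_FIELD,
    TOOL_PLAN_FIELD, THINKING_CONTENT_BLOCK, THOUGHTS_CONTENT_BLOCK,
};

// Simplified stand-ins for a JSON chat message (assumed shapes).
struct ContentBlock {
    std::string type;  // e.g. "thinking", "thoughts", "text"
    std::string text;  // payload; the real key name is an assumption
};

struct Message {
    std::map<std::string, std::string> fields;  // top-level string fields
    std::string content;                        // plain string content
    std::vector<ContentBlock> content_blocks;   // typed content blocks
    bool has_tool_calls = false;
};

// Moves the canonical reasoning_content field to the template's native slot.
inline Message polyfill_reasoning(Message msg, ReasoningFormat format) {
    auto it = msg.fields.find("reasoning_content");
    if (it == msg.fields.end()) return msg;  // nothing to convert

    std::string target_field;  // set for field-style formats
    std::string block_type;    // set for content-block formats
    switch (format) {
        case ReasoningFormat::THOUGHT_FIELD:  target_field = "thought";  break;
        case ReasoningFormat::THINKING_FIELD: target_field = "thinking"; break;
        case ReasoningFormat::TOOL_PLAN_FIELD:
            if (!msg.has_tool_calls) return msg;  // assumed: needs tool_calls
            target_field = "tool_plan";
            break;
        case ReasoningFormat::THINKING_CONTENT_BLOCK: block_type = "thinking"; break;
        case ReasoningFormat::THOUGHTS_CONTENT_BLOCK: block_type = "thoughts"; break;
        case ReasoningFormat::NONE:
        case ReasoningFormat::REASONING_CONTENT_FIELD:
            return msg;  // native or unsupported: leave the input as-is
    }

    const std::string reasoning = it->second;
    msg.fields.erase(it);
    if (!target_field.empty()) {
        msg.fields[target_field] = reasoning;
    } else {
        // Reasoning block first, then any existing string content as text.
        msg.content_blocks.push_back({block_type, reasoning});
        if (!msg.content.empty()) {
            msg.content_blocks.push_back({"text", msg.content});
            msg.content.clear();
        }
    }
    return msg;
}
```

Taking the message by value keeps the caller's canonical input unchanged, which suits the goal of one input format shared across all templates.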

3. New Capability Flags

struct chat_template_caps {
    // Reasoning capabilities
    bool supports_reasoning = false;
    ReasoningFormat reasoning_format = NONE;
    bool reasoning_requires_tools = false;        // Command-R7B pattern
    bool supports_clear_thinking = false;         // GLM-4.7 visibility control
    
    // Behavior detection
    bool supports_reasoning_without_content = false;
    bool supports_reasoning_with_content = false;
    bool respects_enable_reasoning = false;
    
    // Content format
    bool requires_typed_content_blocks = false;   // [{type:"text", text:...}]
};
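
As a rough illustration of how a caller might combine these flags, the sketch below decides whether reasoning attached to a message can be expected to show up in the rendered prompt. The struct is a trimmed local copy of the flags listed above, and `reasoning_will_render` is a hypothetical helper, not an API from this PR; the exact semantics of each flag in the library may be subtler.

```cpp
// Trimmed local copy of the capability flags shown above (assumed semantics).
struct chat_template_caps {
    bool supports_reasoning = false;
    bool reasoning_requires_tools = false;
    bool supports_reasoning_without_content = false;
    bool supports_reasoning_with_content = false;
};

// Hypothetical helper: can reasoning on this message be expected to render?
inline bool reasoning_will_render(const chat_template_caps& caps,
                                  bool has_tool_calls,
                                  bool has_content) {
    if (!caps.supports_reasoning) return false;
    // Command-R7B pattern: reasoning only appears alongside tool calls.
    if (caps.reasoning_requires_tools && !has_tool_calls) return false;
    return has_content ? caps.supports_reasoning_with_content
                       : caps.supports_reasoning_without_content;
}
```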

4. Polyfill Control

Individual polyfills can be enabled/disabled:

chat_template_options opts;
opts.apply_polyfills = true;           // Master switch
opts.polyfill_reasoning = true;        // reasoning_content conversion
opts.polyfill_typed_content = true;    // String → content blocks
// ... other polyfills
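
One plausible reading of the "master switch" comment is that an individual polyfill runs only when both it and `apply_polyfills` are enabled. The sketch below encodes that reading; the struct is a local copy of the three options shown above, and `should_polyfill_reasoning` is a hypothetical helper, so the library's actual gating logic may differ.

```cpp
// Local copy of the options shown above (only the fields named in this PR).
struct chat_template_options {
    bool apply_polyfills = true;         // master switch
    bool polyfill_reasoning = true;      // reasoning_content conversion
    bool polyfill_typed_content = true;  // string -> content blocks
};

// Assumed gating: the master switch must be on for any polyfill to apply.
inline bool should_polyfill_reasoning(const chat_template_options& opts) {
    return opts.apply_polyfills && opts.polyfill_reasoning;
}
```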

5. tojson Enhancements

Added tojson(separators=(',', ':')) support for compact JSON output (used by Kimi K2 template).
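
To show what the two separators control, here is a small standalone sketch that serializes a flat list of string pairs. It follows the convention of Python's `json.dumps`, where the default separators are `", "` and `": "` and the compact form is `","` and `":"`. The function `dump_flat_object` is hypothetical and unrelated to minja's implementation, and it does not escape special characters.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Illustrative only: serializes string key/value pairs as a JSON-like object.
// No escaping is performed, so inputs must not contain quotes or backslashes.
inline std::string dump_flat_object(
        const std::vector<std::pair<std::string, std::string>>& items,
        const std::string& item_separator = ", ",
        const std::string& key_separator = ": ") {
    std::string out = "{";
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i > 0) out += item_separator;  // between members
        out += "\"" + items[i].first + "\"" + key_separator +
               "\"" + items[i].second + "\"";
    }
    out += "}";
    return out;
}
```

Calling it with `","` and `":"` yields output with no whitespace between tokens, which is the compact form the Kimi K2 template asks for.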

Models That Benefit

| Model Family | Reasoning Format | Polyfill Applied |
| --- | --- | --- |
| Qwen3 | REASONING_CONTENT_FIELD | None (native) |
| GLM-4.6/4.7 | REASONING_CONTENT_FIELD | None (native) |
| MiniCPM3 | THOUGHT_FIELD | `reasoning_content` → `thought` |
| Command-R7B | TOOL_PLAN_FIELD | `reasoning_content` → `tool_plan` |
| DeepSeek-R1 | THINKING_CONTENT_BLOCK | → content blocks |
| Ministral | THINKING_CONTENT_BLOCK | → content blocks |
| Kimi K2 | THOUGHTS_CONTENT_BLOCK | → content blocks |

Test Infrastructure

Template-Independent Validation

Added _test_metadata in context JSON files for assertions that work across all templates:

{
  "_test_metadata": {
    "expected_strings": ["always present"],
    "expected_strings_if_supports_reasoning": ["reasoning text"],
    "expected_strings_if_supports_tool_calls": ["tool name"],
    "forbidden_strings": ["[object Object]"]
  }
}
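
A checker for this metadata could look roughly like the following. This is a sketch under assumptions: the `TestMetadata` struct mirrors only three of the keys above, `check_rendered_output` is a hypothetical name, and the real checks in test-supported-template.cpp read the values from the context JSON rather than from a struct.

```cpp
#include <string>
#include <vector>

// Assumed in-memory form of a subset of the _test_metadata keys.
struct TestMetadata {
    std::vector<std::string> expected_strings;
    std::vector<std::string> expected_strings_if_supports_reasoning;
    std::vector<std::string> forbidden_strings;
};

// Returns one human-readable failure per violated assertion (empty = pass).
inline std::vector<std::string> check_rendered_output(
        const std::string& output,
        const TestMetadata& meta,
        bool supports_reasoning) {
    std::vector<std::string> failures;
    auto contains = [&output](const std::string& needle) {
        return output.find(needle) != std::string::npos;
    };
    for (const auto& s : meta.expected_strings) {
        if (!contains(s)) failures.push_back("missing: " + s);
    }
    // Conditional expectations only apply when the capability was detected.
    if (supports_reasoning) {
        for (const auto& s : meta.expected_strings_if_supports_reasoning) {
            if (!contains(s)) failures.push_back("missing: " + s);
        }
    }
    for (const auto& s : meta.forbidden_strings) {
        if (contains(s)) failures.push_back("forbidden: " + s);
    }
    return failures;
}
```

Gating the conditional lists on detected capabilities is what lets one context file be reused across templates that do and do not render reasoning.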

New Test Contexts

reasoning_only.json - Basic reasoning content
reasoning_multi_turn.json - Multi-turn conversation with reasoning
reasoning_position_based.json - Position-based visibility (Kimi K2)
reasoning_clear_thinking.json - clear_thinking=false behavior
reasoning_with_tools.json - Reasoning combined with tool calls
reasoning_disabled.json - enable_thinking=false
tool_plan_reasoning.json - TOOL_PLAN_FIELD pattern

llama.cpp Integration

This enables llama.cpp to:

Detect reasoning format from model templates automatically
Apply polyfills to convert reasoning_content to native formats
Simplify parsers - output parsers only need to handle canonical format

See integration: sync-minja-reasoning branch

Test Results

All 880+ tests pass across Ubuntu, Windows, and macOS.

Closes #19

- Add supports_thinking flag to detect reasoning_content field support
- Add supports_disable_thinking, supports_reasoning_only, supports_reasoning_with_content flags
- Add reasoning_requires_tools flag for templates that only reason with tools
- Add tests for Qwen3-235B-A22B-Thinking-2507 and GLM-4.6
- Add model IDs: DeepSeek-V3.1, granite-3.3-2b-instruct, GLM-4.7

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

…cture

ThinkingPattern detection & polyfills:
- Add polyfill logic to transform reasoning_content to template's native format
- Support for THOUGHT_FIELD (MiniCPM3), THINKING_FIELD (GPT-OSS), TOOL_PLAN_FIELD (Command-R7B)
- Add CONTENT_BLOCK patterns (Ministral/Apertus) with improved detection
- Improved content block detection: reject stringified output by checking for structural markers
- Add supports_clear_thinking detection for templates like GLM-4.7

Test infrastructure:
- Add test metadata (_test_metadata) to context JSON files for template-independent validation
- Add expected_strings/forbidden_strings checks to test-supported-template.cpp
- Support conditional checks: expected_strings_if_supports_thinking, _system_role, _tool_calls, _tool_responses
- Add ThinkingPattern capability tests to test-capabilities.cpp

New reasoning test contexts:
- reasoning_only.json - basic reasoning content
- reasoning_multi_turn.json - multi-turn conversation with reasoning
- reasoning_position_based.json - position-based visibility
- reasoning_clear_thinking.json - clear_thinking flag behavior
- reasoning_with_tools.json - reasoning with tool calls
- reasoning_disabled.json - enable_thinking=false

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add the missing collapse_blank_lines function and regex include that was lost during the rebase conflict resolution. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The template is already in MODEL_IDS and gets downloaded to build/tests/ during cmake configure. No need to commit it separately. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

API renames for consistency:
- ThinkingPattern → ReasoningFormat
- REASONING_CONTENT_FIELD → REASONING_CONTENT
- thinking_pattern → reasoning_format
- supports_thinking → supports_reasoning
- supports_clear_thinking → supports_reasoning_visibility

New behavior detection probes (computed via template rendering):
- supports_reasoning_without_content: Can emit reasoning with empty content
- supports_reasoning_with_content: Can emit both reasoning and content
- respects_enable_reasoning: Template honors enable_thinking=false

Added tool_plan_reasoning.json test context for TOOL_PLAN_FIELD format.

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The name directly matches the input flag (clear_thinking).

… tojson separators

- Rename `requires_typed_content` to `requires_typed_content_blocks` for clarity
- Rename ReasoningFormat enum values:
  - REASONING_CONTENT → REASONING_CONTENT_FIELD
  - CONTENT_BLOCK_THINKING → THINKING_CONTENT_BLOCK
  - CONTENT_BLOCK_THOUGHTS → THOUGHTS_CONTENT_BLOCK
- Add `tojson(separators=...)` support (used by Kimi K2 template)
- Add Kimi K2 (moonshotai/Kimi-K2-Instruct) to test suite
- Add capabilities tests for reasoning_requires_tools behavior
- Add stringification checks to test contexts

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The tools directory is optional and may not exist in all environments. Check for its existence before adding it as a subdirectory. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Command-R7B is excluded from MODEL_IDS on Windows due to known issues (google#40). The test-capabilities test for ToolPlanField_CommandR7B should also be skipped on Windows. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Comprehensive documentation for:
- Capability detection (tools, reasoning, content formats)
- ReasoningFormat enum and detection priority
- Automatic polyfill system
- Usage examples in C++
- Integration guidance for llama.cpp

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

More consistent with the flag name (enable_thinking) and the naming pattern of other capability flags (supports_*). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

ochafik mentioned this pull request Dec 24, 2025

[WIP] tool-call: experimental migration of all parsers to peg-parser infra (w/ better test coverage) ggml-org/llama.cpp#18353

Draft

18 tasks

loci-dev mentioned this pull request Dec 24, 2025

UPSTREAM PR #18353: [WIP] tool-call: experimental migration of all parsers to peg-parser infra (w/ better test coverage) auroralabs-loci/llama.cpp#692

Open

1 task

ochafik and others added 2 commits December 29, 2025 22:24

ochafik force-pushed the issue-19-thinking-capabilities branch from 0c9ae97 to c12caa0 Compare December 29, 2025 22:28

ochafik and others added 7 commits December 29, 2025 22:40

Revert supports_reasoning_visibility → supports_clear_thinking

1e39bb3

The name directly matches the input flag (clear_thinking).

ochafik force-pushed the issue-19-thinking-capabilities branch from 7556a25 to 19141bd Compare December 31, 2025 14:10

ochafik changed the title ~~Add thinking/reasoning capability detection~~ Add reasoning capability detection and automatic polyfills Dec 31, 2025

ochafik force-pushed the issue-19-thinking-capabilities branch from 5026655 to 0d8d3f0 Compare December 31, 2025 15:19

ochafik force-pushed the issue-19-thinking-capabilities branch from b36f5a1 to 3bf064d Compare December 31, 2025 15:35

ochafik wants to merge 11 commits into main from issue-19-thinking-capabilities
