
Feature: Add reasoning parameter to chat API #45

Merged
SantiagoDePolonia merged 3 commits into main from claude/add-reasoning-parameter-rn9sz
Jan 26, 2026

Conversation


@SantiagoDePolonia SantiagoDePolonia commented Jan 22, 2026

Summary by CodeRabbit

  • New Features
    • Added a reasoning configuration to requests, allowing selectable effort levels to control model thinking intensity.
    • Reasoning is preserved when creating streaming/shallow-copy request variants.
    • Extended-thinking support enabled for compatible model backends: when an effort level is supplied, the request is mapped to use extended-thinking behavior and related parameters are adjusted automatically.


Add Reasoning struct and field to ChatRequest to support extended
thinking capabilities in OpenAI's o-series models (o1, o3, etc.) and
other reasoning-capable models.
Map OpenAI reasoning effort levels to Anthropic budget_tokens:
- low → 5,000 tokens
- medium → 10,000 tokens
- high → 20,000 tokens

Also add Reasoning field to ResponsesRequest for consistency.

coderabbitai bot commented Jan 22, 2026

Walkthrough

Adds a new Reasoning struct with an Effort field, wires Reasoning into ChatRequest and ResponsesRequest, propagates it in WithStreaming, and maps Reasoning.Effort to Anthropic extended-thinking (Thinking) with BudgetTokens and temperature override when present.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Core types & requests: internal/core/types.go, internal/core/responses.go | Added Reasoning struct (Effort string). Added Reasoning *Reasoning to ChatRequest and ResponsesRequest. WithStreaming() updated to copy Reasoning. |
| Anthropic provider mapping: internal/providers/anthropic/anthropic.go | Added internal anthropicThinking and Thinking *anthropicThinking to anthropicRequest. Convert functions map Reasoning.Effort → Thinking{Type:"enabled", BudgetTokens:...} and clear Temperature when extended thinking is enabled. Added effort→budget token mapping logic. |
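A minimal sketch of the core-type change described in the table, for readers who want to see the shape of the request. Only Reasoning, its Effort field, the Reasoning pointer on ChatRequest, and WithStreaming() come from the walkthrough; the Model and Stream fields and the model name are assumptions made for illustration, and the real ChatRequest has more fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Reasoning mirrors the struct added in internal/core/types.go.
type Reasoning struct {
	Effort string `json:"effort,omitempty"`
}

// ChatRequest is cut down to what this sketch needs. Model and Stream are
// assumed fields; only the Reasoning pointer is taken from the walkthrough.
type ChatRequest struct {
	Model     string     `json:"model"`
	Stream    bool       `json:"stream,omitempty"`
	Reasoning *Reasoning `json:"reasoning,omitempty"`
}

// WithStreaming returns a shallow copy with Stream enabled. Reasoning is
// copied by pointer, so the copy and the original share one Reasoning value.
func (r *ChatRequest) WithStreaming() *ChatRequest {
	return &ChatRequest{
		Model:     r.Model,
		Stream:    true,
		Reasoning: r.Reasoning,
	}
}

func main() {
	req := &ChatRequest{
		Model:     "example-model", // placeholder name
		Reasoning: &Reasoning{Effort: "high"},
	}
	out, err := json.Marshal(req.WithStreaming())
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because the field is a pointer tagged omitempty, requests that do not set reasoning serialize exactly as they did before the change.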

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant Core
  participant AnthropicConverter
  participant AnthropicAPI

  Client->>Core: send ChatRequest/ResponsesRequest (includes Reasoning.Effort?)
  Core->>AnthropicConverter: convert request (propagate Reasoning)
  alt Reasoning.Effort provided
    AnthropicConverter->>AnthropicConverter: map Effort -> Thinking{Type:"enabled",BudgetTokens}
    AnthropicConverter->>AnthropicConverter: unset Temperature
  end
  AnthropicConverter->>AnthropicAPI: send anthropicRequest (with Thinking if enabled)
  AnthropicAPI-->>Client: model response

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I nibbled a field named Reasoning today,
Effort tucked in bytes where thoughts can play.
Anthropic listens, budgets a thought or two,
Hopping through code — a clever debut! 🥕✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding a reasoning parameter to the chat API. The changes include new Reasoning field in ChatRequest and ResponsesRequest structs, plus integration with Anthropic's extended thinking. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@internal/core/types.go`:
- Around line 11-17: The Reasoning.Effort field is an untyped string but should
be a constrained enum; define a new typed string (e.g., type ReasoningEffort
string) with constants for "low", "medium", and "high", change the
Reasoning.Effort field to that new type, and add a small validation helper
(e.g., func (e ReasoningEffort) IsValid() bool and/or func (r *Reasoning)
Validate() error) that checks allowed values so callers can centrally validate
inputs; ensure JSON marshaling still uses the existing `json:"effort,omitempty"`
tag and update any usages to convert/validate accordingly.

In `@internal/providers/anthropic/anthropic.go`:
- Around line 157-168: The function reasoningEffortToBudgetTokens should stop
silently defaulting unknown effort strings; change its signature to return (int,
error) (or (int, bool)) and make it return an explicit error for any
non-"low"/"medium"/"high" input so callers can fail fast and validate before
enabling Anthropic thinking, and update all call sites to handle the error and
avoid enabling thinking on invalid effort; additionally, ensure the Anthropic
request explicitly sets Temperature = 1.0 (do not leave it nil) so
extended-thinking requests serialize correctly.

Comment on lines +11 to +17
// Reasoning configures reasoning behavior for models that support extended thinking.
// This is used with OpenAI's o-series models and other reasoning-capable models.
type Reasoning struct {
	// Effort controls how much reasoning effort the model should use.
	// Valid values are "low", "medium", and "high".
	Effort string `json:"effort,omitempty"`
}

🧹 Nitpick | 🔵 Trivial

Prefer a typed effort enum (and optional validation helper).

Effort is documented as "low/medium/high" but currently accepts any string. A small typed enum + helper improves type safety and enables central validation.

♻️ Proposed refactor
-// Reasoning configures reasoning behavior for models that support extended thinking.
-// This is used with OpenAI's o-series models and other reasoning-capable models.
-type Reasoning struct {
-	// Effort controls how much reasoning effort the model should use.
-	// Valid values are "low", "medium", and "high".
-	Effort string `json:"effort,omitempty"`
-}
+// ReasoningEffort defines allowed reasoning effort levels.
+type ReasoningEffort string
+
+const (
+	ReasoningEffortLow    ReasoningEffort = "low"
+	ReasoningEffortMedium ReasoningEffort = "medium"
+	ReasoningEffortHigh   ReasoningEffort = "high"
+)
+
+// Reasoning configures reasoning behavior for models that support extended thinking.
+// This is used with OpenAI's o-series models and other reasoning-capable models.
+type Reasoning struct {
+	// Effort controls how much reasoning effort the model should use.
+	Effort ReasoningEffort `json:"effort,omitempty"`
+}
+
+// Valid returns true when Effort is one of the supported values.
+func (r Reasoning) Valid() bool {
+	switch r.Effort {
+	case ReasoningEffortLow, ReasoningEffortMedium, ReasoningEffortHigh:
+		return true
+	default:
+		return false
+	}
+}
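As a usage illustration for the proposed refactor: encoding/json will decode any string into a typed string field, so the enum only helps if Valid() is actually called after decoding. The sketch below repeats the proposed types and adds parseReasoning, a hypothetical helper that is not part of the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ReasoningEffort and its constants follow the proposed refactor.
type ReasoningEffort string

const (
	ReasoningEffortLow    ReasoningEffort = "low"
	ReasoningEffortMedium ReasoningEffort = "medium"
	ReasoningEffortHigh   ReasoningEffort = "high"
)

type Reasoning struct {
	Effort ReasoningEffort `json:"effort,omitempty"`
}

// Valid reports whether Effort is one of the supported values.
func (r Reasoning) Valid() bool {
	switch r.Effort {
	case ReasoningEffortLow, ReasoningEffortMedium, ReasoningEffortHigh:
		return true
	default:
		return false
	}
}

// parseReasoning is a hypothetical helper: it decodes the JSON object and
// rejects efforts outside the enum, which json.Unmarshal alone would accept.
func parseReasoning(raw []byte) (*Reasoning, error) {
	var r Reasoning
	if err := json.Unmarshal(raw, &r); err != nil {
		return nil, err
	}
	if !r.Valid() {
		return nil, fmt.Errorf("invalid reasoning effort %q", r.Effort)
	}
	return &r, nil
}

func main() {
	for _, raw := range []string{`{"effort":"high"}`, `{"effort":"extreme"}`} {
		r, err := parseReasoning([]byte(raw))
		fmt.Println(raw, r, err)
	}
}
```

Note that Valid() as proposed also rejects an empty Effort, so callers that treat an empty value as "reasoning not requested" would need to check for that case before validating.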

Log a warning when the user provides a temperature value that gets
overridden to nil due to reasoning/extended thinking being enabled.
This helps users understand why their temperature setting is ignored.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@SantiagoDePolonia SantiagoDePolonia merged commit fb05e2d into main Jan 26, 2026
6 of 7 checks passed

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@internal/providers/anthropic/anthropic.go`:
- Around line 186-197: When extended thinking is enabled (detect via
anthropicReq.Thinking or where you call reasoningEffortToBudgetTokens), ensure
anthropicReq.MaxTokens is raised above the budget tokens, because Anthropic
requires budget_tokens < max_tokens: compute budget :=
reasoningEffortToBudgetTokens(req.Reasoning.Effort) and if
anthropicReq.MaxTokens is nil or *anthropicReq.MaxTokens <= budget then
increase it to a value greater than budget. Update both places that set
anthropicReq.Thinking (the block using anthropicThinking and the other
converter at lines ~510–521) so that max_tokens is adjusted whenever
Thinking/BudgetTokens is set.
♻️ Duplicate comments (1)
internal/providers/anthropic/anthropic.go (1)

157-170: Fail fast on invalid reasoning effort instead of silently defaulting.

Right now unknown effort values fall back to "low" with a warning, which can mask client bugs and produce unexpected behavior. Prefer returning an error/ok flag and handling it at the call site.
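One possible shape for the fail-fast behavior, sketched under assumptions: budgetForEffort and buildThinking are hypothetical names, thinkingConfig stands in for the provider's anthropicThinking struct, and the budgets are the ones listed in the PR description. The real function signature and call sites may differ.

```go
package main

import "fmt"

// thinkingConfig stands in for the provider's anthropicThinking struct.
type thinkingConfig struct {
	Type         string
	BudgetTokens int
}

// budgetForEffort is a hypothetical error-returning variant of
// reasoningEffortToBudgetTokens. Unknown values are rejected instead of
// being mapped to the "low" budget.
func budgetForEffort(effort string) (int, error) {
	switch effort {
	case "low":
		return 5000, nil
	case "medium":
		return 10000, nil
	case "high":
		return 20000, nil
	default:
		return 0, fmt.Errorf("unsupported reasoning effort %q", effort)
	}
}

// buildThinking returns (nil, nil) when no effort was requested, a config
// for a known effort, and an error (with no config) for an unknown one.
func buildThinking(effort string) (*thinkingConfig, error) {
	if effort == "" {
		return nil, nil
	}
	budget, err := budgetForEffort(effort)
	if err != nil {
		return nil, err
	}
	return &thinkingConfig{Type: "enabled", BudgetTokens: budget}, nil
}

func main() {
	for _, effort := range []string{"", "medium", "extreme"} {
		cfg, err := buildThinking(effort)
		fmt.Printf("effort=%q cfg=%+v err=%v\n", effort, cfg, err)
	}
}
```

With this shape the converter can return the error to the client before any request is sent, so a typo in the effort value surfaces as a bad request rather than as an unexpectedly small thinking budget.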

Comment on lines +186 to +197
// Map reasoning effort to Anthropic extended thinking
if req.Reasoning != nil && req.Reasoning.Effort != "" {
	anthropicReq.Thinking = &anthropicThinking{
		Type:         "enabled",
		BudgetTokens: reasoningEffortToBudgetTokens(req.Reasoning.Effort),
	}
	// Extended thinking requires temperature to be unset (defaults to 1)
	if anthropicReq.Temperature != nil {
		slog.Warn("temperature overridden to nil, reasoning requires unset temperature")
		anthropicReq.Temperature = nil
	}
}
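To illustrate what "unset" means on the wire, here is a small sketch of how a nil Temperature pointer drops out of the serialized request. The struct layout and JSON tags are assumptions (only thinking.type and thinking.budget_tokens are named in this thread), and enableThinking and encode are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type anthropicThinking struct {
	Type         string `json:"type"`
	BudgetTokens int    `json:"budget_tokens"`
}

// anthropicRequest is a cut-down stand-in for the provider's request type.
// The pointer plus omitempty on Temperature is an assumed layout.
type anthropicRequest struct {
	Model       string             `json:"model"`
	MaxTokens   int                `json:"max_tokens"`
	Temperature *float64           `json:"temperature,omitempty"`
	Thinking    *anthropicThinking `json:"thinking,omitempty"`
}

// enableThinking turns on extended thinking and clears any caller-supplied
// temperature so the field is omitted from the JSON body.
func enableThinking(req *anthropicRequest, budget int) {
	req.Thinking = &anthropicThinking{Type: "enabled", BudgetTokens: budget}
	req.Temperature = nil
}

// encode marshals the request and panics on failure to keep the sketch short.
func encode(req anthropicRequest) string {
	out, err := json.Marshal(req)
	if err != nil {
		panic(err)
	}
	return string(out)
}

func main() {
	temp := 0.2
	req := anthropicRequest{Model: "example-model", MaxTokens: 16000, Temperature: &temp}
	enableThinking(&req, 10000)
	fmt.Println(encode(req))
}
```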

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Anthropic extended thinking budget_tokens max_tokens API requirements

💡 Result:

To use Anthropic “extended thinking” in the Messages API, you must:

  • Enable thinking in the request by adding:
    • thinking.type: "enabled"
    • thinking.budget_tokens: <int> (the token budget for internal reasoning). [1]
  • Ensure budget_tokens is at least 1,024 and (in the normal/non-interleaved case) budget_tokens must be < max_tokens. [1]
  • Understand how max_tokens is enforced:
    • In Claude 3.7 and 4 models, max_tokens is a strict cap; if (prompt tokens + max_tokens) exceeds the model context window, the API returns a validation error (it won’t silently shrink max_tokens like older models). [1]
    • With thinking enabled, the current-turn thinking counts toward that turn’s max_tokens limit. [1]
  • Streaming requirement: Streaming is required when max_tokens > 21,333. [1]
  • Interleaved thinking (tools) (Claude 4 + Messages API tools):
    • Add the beta header interleaved-thinking-2025-05-14.
    • In this mode, budget_tokens can exceed max_tokens (it’s treated as a total budget across thinking blocks within the assistant turn). [1]

If you’re using Anthropic via an OpenAI-SDK-compatible interface, you can still pass thinking (e.g., via extra_body), but the OpenAI SDK won’t return the detailed thinking content—use the native Anthropic API if you need full thinking features. [2]

Sources: Anthropic docs on Extended Thinking and OpenAI SDK compatibility. [1] [2]
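Taking the limits quoted in the result above at face value, the non-interleaved rules can be sketched as a pre-flight check. checkThinkingLimits and the constant names are hypothetical and not part of the PR; the numbers are the ones cited above and may change with the provider's documentation.

```go
package main

import "fmt"

const (
	minBudgetTokens      = 1024  // minimum budget_tokens quoted above
	nonStreamingMaxLimit = 21333 // streaming is required above this max_tokens
)

// checkThinkingLimits applies the non-interleaved rules: budget_tokens must
// be at least the minimum and strictly less than max_tokens, and large
// max_tokens values require a streaming request.
func checkThinkingLimits(budgetTokens, maxTokens int, streaming bool) error {
	switch {
	case budgetTokens < minBudgetTokens:
		return fmt.Errorf("budget_tokens %d is below the minimum of %d", budgetTokens, minBudgetTokens)
	case budgetTokens >= maxTokens:
		return fmt.Errorf("budget_tokens %d must be less than max_tokens %d", budgetTokens, maxTokens)
	case maxTokens > nonStreamingMaxLimit && !streaming:
		return fmt.Errorf("max_tokens %d requires streaming", maxTokens)
	}
	return nil
}

func main() {
	// Low effort (5,000) against the 4,096 default fails the second rule.
	fmt.Println(checkThinkingLimits(5000, 4096, false))
	fmt.Println(checkThinkingLimits(5000, 8192, false))
}
```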


🏁 Script executed:

# Find the reasoningEffortToBudgetTokens function
rg -n "reasoningEffortToBudgetTokens" internal/providers/anthropic/

Repository: ENTERPILOT/GOModel

Length of output: 537


🏁 Script executed:

# Examine the file around lines 186-197 and 510-521
sed -n '180,200p' internal/providers/anthropic/anthropic.go

Repository: ENTERPILOT/GOModel

Length of output: 731


🏁 Script executed:

# Look at the second location
sed -n '505,525p' internal/providers/anthropic/anthropic.go

Repository: ENTERPILOT/GOModel

Length of output: 754


🏁 Script executed:

# Find max_tokens initialization and defaults
rg -n "MaxTokens" internal/providers/anthropic/anthropic.go | head -20

Repository: ENTERPILOT/GOModel

Length of output: 339


🏁 Script executed:

# Get the full reasoningEffortToBudgetTokens function
sed -n '157,170p' internal/providers/anthropic/anthropic.go

Repository: ENTERPILOT/GOModel

Length of output: 431


🏁 Script executed:

# Check if interleaved thinking mode is used (beta header)
rg -n "interleaved" internal/providers/anthropic/

Repository: ENTERPILOT/GOModel

Length of output: 44


🏁 Script executed:

# Check entire context of both converters for any max_tokens adjustments
sed -n '150,200p' internal/providers/anthropic/anthropic.go

Repository: ENTERPILOT/GOModel

Length of output: 1706


🏁 Script executed:

# Check second location context
sed -n '495,530p' internal/providers/anthropic/anthropic.go

Repository: ENTERPILOT/GOModel

Length of output: 1224


Ensure max_tokens > budget_tokens when extended thinking is enabled.

The reasoningEffortToBudgetTokens() function returns 5k/10k/20k for low/medium/high effort, but the default max_tokens is only 4,096. Anthropic's extended thinking API requires budget_tokens < max_tokens (in non-interleaved mode), so even low effort violates this constraint and users will hit API validation errors.

Raise max_tokens above budget_tokens when thinking is enabled:

Fix for both converters (lines 186–197 and 510–521)
if req.Reasoning != nil && req.Reasoning.Effort != "" {
	anthropicReq.Thinking = &anthropicThinking{
		Type:         "enabled",
		BudgetTokens: reasoningEffortToBudgetTokens(req.Reasoning.Effort),
	}
+	// budget_tokens must be strictly less than max_tokens, so leave room for
+	// the visible answer. The 1,024-token allowance is an assumed value.
+	if anthropicReq.MaxTokens <= anthropicReq.Thinking.BudgetTokens {
+		slog.Warn("max_tokens raised to satisfy thinking budget",
+			"max_tokens", anthropicReq.MaxTokens,
+			"budget_tokens", anthropicReq.Thinking.BudgetTokens,
+		)
+		anthropicReq.MaxTokens = anthropicReq.Thinking.BudgetTokens + 1024
+	}
	// Extended thinking requires temperature to be unset (defaults to 1)
	if anthropicReq.Temperature != nil {
		slog.Warn("temperature overridden to nil, reasoning requires unset temperature")
		anthropicReq.Temperature = nil
	}
}

@SantiagoDePolonia SantiagoDePolonia deleted the claude/add-reasoning-parameter-rn9sz branch March 22, 2026 14:24