
fix(me/gom-33): anthropic reasoning error when i request an opus model #83

Merged
giorgi-pro merged 14 commits into main from me/gom-33-anthropic-reasoning-error-when-i-request-an-opus-model
Feb 22, 2026

Conversation

@giorgi-pro
Collaborator

@giorgi-pro giorgi-pro commented Feb 20, 2026

Two issues were fixed for Anthropic's newer 4.6 models (e.g. claude-opus-4-6):

  1. max_tokens error with reasoning — The 4.6 models support a new adaptive thinking mode with output_config.effort instead of the legacy budget_tokens approach. The code was applying the legacy path to all models, which caused Anthropic to reject requests with "max_tokens must be greater than thinking.budget_tokens". The fix adds model detection (isAdaptiveThinkingModel) and uses the appropriate API format per model generation.

  2. Empty response with reasoning enabled — When adaptive thinking is active, Anthropic can return multiple content blocks: possibly a preamble text block (in our case, just \n\n), a thinking block, and the actual answer text block. The fix introduces extractTextContent which picks the last text block — matching Anthropic's pattern of preamble → thinking → answer.
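Taken together, the fix amounts to two small helpers. The sketch below is illustrative only: the names isAdaptiveThinkingModel and applyReasoning come from this PR, but the struct fields, prefix list, and budget numbers are assumptions rather than the actual diff.

```go
package main

import (
	"fmt"
	"strings"
)

type outputConfig struct {
	Effort string `json:"effort,omitempty"`
}

type thinkingConfig struct {
	Type         string `json:"type"`
	BudgetTokens int    `json:"budget_tokens,omitempty"`
}

// request stands in for anthropicRequest; only the fields needed here.
type request struct {
	Model        string
	MaxTokens    int
	OutputConfig *outputConfig
	Thinking     *thinkingConfig
}

// Model generations that take output_config.effort (adaptive thinking).
var adaptiveThinkingPrefixes = []string{"claude-opus-4-6", "claude-sonnet-4-6"}

func isAdaptiveThinkingModel(model string) bool {
	for _, p := range adaptiveThinkingPrefixes {
		if strings.HasPrefix(model, p) {
			return true
		}
	}
	return false
}

// applyReasoning picks the API format per model generation: adaptive models
// get output_config.effort; older models get thinking.budget_tokens, with
// max_tokens bumped so it stays above the budget.
func applyReasoning(req *request, effort string) {
	if isAdaptiveThinkingModel(req.Model) {
		req.OutputConfig = &outputConfig{Effort: effort}
		return
	}
	budget := 10000 // illustrative; the PR maps effort to a budget
	req.Thinking = &thinkingConfig{Type: "enabled", BudgetTokens: budget}
	if req.MaxTokens <= budget {
		// Avoid "max_tokens must be greater than thinking.budget_tokens".
		req.MaxTokens = budget + 1024
	}
}

func main() {
	r := &request{Model: "claude-opus-4-6", MaxTokens: 1024}
	applyReasoning(r, "medium")
	fmt.Println(r.OutputConfig.Effort, r.Thinking == nil) // medium true
}
```

Pre-4.6 models keep the budget_tokens path, so existing behaviour is unchanged for them.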

After the fix:

{
  "id": "msg_019HyRtW4EDkQgSAK5ShsH1a",
  "object": "response",
  "created_at": 1771623447,
  "model": "claude-opus-4-6",
  "provider": "anthropic",
  "status": "completed",
  "output": [
    {
      "id": "msg_291c31c4-38ab-4cd0-bdbe-c3d7000901a1",
      "type": "message",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "# Bash Matrix Transpose Script\n\n```bash\n#!/bin/bash\n\n# Usage: ./transpose.sh \"[1,2],[3,4],[5,6]\"\n# Output: [1,3,5],[2,4,6]\n\ninput=\"$1\"\n\nif [[ -z \"$input\" ]]; then\n    echo \"Usage: $0 \\\"[1,2],[3,4],[5,6]\\\"\"\n    exit 1\nfi\n\n# ---------------------------------------------------------------------------\n# 1. Parse: turn \"[1,2],[3,4],[5,6]\" into a clean \"1,2;3,4;5,6\" form\n#    - replace ],[ with ;   (row delimiter)\n#    - strip the leading [ and trailing ]\n# ---------------------------------------------------------------------------\ncleaned=$(echo \"$input\" | sed 's/\\],\\[/;/g' | sed 's/^\\[//' | sed 's/\\]$//')\n\n# ---------------------------------------------------------------------------\n# 2. Read into a flat array that simulates a 2-D matrix\n# ---------------------------------------------------------------------------\nIFS=';' read -ra rows <<< \"$cleaned\"\nnum_rows=${#rows[@]}\nnum_cols=0\n\ndeclare -a matrix\n\nfor ((i = 0; i < num_rows; i++)); do\n    IFS=',' read -ra cols <<< \"${rows[$i]}\"\n    # Set column count from the first row\n    if (( i == 0 )); then\n        num_cols=${#cols[@]}\n    fi\n    for ((j = 0; j < ${#cols[@]}; j++)); do\n        matrix=$(( i * num_cols + j ))          # flat index\n        matrix[$((i * num_cols + j))]=\"${cols[$j]}\"\n    done\ndone\n\n# ---------------------------------------------------------------------------\n# 3. 
Build the transposed output\n#    Original  : num_rows × num_cols\n#    Transposed: num_cols × num_rows\n# ---------------------------------------------------------------------------\nresult=\"\"\n\nfor ((j = 0; j < num_cols; j++)); do\n    row=\"[\"\n    for ((i = 0; i < num_rows; i++)); do\n        (( i > 0 )) && row+=\",\"\n        row+=\"${matrix[$((i * num_cols + j))]}\"\n    done\n    row+=\"]\"\n\n    if [[ -n \"$result\" ]]; then\n        result+=\",$row\"\n    else\n        result=\"$row\"\n    fi\ndone\n\necho \"$result\"\n```\n\n## How it works\n\n| Step | What happens | Example value |\n|------|-------------|---------------|\n| **Input** | Raw matrix string | `[1,2],[3,4],[5,6]` |\n| **Clean** | Replace `],[` → `;`, strip outer `[]` | `1,2;3,4;5,6` |\n| **Split rows** | Split on `;` | `[\"1,2\", \"3,4\", \"5,6\"]` |\n| **Flat array** | Store elements row-major | `matrix = [1,2,3,4,5,6]` (3×2) |\n| **Transpose** | Iterate columns-first, rows-second | Swap `(i,j)` → `(j,i)` |\n| **Output** | Rebuild bracket format | `[1,3,5],[2,4,6]` |\n\n## Test runs\n\n```bash\n$ ./transpose.sh \"[1,2],[3,4],[5,6]\"\n[1,3,5],[2,4,6]\n\n$ ./transpose.sh \"[1,2,3],[4,5,6]\"\n[1,4],[2,5],[3,6]\n\n$ ./transpose.sh \"[1],[2],[3]\"\n[1,2,3]\n\n$ ./transpose.sh \"[1,2,3]\"\n[1],[2],[3]\n\n$ ./transpose.sh \"[10,20,30],[40,50,60],[70,80,90]\"\n[10,40,70],[20,50,80],[30,60,90]\n```\n\nThe key insight is that **transposing** simply means iterating **column-first, then row** instead of the original row-first order, which swaps the `(i, j)` indices to `(j, i)`."
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 61,
    "output_tokens": 1866,
    "total_tokens": 1927
  }
}

Summary by CodeRabbit

  • New Features

    • Support for adaptive "thinking" modes on newer models with configurable effort and output options.
  • Improvements

    • More reliable extraction of assistant text from responses that include intermediate thinking blocks.
    • Centralized reasoning handling with model-aware behavior, token adjustments, and cleaner request formatting.
  • Tests

    • Added tests covering thinking-block extraction and adaptive-model recognition.

When extended thinking is enabled, Anthropic returns multiple content blocks — a thinking block first, then a text block. The thinking block has type: "thinking" and its text field contains the reasoning; the actual answer is in the subsequent text block.
The \n\n preamble text block is not mentioned in the docs and is not reliable. The docs do, however, guarantee that thinking blocks come before text blocks.
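A minimal sketch of that extraction, under the block ordering described above (the field names on anthropicContent here are assumptions for illustration, not the exact struct):

```go
package main

import "fmt"

// anthropicContent mirrors the relevant fields of an Anthropic content block.
type anthropicContent struct {
	Type string // "text" or "thinking"
	Text string
}

// extractTextContent returns the last "text" block, skipping any preamble
// text block and any "thinking" blocks that precede the actual answer.
func extractTextContent(blocks []anthropicContent) string {
	last := ""
	for _, b := range blocks {
		if b.Type == "text" {
			last = b.Text
		}
	}
	return last
}

func main() {
	blocks := []anthropicContent{
		{Type: "text", Text: "\n\n"},          // preamble
		{Type: "thinking", Text: "reasoning"}, // hidden reasoning
		{Type: "text", Text: "the answer"},    // actual answer
	}
	fmt.Println(extractTextContent(blocks)) // prints "the answer"
}
```

Because responses without thinking have a single text block, taking the last text block is also correct for the non-reasoning case.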
@coderabbitai
Contributor

coderabbitai bot commented Feb 20, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds adaptive-thinking support and centralizes reasoning configuration in applyReasoning; introduces extractTextContent to handle mixed text/thinking blocks; updates request/response conversion to use these helpers; removes legacy max-token adjustment path; adds tests for thinking-block extraction and adaptive-model detection.

Changes

  • Anthropic provider core (internal/providers/anthropic/anthropic.go): Adds anthropicOutputConfig and OutputConfig on anthropicRequest; implements isAdaptiveThinkingModel() and adaptiveThinkingPrefixes; centralizes reasoning into applyReasoning(), which sets adaptive OutputConfig or legacy BudgetTokens, clears temperature when reasoning is used, and auto-adjusts MaxTokens for budget-based flows; removes the logMaxTokensAdjustment pathway; formatting/comment tweaks.
  • Response extraction & conversion (internal/providers/anthropic/anthropic.go): Introduces extractTextContent(blocks []anthropicContent) and normalizeEffort(); replaces inline content extraction in convertFromAnthropicResponse and convertAnthropicResponseToResponses to properly handle text/thinking blocks and propagate usage.
  • Tests (internal/providers/anthropic/anthropic_test.go): Adds tests covering thinking-block scenarios and adaptive-model recognition (TestConvertFromAnthropicResponse_WithThinkingBlocks, TestExtractTextContent, TestConvertAnthropicResponseToResponses_WithThinkingBlocks, TestIsAdaptiveThinkingModel); validates final-text extraction and usage accounting.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Client
    participant Converter as RequestBuilder
    participant Anthropic as AnthropicModel
    participant Parser as ResponseConverter

    Client->>Converter: build anthropicRequest (model, effort)
    note right of Converter: applyReasoning()\nsets OutputConfig (adaptive) or BudgetTokens (legacy)\nmay adjust MaxTokens and clear temperature
    Converter->>Anthropic: send anthropicRequest (with OutputConfig/BudgetTokens)
    Anthropic-->>Parser: returns content blocks (text / thinking)
    Parser->>Parser: extractTextContent(blocks)
    Parser-->>Client: normalized response (text + usage)

    rect rgba(200,230,255,0.5)
    Note over Converter,Anthropic: Adaptive vs. budget-based thinking\nbranch handled in applyReasoning
    end
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hopped through code with nimble paws,

I braided OutputConfig into the laws,
Thinking blocks and final text in view,
Tests tucked in my pocket like dew,
A carrot-shaped commit — fresh and true!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): Docstring coverage is 42.86%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
  • Description Check (✅ Passed): Check skipped; CodeRabbit’s high-level summary is enabled.
  • Title Check (✅ Passed): The title directly addresses the main fix: handling Anthropic reasoning errors when requesting an Opus model by implementing adaptive thinking support.


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
internal/providers/anthropic/anthropic_test.go (1)

1658-1660: ⚠️ Potential issue | 🟡 Minor

intPtr body is not indented — gofmt violation

return &i appears at column 0 instead of being indented inside the function body. gofmt would flag this.

🛠️ Fix
 func intPtr(i int) *int {
-return &i
+	return &i
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/anthropic/anthropic_test.go` around lines 1658 - 1660, The
function intPtr has its body unindented causing a gofmt violation; inside the
intPtr function (symbol: intPtr) indent the return statement so the function
body is properly formatted (i.e., place "return &i" on a new line indented
within the function block) and run gofmt/golangci-lint to verify formatting.
internal/providers/anthropic/anthropic.go (1)

415-439: ⚠️ Potential issue | 🟡 Minor

Streaming preamble text forwarded to clients; inconsistent with non-streaming behaviour

Adaptive thinking can return preamble text blocks before thinking blocks in both regular and streaming responses. In the non-streaming path, extractTextContent discards the preamble by returning only the last text block. In streaming, every content_block_delta with non-empty Delta.Text is immediately forwarded — so preamble content like "\n\n" (whitespace) is sent to the client before the real answer, creating a behavioural inconsistency.

This is an inherent limitation of forward-only streaming (you cannot "un-send" already-emitted deltas). The same applies to responsesStreamConverter.convertEvent (Line 830). Consider documenting this known difference, or tracking the current content-block type using the Index from content_block_start events to gate which deltas are forwarded.
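That Index/type-tracking idea could look roughly like this; the event and field names are hypothetical stand-ins, not the converter's real types:

```go
package main

import "fmt"

// Hypothetical stream-event shape; the real events live in anthropic.go.
type streamEvent struct {
	Type      string // "content_block_start", "content_block_delta", ...
	BlockType string // set on content_block_start: "text" or "thinking"
	DeltaText string // set on content_block_delta
}

// streamGate tracks the current block so preamble text (a "text" block seen
// before any "thinking" block) can be suppressed, matching the non-streaming
// extractTextContent behaviour. Note: this gate assumes thinking is enabled;
// with thinking off there is no preamble and text should pass unconditionally.
type streamGate struct {
	currentBlockType string
	seenThinking     bool
}

// forward reports whether a delta from this event should reach the client.
func (g *streamGate) forward(ev streamEvent) bool {
	switch ev.Type {
	case "content_block_start":
		g.currentBlockType = ev.BlockType
		if ev.BlockType == "thinking" {
			g.seenThinking = true
		}
		return false
	case "content_block_delta":
		// Emit text deltas only once the preamble (pre-thinking text) is past.
		return g.currentBlockType == "text" && g.seenThinking && ev.DeltaText != ""
	}
	return false
}

func main() {
	g := &streamGate{}
	events := []streamEvent{
		{Type: "content_block_start", BlockType: "text"},
		{Type: "content_block_delta", DeltaText: "\n\n"}, // preamble: dropped
		{Type: "content_block_start", BlockType: "thinking"},
		{Type: "content_block_delta", DeltaText: "reasoning"}, // thinking: dropped
		{Type: "content_block_start", BlockType: "text"},
		{Type: "content_block_delta", DeltaText: "answer"}, // forwarded
	}
	for _, ev := range events {
		if g.forward(ev) {
			fmt.Println(ev.DeltaText) // prints "answer" only
		}
	}
}
```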

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/anthropic/anthropic.go` around lines 415 - 439, The
streaming path currently forwards every content_block_delta immediately, causing
preamble/whitespace blocks to be sent to clients (inconsistent with
extractTextContent non-streaming behavior); fix by tracking the current content
block type/index (set on content_block_start) and only marshal/return deltas in
the content_block_delta case when the block's Index/Type matches the intended
response block (e.g., the final "thinking"/answer block) — update the state used
in the switch handling content_block_start and content_block_delta in
anthropic.go (and mirror the same gating logic in
responsesStreamConverter.convertEvent) so preamble deltas are skipped.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/providers/anthropic/anthropic_test.go`:
- Around line 1514-1518: When expectedThinkType == "" the test only checks
result.Thinking is nil but misses verifying result.OutputConfig; update the test
in the same block to also assert result.OutputConfig == nil (i.e., when
expectedThinkType is empty ensure both result.Thinking and result.OutputConfig
are nil) so regressions that set OutputConfig are caught; locate the assertion
block referencing expectedThinkType and result.Thinking in the test function and
add the OutputConfig nil check alongside it.
- Around line 1551-1656: The test
TestConvertResponsesRequestToAnthropic_ReasoningEffort is missing checks for
temperature clearing and OutputConfig being nil when no reasoning is applied;
update the test table to include the setTemperature (or expectNilTemp) boolean
like in TestConvertToAnthropicRequest_ReasoningEffort, then in the loop after
building req call convertResponsesRequestToAnthropic(req) and assert that when
expectNilTemp/setTemperature indicates temperature should be cleared the
result.Temperature is nil, and when tt.expectedThinkType == "" also assert
result.OutputConfig == nil; refer to convertResponsesRequestToAnthropic,
applyReasoning, and the test struct to locate where to add the new fields and
assertions.

In `@internal/providers/anthropic/anthropic.go`:
- Around line 91-94: The anthropicOutputConfig struct's Effort field currently
serializes empty strings as "effort":"" which can cause API rejections; update
the JSON tag on anthropicOutputConfig.Effort to include `omitempty` so empty
values are omitted from the payload (i.e., change the tag to
`json:"effort,omitempty"`), keeping the field name and type unchanged to
preserve call sites and behavior when a non-empty Effort is provided.
- Around line 108-110: The isAdaptiveThinkingModel function is too permissive
and can false-positive (e.g., "claude-4-60"); replace the simple Contains checks
with a precise match: either implement a compiled regex like `\b4[.-]6\b` and
test the model string with regexp.MatchString, or use an explicit allowlist of
canonical model names/prefixes (e.g., "claude-opus-4-6", "claude-sonnet-4-6")
and check membership; update isAdaptiveThinkingModel to use the chosen approach
and add unit tests for edge cases such as "claude-4-60", "claude-opus-4-6", and
"claude-opus-4-6x".

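The reviewer's bounded-match alternative could be sketched as follows; the `\b4[.-]6\b` pattern is taken from the comment above, and this is a sketch rather than the merged code:

```go
package main

import (
	"fmt"
	"regexp"
)

// versionToken matches a standalone "4-6" or "4.6" version token, so
// "claude-opus-4-6" matches but "claude-4-60" and "claude-opus-4-6x" do not.
var versionToken = regexp.MustCompile(`\b4[.-]6\b`)

func isAdaptiveThinkingModel(model string) bool {
	return versionToken.MatchString(model)
}

func main() {
	for _, m := range []string{"claude-opus-4-6", "claude-4-60", "claude-opus-4-6x"} {
		fmt.Println(m, isAdaptiveThinkingModel(m))
	}
	// Output:
	// claude-opus-4-6 true
	// claude-4-60 false
	// claude-opus-4-6x false
}
```

An explicit prefix allowlist avoids the regex entirely and is easier to audit; the regex wins only if many model variants share the version token.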

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2


⚠️ Outside diff range comments (1)
internal/providers/anthropic/anthropic.go (1)

201-213: ⚠️ Potential issue | 🟡 Minor

"max" effort silently maps to the lowest budget (5,000 tokens) for legacy models.

Anthropic supports four effort levels: low, medium, high, and max. "max" only reaches reasoningEffortToBudgetTokens for non-adaptive (pre-4.6) models, but the default branch maps it to 5,000 tokens — the opposite of the user's intent. The fallback should map to the highest budget or return an explicit error.

🛡️ Proposed fix
 func reasoningEffortToBudgetTokens(effort string) int {
     switch effort {
     case "low":
         return 5000
     case "medium":
         return 10000
-    case "high":
+    case "high", "max":
         return 20000
     default:
         slog.Warn("inappropriate reasoning effort, defaulting to 'low'", "effort", effort)
         return 5000
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/anthropic/anthropic.go` around lines 201 - 213, The
function reasoningEffortToBudgetTokens currently maps unknown efforts (including
"max") to 5000 tokens; add an explicit case for "max" that returns the highest
budget (match "high"/max budget, e.g., 20000) and change the default branch to
warn and return the high/max budget instead of defaulting to low. Update the
slog.Warn message in reasoningEffortToBudgetTokens to reflect that an
inappropriate effort was provided and that you're falling back to the highest
budget.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/providers/anthropic/anthropic.go`:
- Around line 597-613: The function extractTextContent uses a redundant found
flag and a fallback to blocks[0].Text which is always empty for non-"text"
blocks; simplify by removing the found flag and the blocks[0].Text fallback:
iterate blocks ([]anthropicContent) and set last when b.Type == "text", then
after the loop just return last; update references to anthropicContent/Text
accordingly and remove any dead-branch logic related to the fallback.
- Around line 195-198: The current code unconditionally warns and clears
req.Temperature whenever it's non-nil, causing spurious warnings for a value of
1.0; update the guard around req.Temperature so it only logs and sets
req.Temperature = nil when the dereferenced value is present and not equal to
1.0 (i.e., if req.Temperature != nil && *req.Temperature != 1.0) and keep the
slog.Warn and clearing behavior only in that branch to avoid noisy logs when
temperature==1.0; reference req.Temperature and slog.Warn in anthropic.go to
locate and change the conditional.


---

Duplicate comments:
In `@internal/providers/anthropic/anthropic.go`:
- Around line 108-110: The isAdaptiveThinkingModel function currently uses
strings.Contains(model, "4-6") and strings.Contains(model, "4.6") which causes
false positives (e.g., "claude-4-60"); update this to perform a bounded match or
use an explicit allowlist: replace the substring checks in
isAdaptiveThinkingModel with a regex match using word boundaries like
`\b4[.-]6\b` (or implement an allowlist of exact model names) so only exact
"4-6" or "4.6" tokens are matched; keep the function name
isAdaptiveThinkingModel and ensure you import/use the regex utilities
accordingly.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2


⚠️ Outside diff range comments (1)
internal/providers/anthropic/anthropic.go (1)

415-497: ⚠️ Potential issue | 🟡 Minor

Streaming path does not filter preamble text blocks when thinking is enabled.

When adaptive thinking is active, Anthropic returns content blocks in order: [text("\n\n"), thinking(...), text(answer)]. The non-streaming path correctly uses extractTextContent to pick only the last text block. However, the streaming converter forwards all content_block_delta events with non-empty Delta.Text — including the preamble "\n\n" text block.

This means streamed responses will include the preamble whitespace, while non-streamed responses will not, creating an inconsistency. For most cases the preamble is just whitespace so impact is low, but callers concatenating streamed deltas will get leading \n\n that non-streaming callers won't see.

A potential fix would be to track the current content_block type from content_block_start events and only forward deltas for text-type blocks that are the last text block, or at minimum skip blocks whose content is only whitespace. This may be better addressed in a follow-up.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/anthropic/anthropic.go` around lines 415 - 497, The
streaming converter (streamConverter.convertEvent) is emitting all
content_block_delta text chunks including preamble whitespace from non-final
text blocks; track the current content block type on content_block_start (e.g.,
add a field like sc.currentContentType or sc.inTextBlock and update it in the
"content_block_start" case) and only emit content_block_delta when the block is
a text block you intend to forward; additionally, filter out deltas whose Text
is only whitespace (trim and skip) to match the non-streaming extractTextContent
behavior so leading "\n\n" preambles are not streamed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/providers/anthropic/anthropic_test.go`:
- Around line 666-737: Add a test case to TestExtractTextContent that covers the
scenario where all blocks are "thinking" (no "text" blocks) to assert
extractTextContent returns an empty string; specifically add an entry like name:
"only thinking blocks - returns empty" with blocks: []anthropicContent{{Type:
"thinking", Text: ""}} and expected: "" so extractTextContent's fallback
behavior (falling back to blocks[0].Text) is explicitly tested.

In `@internal/providers/anthropic/anthropic.go`:
- Around line 185-210: The adaptive branch in applyReasoning currently sets
req.OutputConfig.Effort directly (for models where
isAdaptiveThinkingModel(model) is true) allowing invalid effort strings to reach
Anthropic; validate the effort the same way as the legacy branch by running the
input through the existing reasoningEffortToBudgetTokens (or a shared
validateEffort function) to map/normalize invalid values to the default (e.g.,
"low"), assign the validated value to req.OutputConfig.Effort, and emit a
warning via slog.Warn when the original effort was invalid/normalized so callers
get a clear gateway-side message.

---

Outside diff comments:
In `@internal/providers/anthropic/anthropic.go`:
- Around line 415-497: The streaming converter (streamConverter.convertEvent) is
emitting all content_block_delta text chunks including preamble whitespace from
non-final text blocks; track the current content block type on
content_block_start (e.g., add a field like sc.currentContentType or
sc.inTextBlock and update it in the "content_block_start" case) and only emit
content_block_delta when the block is a text block you intend to forward;
additionally, filter out deltas whose Text is only whitespace (trim and skip) to
match the non-streaming extractTextContent behavior so leading "\n\n" preambles
are not streamed.

---

Duplicate comments:
In `@internal/providers/anthropic/anthropic.go`:
- Around line 206-209: The guard currently warns and clears any non-nil
req.Temperature unconditionally; change it to only warn and clear when
req.Temperature is non-nil and not equal to 1.0 (the special API-required value
for "thinking"), so keep the temperature untouched when it equals 1.0. Locate
the check around req.Temperature in anthropic.go and update the conditional to
verify the dereferenced temperature value before logging and setting it to nil.
- Around line 605-624: The helper extractTextContent can be simplified by
removing the unused boolean "found": in the loop over blocks of type
anthropicContent set last = b.Text whenever b.Type == "text", then after the
loop return last if last != "" else if len(blocks) > 0 return blocks[0].Text
else return ""; update the function extractTextContent to use only last, blocks,
Type and Text (no found flag) and preserve the legacy fallback to
blocks[0].Text.

giorgi-pro and others added 4 commits February 20, 2026 23:56
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5


⚠️ Outside diff range comments (1)
internal/providers/anthropic/anthropic.go (1)

436-460: ⚠️ Potential issue | 🟠 Major

Streaming path still forwards preamble text blocks — inconsistent with the extractTextContent fix on the non-streaming path.

When extended/adaptive thinking is enabled, Anthropic streams content blocks in this order:

  1. text block — preamble (e.g. "\n\n")
  2. thinking block — hidden reasoning
  3. text block — the actual answer

extractTextContent correctly filters this for non-streaming responses (returns only the last "text" block). However, in streamConverter.convertEvent (and responsesStreamConverter.convertEvent, line 842), the "content_block_delta" handler emits every non-empty Delta.Text, so the preamble "\n\n" is forwarded as a data: chunk before the answer. Streaming clients of adaptive thinking models will see leading whitespace that non-streaming clients never see.

A minimal fix is to track whether the current content block is a preamble (i.e., a "text" block that precedes at least one "thinking" block) using content_block_start events. The converter already discards those events (return ""); adding type tracking there would allow filtering without buffering the full response.

Would you like me to draft the state-tracking approach for streamConverter and responsesStreamConverter to suppress preamble text deltas?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/anthropic/anthropic.go` around lines 436 - 460, The
streaming converter forwards every non-empty content_block_delta (in
streamConverter.convertEvent and responsesStreamConverter.convertEvent), which
causes preamble "text" blocks (e.g. "\n\n") to be emitted before hidden
"thinking" blocks; fix this by adding state tracking on the stream converter
(e.g., fields like currentContentBlockType string and seenThinkingInMessage bool
on the streamConverter struct instance sc), set currentContentBlockType when
handling "content_block_start" events, set seenThinkingInMessage=true when you
see a "thinking" start, and in the "content_block_delta" handler only emit
"text" deltas if either the block type is not "text" or seenThinkingInMessage is
true (suppress text deltas that are a preamble before any thinking); reset the
flags at message end/new message start to avoid cross-message leakage.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/providers/anthropic/anthropic_test.go`:
- Around line 1360-1368: Test currently checks len(result.Output) but then
indexes result.Output[0].Content[0] which can panic if Content is empty; update
the test (around the assertions using result.Output, result.Output[0].Content,
and result.Usage) to first assert len(result.Output[0].Content) > 0 (or
otherwise check Content length) before accessing Content[0].Text so the test
fails with a clear error rather than panicking; reference the values produced by
convertAnthropicResponseToResponses and add an explicit error message if Content
is empty.
- Around line 654-663: The test currently indexes result.Choices[0] in the
convertFromAnthropicResponse test without verifying the slice length; add a
guard before that access in the test: check len(result.Choices) == 0 and call
t.Fatalf (or t.Errorf followed by return) with a clear message like "no choices
returned" so the test fails with a descriptive error instead of panicking;
update the block around convertFromAnthropicResponse and the subsequent
assertions (Choices[0].Message.Content and Usage.CompletionTokens) to run only
after the non-empty check.

In `@internal/providers/anthropic/anthropic.go`:
- Around line 185-188: The doc comment currently above normalizeEffort actually
describes applyReasoning; move or rewrite that comment so it immediately
precedes the applyReasoning function (and remove or replace the incorrect
comment over normalizeEffort). Specifically, relocate the block starting with
"applyReasoning configures thinking and effort on an anthropicRequest..." from
above normalizeEffort to directly above the applyReasoning function declaration,
and ensure normalizeEffort has an appropriate short comment (or none) so go doc
attributes the documentation to the correct function names normalizeEffort and
applyReasoning.
- Around line 108-112: The adaptiveThinkingPrefixes slice currently includes
"claude-haiku-4-6" which is unsupported for adaptive thinking; remove that entry
from adaptiveThinkingPrefixes so it contains only "claude-opus-4-6" and
"claude-sonnet-4-6", and update any nearby comment or validation that references
adaptive thinking support (functions that check adaptiveThinkingPrefixes) to
ensure they only target supported model IDs (adaptiveThinkingPrefixes).
- Around line 188-196: Add a clarifying comment above normalizeEffort explaining
that although Anthropic Opus 4.6 supports "max", the gateway's public type
core.Reasoning.Effort only exposes "low", "medium", and "high", so "max" is
intentionally rejected and any unsupported value is downgraded to "low" (keep
the existing switch behavior and slog.Warn). Reference normalizeEffort and
mention "max" and Opus 4.6 in the comment so future readers understand why the
gateway rejects "max".
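The downgrade behavior the comment should document can be sketched as follows; it mirrors the existing switch plus slog.Warn, with the exact log wording being an assumption:

```go
package main

import (
	"fmt"
	"log/slog"
)

// normalizeEffort, sketched from the review: the gateway's public
// core.Reasoning.Effort only exposes "low", "medium" and "high". Although
// Opus 4.6 accepts "max" upstream, any unsupported value (including "max")
// is intentionally downgraded to "low" with a warning.
func normalizeEffort(effort string) string {
	switch effort {
	case "low", "medium", "high":
		return effort
	default:
		slog.Warn("unsupported reasoning effort, downgrading to low", "effort", effort)
		return "low"
	}
}

func main() {
	fmt.Println(normalizeEffort("high")) // passes through
	fmt.Println(normalizeEffort("max"))  // downgraded
}
```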

---

Outside diff comments:
In `@internal/providers/anthropic/anthropic.go`:
- Around line 436-460: The streaming converter forwards every non-empty
content_block_delta (in streamConverter.convertEvent and
responsesStreamConverter.convertEvent), which causes preamble "text" blocks
(e.g. "\n\n") to be emitted before hidden "thinking" blocks; fix this by adding
state tracking on the stream converter (e.g., fields like
currentContentBlockType string and seenThinkingInMessage bool on the
streamConverter struct instance sc), set currentContentBlockType when handling
"content_block_start" events, set seenThinkingInMessage=true when you see a
"thinking" start, and in the "content_block_delta" handler suppress "text"
deltas while seenThinkingInMessage is still false (dropping the preamble text
that arrives before any thinking); reset the flags
at message end/new message start to avoid cross-message leakage.
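The suggested state tracking can be sketched standalone. The event struct and field names are illustrative stand-ins for the real SSE event types, and the gate assumes adaptive thinking is active — a message with no thinking block at all would need the seenThinking gate relaxed:

```go
package main

import "fmt"

// event is a stripped-down stand-in for Anthropic's SSE events; the real
// streamConverter parses richer structures.
type event struct {
	Type      string // "content_block_start", "content_block_delta", "message_stop"
	BlockType string // on content_block_start: "text" or "thinking"
	Text      string // on text deltas
}

// converter carries the suggested per-message state: the active block type
// and whether a thinking block has been seen yet.
type converter struct {
	activeBlockType string
	seenThinking    bool
}

// emit returns the text to forward for an event, or "" to suppress it.
func (c *converter) emit(e event) string {
	switch e.Type {
	case "content_block_start":
		c.activeBlockType = e.BlockType
		if e.BlockType == "thinking" {
			c.seenThinking = true
		}
	case "content_block_delta":
		// Suppress text deltas until a thinking block has started: the
		// preamble text block ("\n\n") always precedes thinking.
		if c.activeBlockType == "text" && c.seenThinking && e.Text != "" {
			return e.Text
		}
	case "message_stop":
		// Reset per-message state to avoid cross-message leakage.
		c.activeBlockType, c.seenThinking = "", false
	}
	return ""
}

func main() {
	c := &converter{}
	var out string
	for _, e := range []event{
		{Type: "content_block_start", BlockType: "text"},
		{Type: "content_block_delta", Text: "\n\n"}, // preamble: suppressed
		{Type: "content_block_start", BlockType: "thinking"},
		{Type: "content_block_start", BlockType: "text"},
		{Type: "content_block_delta", Text: "The answer."},
	} {
		out = out + c.emit(e)
	}
	fmt.Printf("%q\n", out)
}
```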

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
internal/providers/anthropic/anthropic.go (2)

437-464: ⚠️ Potential issue | 🟡 Minor

Streaming path emits preamble text deltas for interleaved-thinking responses.

Adaptive thinking automatically enables interleaved thinking, which means Anthropic can stream responses in a [preamble_text, thinking, answer_text] block sequence. The streamConverter doesn't track the active content block type: it forwards every content_block_delta where Delta.Text != "", so the preamble "\n\n" text deltas are emitted to clients ahead of the actual answer. This creates an inconsistency with the non-streaming path, where extractTextContent correctly picks only the last text block.

Thinking deltas themselves are fine — Anthropic sends them as type: "thinking_delta" with a thinking field, not a text field, so Delta.Text is empty and they're already dropped.

A targeted fix would track the current block type using event.ContentBlock in content_block_start and gate text emission accordingly:

🛡️ Suggested approach
 type streamConverter struct {
     reader           *bufio.Reader
     body             io.ReadCloser
     model            string
     msgID            string
     buffer           []byte
     closed           bool
+    // activeBlockType tracks the content_block_start type so that only
+    // deltas from "text" blocks are forwarded to clients.
+    activeBlockType  string
 }

 case "content_block_start":
+    if event.ContentBlock != nil {
+        sc.activeBlockType = event.ContentBlock.Type
+    }
     return ""

 case "content_block_delta":
-    if event.Delta != nil && event.Delta.Text != "" {
+    if event.Delta != nil && event.Delta.Text != "" && sc.activeBlockType == "text" {

The same pattern applies to responsesStreamConverter.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/anthropic/anthropic.go` around lines 437 - 464, The stream
converter currently emits every content_block_delta with non-empty Delta.Text,
causing preamble_text to be forwarded; update the handler around the
"content_block_start"/"content_block_delta" cases to track the active block type
(use event.ContentBlock in the "content_block_start" case, store it on the
stream converter state e.g., sc.activeContentBlock, and clear it on
content_block_end), and only marshal/return text chunks in the
"content_block_delta" case when the active block is the final "text" block
(Anthropic has no separate "answer" block type; the answer is simply the last
"text" block, after thinking) — apply the same guarded logic to
responsesStreamConverter so only answer text is streamed to clients.

437-464: ⚠️ Potential issue | 🟡 Minor

Streaming path emits preamble text blocks for adaptive thinking models, creating inconsistency with non-streaming behavior.

When adaptive thinking is enabled, Anthropic's response structure is [text("\n\n"), thinking(...), text(answer)] (as documented in extractTextContent). The non-streaming path correctly picks only the last text block via extractTextContent. However, in streaming, convertEvent forwards every content_block_delta where Delta.Text != "", which means the preamble text block ("\n\n") is emitted to clients before the actual answer, creating unintended leading whitespace.

Thinking blocks are correctly skipped because their deltas have no Text field, but the initial text block (preamble) is indistinguishable from other text blocks in the current logic.

A minimal fix would track the content block type or index in streamConverter and suppress text deltas from non-final text blocks:

Suggested fix outline
 type streamConverter struct {
     reader           *bufio.Reader
     body             io.ReadCloser
     model            string
     msgID            string
     buffer           []byte
     closed           bool
+    activeBlockType  string // set from content_block_start
+    activeBlockIndex int
 }
 case "content_block_start":
+    if event.ContentBlock != nil {
+        sc.activeBlockType = event.ContentBlock.Type
+        sc.activeBlockIndex = event.Index
+    }
     return ""

 case "content_block_delta":
-    if event.Delta != nil && event.Delta.Text != "" {
+    if event.Delta != nil && event.Delta.Text != "" && sc.activeBlockType == "text" {
+        // Note: still emits text from ALL text blocks including preamble;
+        // a complete fix needs to detect which text block is the final one.

A complete solution requires either buffering to emit only the last text block (breaking the streaming contract) or detecting Anthropic's end_turn stop signal as the anchor for the final text block.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/anthropic/anthropic.go` around lines 437 - 464, The
streaming converter (convertEvent / streamConverter handling of
"content_block_delta") is emitting every text delta including Anthropic's
preamble text block, causing leading whitespace; change streamConverter to track
content block sequence/type and suppress non-final text blocks by recording the
current content block kind on "content_block_start" and only emitting text
deltas for the final text content block (or emit text deltas only after seeing
the final block's start or the "end_turn" stop anchor), e.g., add a field on
streamConverter to store the active content block type/index, update it on
"content_block_start" (and other start events), and in the
"content_block_delta" branch only marshal/return the delta when the active block
is the final text block (or when end_turn is reached). Ensure you reference
convertEvent, streamConverter, and extractTextContent when implementing this
logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/providers/anthropic/anthropic_test.go`:
- Around line 1384-1578: Add a test case to
TestConvertToAnthropicRequest_ReasoningEffort that verifies temperature == 1.0
is preserved when reasoning is enabled: extend the test table to include a field
(e.g., temperatureVal float64 or a boolean like setTempOne) and add an entry
such as "legacy model - preserves temperature=1.0 with reasoning" for model
"claude-3-5-sonnet-20241022" with reasoning Effort "medium", expectedThinkType
"enabled", expectedBudget 10000, expectedMaxTokens 15000 and expectNilTemp
false; in the test loop, set req.Temperature to &temperatureVal (use 1.0 when
the new flag is true) before calling convertToAnthropicRequest and assert
result.Temperature is non-nil and equals 1.0 to exercise the applyReasoning
branch that preserves temperature==1.0.
- Around line 1384-1578: The test suite
TestConvertToAnthropicRequest_ReasoningEffort lacks a case for temperature ==
1.0; add a case to ensure applyReasoning preserves temperature==1.0 when
reasoning is enabled. Extend the test table (in
TestConvertToAnthropicRequest_ReasoningEffort) with either a new field like
temperature float64 (default 0.7) or a boolean setTemperatureOne, add the test
case named "legacy model - preserves temperature=1.0" with model
"claude-3-5-sonnet-20241022", reasoning {Effort:"medium"}, maxTokens 15000,
expectedThinkType "enabled", expectedBudget 10000, expectedMaxTokens 15000 and
expectNilTemp false; update the loop that sets req.Temperature to use the new
temperature value (1.0 for this case) so convertToAnthropicRequest and
applyReasoning are exercised with temperature == 1.0.
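The temperature rule these test cases target can be sketched in isolation. It assumes, per Anthropic's extended thinking constraints, that only temperature == 1 is compatible with thinking, so with reasoning enabled an explicit 1.0 is preserved while other values are dropped; normalizeTemperature is a hypothetical name for this isolated piece of applyReasoning:

```go
package main

import "fmt"

// normalizeTemperature: with reasoning enabled, preserve an explicit 1.0
// (the case the review asks the test table to cover) and drop any other
// value, since it is incompatible with thinking.
func normalizeTemperature(temp *float64, reasoningEnabled bool) *float64 {
	if !reasoningEnabled || temp == nil {
		return temp
	}
	if *temp == 1.0 {
		return temp // preserved
	}
	return nil // omitted from the outgoing request
}

func main() {
	one, other := 1.0, 0.7
	fmt.Println(normalizeTemperature(&one, true) != nil)   // preserved
	fmt.Println(normalizeTemperature(&other, true) != nil) // dropped
}
```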

---

Outside diff comments:
In `@internal/providers/anthropic/anthropic.go`:
- Around line 437-464: The stream converter currently emits every
content_block_delta with non-empty Delta.Text, causing preamble_text to be
forwarded; update the handler around the
"content_block_start"/"content_block_delta" cases to track the active block type
(use event.ContentBlock in the "content_block_start" case, store it on the
stream converter state e.g., sc.activeContentBlock, and clear it on
content_block_end), and only marshal/return text chunks in the
"content_block_delta" case when the active block is the final "text" block
(Anthropic has no separate "answer" block type; the answer is simply the last
"text" block, after thinking) — apply the same guarded logic to
responsesStreamConverter so only answer text is streamed to clients.
- Around line 437-464: The streaming converter (convertEvent / streamConverter
handling of "content_block_delta") is emitting every text delta including
Anthropic's preamble text block, causing leading whitespace; change
streamConverter to track content block sequence/type and suppress non-final text
blocks by recording the current content block kind on "content_block_start" and
only emitting text deltas for the final text content block (or emit text deltas
only after seeing the final block's start or the "end_turn" stop anchor), e.g.,
add a field on streamConverter to store the active content block type/index,
update it on "content_block_start" (and other start events), and
in the "content_block_delta" branch only marshal/return the delta when the
active block is the final text block (or when end_turn is reached). Ensure you
reference convertEvent, streamConverter, and extractTextContent when
implementing this logic.

---

Duplicate comments:
In `@internal/providers/anthropic/anthropic_test.go`:
- Around line 1580-1734: The test suite misses coverage for temperature == 1.0:
in TestConvertResponsesRequestToAnthropic_ReasoningEffort the helper always sets
temp := 0.7, so add a new test case entry in the tests slice that sets
setTemperature true but uses temperature 1.0 (e.g., model "claude-opus-4-6" with
Reasoning non-nil like Effort "high"), set expectNilTemp to false and
appropriate expectedThinkType/expectedEffort/expectedMaxTokens; run the same
assertions (via convertResponsesRequestToAnthropic) to ensure result.Temperature
is non-nil when temperature == 1.0 and reasoning is enabled. Ensure the new case
is added alongside existing cases in
TestConvertResponsesRequestToAnthropic_ReasoningEffort.
- Around line 1580-1733: Add a test row to
TestConvertResponsesRequestToAnthropic_ReasoningEffort that exercises a
Temperature explicitly set to 1.0 so it is preserved (expectNilTemp=false):
create a case (e.g. name "temperature 1.0 preserved") with Model like
"claude-3-5-sonnet-20241022", Reasoning nil, MaxOutputTokens intPtr(1000),
setTemperature true (and in the test set temp := 1.0 on the Requests
Temperature), and expectedMaxTokens 1000 and expectNilTemp false; this verifies
convertResponsesRequestToAnthropic preserves a Temperature of 1.0 on
core.ResponsesRequest.


@SantiagoDePolonia SantiagoDePolonia left a comment


LGTM!

@giorgi-pro giorgi-pro merged commit e3a5ffa into main Feb 22, 2026
12 checks passed
@giorgi-pro giorgi-pro deleted the me/gom-33-anthropic-reasoning-error-when-i-request-an-opus-model branch February 22, 2026 11:23