Ensure Anthropic max_tokens clears the thinking budget #52
Greptile Summary
This PR fixes a 400 error that occurs when Anthropic's
Confidence Score: 5/5
Safe to merge: the change only raises max_tokens when needed and cannot reduce it, so existing behaviour for callers who already set a high value or use adaptive/disabled thinking is unchanged.
The helper is purely additive: it raises max_tokens or leaves it alone, never lowers it. Both active call sites are updated, the legacy stub is correctly unaffected, overflow is guarded with saturating_add, and the four unit tests cover all meaningful boundary conditions, including the edge case between budget and budget+margin.
No files require special attention.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Incoming request with reasoning.effort] --> B{Resolve max_tokens from payload / config / default}
B --> C[convert_*_reasoning_config produces thinking + output_config]
C --> D{thinking variant?}
D -- "Enabled { budget_tokens }" --> E["desired = budget_tokens + THINKING_OUTPUT_MARGIN (8 000)"]
E --> F{"max_tokens >= desired?"}
F -- "Yes (caller set generous value)" --> G[Keep max_tokens unchanged]
F -- "No (too low)" --> H[Raise max_tokens to desired, emit tracing::debug!]
D -- "Adaptive / Disabled / None" --> G
G --> I[Build AnthropicRequest with final max_tokens]
H --> I
I --> J[POST /v1/messages]
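The decision in the flowchart can be sketched in Rust as follows. This is a minimal illustration, not the PR's actual code: the function name `ensure_max_tokens_clears_budget`, the shape of the `Thinking` enum, and the use of `u32` are assumptions made for the example. Only `THINKING_OUTPUT_MARGIN` (8 000), the `Enabled { budget_tokens }` variant, and the use of `saturating_add` come from the summary above. The `tracing::debug!` call is left as a comment so the sketch has no external dependencies.

```rust
/// Margin reserved for visible output on top of the thinking budget.
/// The value 8 000 is taken from the flowchart above.
const THINKING_OUTPUT_MARGIN: u32 = 8_000;

/// Hypothetical stand-in for the thinking configuration produced by the
/// reasoning-config conversion step; the real type may differ.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Thinking {
    Enabled { budget_tokens: u32 },
    Adaptive,
    Disabled,
}

/// Returns a max_tokens value that is at least budget_tokens plus the margin
/// when thinking is enabled with an explicit budget. The result is never
/// lower than the input, so callers with a generous value are unaffected.
/// (Hypothetical helper name; an assumption for illustration.)
#[allow(dead_code)]
fn ensure_max_tokens_clears_budget(max_tokens: u32, thinking: Option<Thinking>) -> u32 {
    match thinking {
        Some(Thinking::Enabled { budget_tokens }) => {
            // saturating_add clamps at u32::MAX instead of overflowing.
            let desired = budget_tokens.saturating_add(THINKING_OUTPUT_MARGIN);
            if max_tokens >= desired {
                max_tokens
            } else {
                // The PR emits tracing::debug! here; omitted in this sketch.
                desired
            }
        }
        // Adaptive, Disabled, or no thinking config: leave max_tokens alone.
        _ => max_tokens,
    }
}
```

Under these assumptions, a request with `max_tokens = 4_096` and `budget_tokens = 10_000` would be sent with `max_tokens = 18_000`, while a request that already asks for `64_000` keeps its value.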
Reviews (3): Last reviewed commit: "Review fixes"
No description provided.