feat: adapt max_tokens → max_completion_tokens for OpenAI o-series models (#89)
Conversation
OpenAI o-series models (o1, o3, o4) reject max_tokens and require max_completion_tokens instead. They also don't support temperature. The OpenAI provider now auto-detects these models and translates parameters before forwarding, so clients can always use max_tokens uniformly (Postel's Law).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
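The detection-and-translation behavior described above can be sketched as below. The names (`ChatRequest`, `isOSeriesModel`, `adaptForOSeries`) and the exact prefix-matching rule are illustrative assumptions, not the PR's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// ChatRequest is a simplified stand-in for the gateway's request type
// (hypothetical fields for illustration).
type ChatRequest struct {
	Model               string   `json:"model"`
	MaxTokens           int      `json:"max_tokens,omitempty"`
	MaxCompletionTokens int      `json:"max_completion_tokens,omitempty"`
	Temperature         *float64 `json:"temperature,omitempty"`
}

// isOSeriesModel reports whether the model is an OpenAI reasoning model
// (o1, o3, o4 and their suffixed variants such as o3-mini).
func isOSeriesModel(model string) bool {
	for _, prefix := range []string{"o1", "o3", "o4"} {
		if model == prefix || strings.HasPrefix(model, prefix+"-") {
			return true
		}
	}
	return false
}

// adaptForOSeries moves max_tokens into max_completion_tokens and drops
// temperature, which o-series models reject; non-o-series requests pass
// through unchanged.
func adaptForOSeries(req ChatRequest) ChatRequest {
	if !isOSeriesModel(req.Model) {
		return req
	}
	req.MaxCompletionTokens = req.MaxTokens
	req.MaxTokens = 0
	req.Temperature = nil
	return req
}

func main() {
	t := 0.7
	out := adaptForOSeries(ChatRequest{Model: "o1", MaxTokens: 100, Temperature: &t})
	fmt.Println(out.MaxTokens, out.MaxCompletionTokens, out.Temperature == nil)
	// → 0 100 true
}
```

Because the translation happens inside the provider, clients keep sending `max_tokens` uniformly regardless of model.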
📝 Walkthrough

This PR adds OpenAI o-series (reasoning) model support to the gateway by implementing model detection and conditional parameter adaptation. The provider now translates standard requests to o-series-specific formats (max_completion_tokens, no temperature) before forwarding to OpenAI's API, with comprehensive test coverage for both streaming and non-streaming paths.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant Gateway as OpenAI Provider
    participant OpenAI as OpenAI API
    Client->>Gateway: ChatRequest (model: o1, max_tokens: 100)
    Gateway->>Gateway: Detect o-series model
    Gateway->>Gateway: Adapt parameters (max_tokens → max_completion_tokens, drop temperature)
    Gateway->>OpenAI: oSeriesChatRequest (model: o1, max_completion_tokens: 100)
    OpenAI->>Gateway: Response (model: o1-...)
    Gateway->>Client: ChatResponse
    Client->>Gateway: ChatRequest (model: gpt-4o, max_tokens: 100)
    Gateway->>Gateway: Detect non-o-series model
    Gateway->>OpenAI: ChatRequest (model: gpt-4o, max_tokens: 100, temperature: ...)
    OpenAI->>Gateway: Response (model: gpt-4o)
    Gateway->>Client: ChatResponse
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/providers/openai/openai.go (1)
Lines 132-156: ⚠️ Potential issue | 🟠 Major

Wrap upstream errors with `core.ParseProviderError`.

`ChatCompletion`/`StreamChatCompletion` currently return raw errors from `llmclient`, which bypasses `GatewayError` typing. Please map upstream HTTP errors via `core.ParseProviderError` before returning.

Suggested pattern:

```diff
 err := p.client.Do(ctx, llmclient.Request{
 	Method:   http.MethodPost,
 	Endpoint: "/chat/completions",
 	Body:     chatRequestBody(req),
 }, &resp)
 if err != nil {
-	return nil, err
+	return nil, core.ParseProviderError(err)
 }
```

```diff
-body := p.client.DoStream(ctx, llmclient.Request{ ... })
-return body
+body, err := p.client.DoStream(ctx, llmclient.Request{ ... })
+if err != nil {
+	return nil, core.ParseProviderError(err)
+}
+return body, nil
```

As per coding guidelines: Use `core.ParseProviderError()` to convert upstream HTTP errors to the correct `GatewayError` type with typed categories (provider_error, rate_limit_error, invalid_request_error, authentication_error, not_found_error).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/providers/openai/openai.go` around lines 132 - 156, Both ChatCompletion and StreamChatCompletion return raw errors from p.client.Do / p.client.DoStream; wrap those errors with core.ParseProviderError before returning so upstream HTTP errors are converted to GatewayError types. In ChatCompletion replace the direct return of err from p.client.Do with return nil, core.ParseProviderError(err) (after checking err != nil) and ensure resp.Provider/model logic remains the same; in StreamChatCompletion capture the error returned by p.client.DoStream and, if non-nil, return nil, core.ParseProviderError(err) so callers always receive parsed provider errors.
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)

`go.sum` is excluded by `!**/*.sum`
📒 Files selected for processing (4)

- CLAUDE.md
- go.mod
- internal/providers/openai/openai.go
- internal/providers/openai/openai_test.go
💤 Files with no reviewable changes (1)
- go.mod
@coderabbitai ⏺ After reading the code, this change is not needed. Here's why: the `llmclient.Client.Do()` and `llmclient.Client.DoStream()` methods already convert upstream HTTP errors to `GatewayError` types. Adding `core.ParseProviderError()` wrapping in the OpenAI provider would be redundant.
The llmclient layer already handles all error conversion to `GatewayError` types. The OpenAI provider correctly receives already-typed errors, so the conversion lives in exactly one place: the llmclient layer.

I have a learning in the knowledge base that incorrectly suggests using `core.ParseProviderError()` in provider code; it should be removed.
@coderabbitai - make up your mind on your own :)
Based on the verified code architecture and your detailed explanation, I'm removing the incorrect learning. The llmclient layer is the correct and only place for error conversion - provider layers should not call `core.ParseProviderError()` themselves.
Summary by CodeRabbit

- New Features
- Documentation
- Chores
- Tests