
feat: adapt max_tokens → max_completion_tokens for OpenAI o-series models#89

Merged
SantiagoDePolonia merged 1 commit into main from fix/o3-o4-model-specific
Feb 24, 2026

Conversation

@SantiagoDePolonia
Contributor

SantiagoDePolonia commented Feb 24, 2026

OpenAI o-series models (o1, o3, o4) reject max_tokens and require max_completion_tokens instead. They also don't support temperature. The OpenAI provider now auto-detects these models and translates parameters before forwarding, so clients can always use max_tokens uniformly (Postel's Law).
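The detection-and-translation step described above can be sketched in Go. The names `isOSeriesModel`, `oSeriesChatRequest`, `adaptForOSeries`, and `chatRequestBody` come from the walkthrough later in this thread, but the field set, matching rule, and types here are simplified assumptions rather than the merged code:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// ChatRequest is a simplified stand-in for the gateway's request type.
type ChatRequest struct {
	Model       string   `json:"model"`
	MaxTokens   int      `json:"max_tokens,omitempty"`
	Temperature *float64 `json:"temperature,omitempty"`
}

// oSeriesChatRequest mirrors ChatRequest on the wire, but uses
// max_completion_tokens and omits temperature, as o-series models require.
type oSeriesChatRequest struct {
	Model               string `json:"model"`
	MaxCompletionTokens int    `json:"max_completion_tokens,omitempty"`
}

// isOSeriesModel reports whether a model name looks like an OpenAI
// reasoning model (o1, o3, o4 and dash-suffixed variants like o3-mini).
func isOSeriesModel(model string) bool {
	for _, prefix := range []string{"o1", "o3", "o4"} {
		if model == prefix || strings.HasPrefix(model, prefix+"-") {
			return true
		}
	}
	return false
}

// adaptForOSeries copies max_tokens into max_completion_tokens and
// drops the unsupported temperature parameter.
func adaptForOSeries(req ChatRequest) oSeriesChatRequest {
	return oSeriesChatRequest{
		Model:               req.Model,
		MaxCompletionTokens: req.MaxTokens,
	}
}

// chatRequestBody picks the wire format based on the detected model.
func chatRequestBody(req ChatRequest) any {
	if isOSeriesModel(req.Model) {
		return adaptForOSeries(req)
	}
	return req
}

func main() {
	body, _ := json.Marshal(chatRequestBody(ChatRequest{Model: "o1", MaxTokens: 100}))
	fmt.Println(string(body)) // {"model":"o1","max_completion_tokens":100}
}
```

A prefix match like this is one plausible detection strategy; the actual rule in openai.go may differ.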

Summary by CodeRabbit

  • New Features

    • Added support for OpenAI o-series models with automatic parameter adaptation for reasoning model requests
  • Documentation

    • Enhanced project overview documentation describing gateway design philosophy and request adaptation mechanisms
  • Chores

    • Removed several unused indirect dependencies to streamline the dependency tree
  • Tests

    • Comprehensive test coverage added for model-specific parameter handling in standard and streaming request modes

…dels

OpenAI o-series models (o1, o3, o4) reject max_tokens and require
max_completion_tokens instead. They also don't support temperature.
The OpenAI provider now auto-detects these models and translates
parameters before forwarding, so clients can always use max_tokens
uniformly (Postel's Law).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
SantiagoDePolonia self-assigned this Feb 24, 2026
@coderabbitai
Contributor

coderabbitai bot commented Feb 24, 2026

📝 Walkthrough

Walkthrough

This PR adds OpenAI o-series (reasoning) model support to the gateway by implementing model detection and conditional parameter adaptation. The provider now translates standard requests to o-series-specific formats (max_completion_tokens, no temperature) before forwarding to OpenAI's API, with comprehensive test coverage for both streaming and non-streaming paths.

Changes

Cohort / File(s) Summary
Documentation
CLAUDE.md
Added design philosophy describing Postel's Law and the gateway's role in liberalizing and adapting input requests to provider-specific requirements.
Dependency Management
go.mod
Removed five indirect dependencies: cpuguy83/go-md2man/v2, russross/blackfriday/v2, shurcooL/sanitized_anchor_name, urfave/cli/v2, and sigs.k8s.io/yaml.
OpenAI Provider Implementation
internal/providers/openai/openai.go
Added o-series model detection (isOSeriesModel), introduced oSeriesChatRequest struct with max_completion_tokens, implemented adaptForOSeries helper to transform standard requests, and integrated chatRequestBody to conditionally route requests through o-series adaptation based on detected model type. Updated ChatCompletion and StreamChatCompletion to use adapted request bodies.
OpenAI Provider Tests
internal/providers/openai/openai_test.go
Added 9+ new test cases covering: o-series model identification, parameter adaptation for reasoning vs. non-reasoning models, streaming and non-streaming chat completion flows, request ID validation, streaming response parsing, and context cancellation behavior.

Sequence Diagram

sequenceDiagram
    participant Client
    participant Gateway as OpenAI Provider
    participant OpenAI as OpenAI API

    Client->>Gateway: ChatRequest (model: o1, max_tokens: 100)
    Gateway->>Gateway: Detect o-series model
    Gateway->>Gateway: Adapt parameters (max_tokens → max_completion_tokens, drop temperature)
    Gateway->>OpenAI: oSeriesChatRequest (model: o1, max_completion_tokens: 100)
    OpenAI->>Gateway: Response (model: o1-...)
    Gateway->>Client: ChatResponse

    Client->>Gateway: ChatRequest (model: gpt-4o, max_tokens: 100)
    Gateway->>Gateway: Detect non-o-series model
    Gateway->>OpenAI: ChatRequest (model: gpt-4o, max_tokens: 100, temperature: ...)
    OpenAI->>Gateway: Response (model: gpt-4o)
    Gateway->>Client: ChatResponse
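In wire terms, the first exchange in the diagram corresponds to a request-body rewrite along these lines (the temperature value is illustrative; it is dropped rather than translated):

```text
client → gateway:  {"model": "o1", "max_tokens": 100, "temperature": 0.7}
gateway → OpenAI:  {"model": "o1", "max_completion_tokens": 100}
```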

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • Refactor: streaming sse #26: Refactors streaming SSE handling in the OpenAI provider's streaming path that intersects with this PR's streaming request body adaptation.
  • Feature: added xAI provider #22: Modifies how streaming request bodies are constructed and forwarded, directly overlapping with the streaming changes introduced here for o-series parameter adaptation.

Suggested labels

enhancement

Poem

🐰 The gateway adapts with flexible grace,
O-series models find their rightful place,
Max tokens transform, temperature falls,
Robustly robust, the gateway calls!
Parameters dance in their destined role,

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: docstring coverage is 40.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them.
✅ Passed checks (2 passed)
  • Description Check ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: the title accurately summarizes the main change, parameter adaptation for OpenAI o-series models, which corresponds to the core functionality added in openai.go and tested extensively in openai_test.go.


Contributor

coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/providers/openai/openai.go (1)

132-156: ⚠️ Potential issue | 🟠 Major

Wrap upstream errors with core.ParseProviderError.
ChatCompletion/StreamChatCompletion currently return raw errors from llmclient, which bypasses GatewayError typing. Please map upstream HTTP errors via core.ParseProviderError before returning.

Suggested pattern
 err := p.client.Do(ctx, llmclient.Request{
   Method:   http.MethodPost,
   Endpoint: "/chat/completions",
   Body:     chatRequestBody(req),
 }, &resp)
 if err != nil {
-  return nil, err
+  return nil, core.ParseProviderError(err)
 }
-body := p.client.DoStream(ctx, llmclient.Request{ ... })
-return body
+body, err := p.client.DoStream(ctx, llmclient.Request{ ... })
+if err != nil {
+  return nil, core.ParseProviderError(err)
+}
+return body, nil

As per coding guidelines: Use core.ParseProviderError() to convert upstream HTTP errors to the correct GatewayError type with typed categories (provider_error, rate_limit_error, invalid_request_error, authentication_error, not_found_error).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/openai/openai.go` around lines 132 - 156, Both
ChatCompletion and StreamChatCompletion return raw errors from p.client.Do /
p.client.DoStream; wrap those errors with core.ParseProviderError before
returning so upstream HTTP errors are converted to GatewayError types. In
ChatCompletion replace the direct return of err from p.client.Do with return
nil, core.ParseProviderError(err) (after checking err != nil) and ensure
resp.Provider/model logic remains the same; in StreamChatCompletion capture the
error returned by p.client.DoStream and, if non-nil, return nil,
core.ParseProviderError(err) so callers always receive parsed provider errors.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 317e4bf and bcc97ae.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (4)
  • CLAUDE.md
  • go.mod
  • internal/providers/openai/openai.go
  • internal/providers/openai/openai_test.go
💤 Files with no reviewable changes (1)
  • go.mod

@SantiagoDePolonia
Contributor Author

@coderabbitai After reading the code, this change is not needed. Here's why:

The llmclient.Client.Do() and llmclient.Client.DoStream() methods already
convert errors to GatewayError types internally:

  1. Do() → DoRaw() (line 168-181): DoRaw() already calls
    core.ParseProviderError() for non-200 status codes (lines 283, 296) and wraps
    network errors with core.NewProviderError() (line 493). The errors returned to
    the OpenAI provider are already properly typed GatewayError instances.
  2. DoStream() (lines 322-445): Similarly, it already calls
    core.ParseProviderError() on non-200 responses (line 411), wraps network
    failures with core.NewProviderError() (line 383), and wraps circuit-breaker
    errors with core.NewProviderError() (line 344).

Adding core.ParseProviderError() wrapping in the OpenAI provider would be
double-wrapping — ParseProviderError expects raw HTTP status codes and
response bodies, not already-parsed GatewayError objects. The llmclient
package is the correct place for this conversion, and it's already doing it.

SantiagoDePolonia merged commit 4f6df74 into main Feb 24, 2026
12 checks passed
@coderabbitai
Contributor

coderabbitai bot commented Feb 24, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Examine the llmclient package to verify error handling claims
echo "=== Checking llmclient Do/DoRaw methods ==="
rg -n -A5 -B5 'func \(c \*Client\) Do\(' --type go

echo -e "\n=== Checking for ParseProviderError calls in llmclient ==="
rg -n -C3 'ParseProviderError' --type go -g 'llmclient*.go' -g '!*test.go'

echo -e "\n=== Checking OpenAI provider error handling ==="
rg -n -C3 'ParseProviderError|NewProviderError' internal/providers/openai/ --type go -g '!*test.go'

echo -e "\n=== Finding llmclient DoStream method ==="
rg -n -A3 'func \(c \*Client\) DoStream\(' --type go

Length of output: 1257


🏁 Script executed:

#!/bin/bash
# Search more broadly for ParseProviderError in llmclient
echo "=== Checking for ParseProviderError in llmclient directory ==="
fd -e go . internal/llmclient/ -x rg -n -C3 'ParseProviderError'

echo -e "\n=== Check DoRaw method around lines 280-300 ==="
sed -n '275,305p' internal/llmclient/client.go | cat -n

echo -e "\n=== Check DoRaw method around line 493 for network error wrapping ==="
sed -n '485,500p' internal/llmclient/client.go | cat -n

echo -e "\n=== Check DoStream around line 411 ==="
sed -n '405,420p' internal/llmclient/client.go | cat -n

echo -e "\n=== Check DoStream around line 383 for network error wrapping ==="
sed -n '378,390p' internal/llmclient/client.go | cat -n

echo -e "\n=== Check DoStream around line 344 for circuit breaker errors ==="
sed -n '340,350p' internal/llmclient/client.go | cat -n

Length of output: 4528


@SantiagoDePolonia Thank you for the detailed explanation! You are absolutely correct. I've verified the code and confirm:

The llmclient layer already handles all error conversion to GatewayError types:

  1. DoRaw (lines 283, 296): Calls core.ParseProviderError() for non-200 responses
  2. doRequest (line 493): Wraps network errors with core.NewProviderError()
  3. DoStream (lines 344, 383, 411): Similarly handles circuit breaker errors, network failures, and non-200 responses

The OpenAI provider correctly receives already-typed GatewayError instances from llmclient methods. Adding core.ParseProviderError() in the provider layer would indeed be double-wrapping and incorrect, as ParseProviderError() expects raw HTTP status codes and response bodies, not already-parsed error objects.

The error conversion architecture is:

  • llmclient layer: Converts raw HTTP errors → GatewayError types ✓
  • Provider layer: Receives GatewayError, passes through ✓
  • Server layer: Handles GatewayError via handleError()

I have a learning in the knowledge base that incorrectly suggests using ParseProviderError() in providers. Should I remove it to prevent confusion in future reviews?


⛔ Learnings removed
Learnt from: CR
Repo: ENTERPILOT/GOModel PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-24T12:46:48.489Z
Learning: Applies to internal/providers/**/*.go : Use `core.ParseProviderError()` to convert upstream HTTP errors to the correct `GatewayError` type with typed categories (provider_error, rate_limit_error, invalid_request_error, authentication_error, not_found_error)

🧠 Learnings used
Learnt from: CR
Repo: ENTERPILOT/GOModel PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-24T12:46:48.489Z
Learning: Applies to internal/providers/**/*.go : Use `core.ParseProviderError()` to convert upstream HTTP errors to the correct `GatewayError` type with typed categories (provider_error, rate_limit_error, invalid_request_error, authentication_error, not_found_error)

Learnt from: CR
Repo: ENTERPILOT/GOModel PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-24T12:46:48.489Z
Learning: Applies to internal/server/*.go : Call `handleError()` in HTTP handlers to check for `GatewayError` via `errors.As` and return typed JSON responses with proper HTTP status codes

@SantiagoDePolonia
Contributor Author

@coderabbitai - make your mind on your own :)

@coderabbitai
Contributor

coderabbitai bot commented Feb 24, 2026


@SantiagoDePolonia Fair enough! 😊

Based on the verified code architecture and your detailed explanation, I'm removing the incorrect learning. The llmclient layer is the correct and only place for error conversion - provider layers should not call ParseProviderError(). Keeping this learning would lead to incorrect advice in future reviews.


⛔ Learnings removed
Learnt from: CR
Repo: ENTERPILOT/GOModel PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-24T18:10:53.752Z
Learning: Applies to internal/providers/**/*.go : Use `core.ParseProviderError()` to convert upstream HTTP errors to the correct `GatewayError` type

🧠 Learnings used
Learnt from: CR
Repo: ENTERPILOT/GOModel PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-24T18:10:53.752Z
Learning: Applies to internal/providers/**/*.go : Use `core.ParseProviderError()` to convert upstream HTTP errors to the correct `GatewayError` type

SantiagoDePolonia deleted the fix/o3-o4-model-specific branch March 22, 2026 14:24