
fix: max_tokens change to max_completion_tokens #68

Merged

galzilber merged 4 commits into main from gz/fix-max-tokens-decrepet-openai on Sep 7, 2025

Conversation

@galzilber (Contributor) commented Sep 7, 2025

Important

Prioritize max_completion_tokens over max_tokens in OpenAIChatCompletionRequest and ensure only one token limit is sent.

  • Behavior:
    • In OpenAIChatCompletionRequest, prioritize max_completion_tokens over max_tokens if provided and >0.
    • Set max_tokens to None to ensure only one token limit is sent.
    • Move reasoning to top-level format for compatibility.
  • Compatibility:
    • No changes to public APIs; integrations remain compatible.
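The precedence rule described above can be sketched in Rust. This is a minimal illustration of the selection logic only; the function name and standalone signature are assumptions, not the actual `From<ChatCompletionRequest>` implementation in src/providers/openai/provider.rs:

```rust
// Sketch of the token-limit precedence: `max_completion_tokens` wins when
// set and > 0; otherwise fall back to `max_tokens` when set and > 0;
// otherwise no limit is sent. A value of 0 is treated as absent.
fn effective_limit(max_completion_tokens: Option<u32>, max_tokens: Option<u32>) -> Option<u32> {
    match (max_completion_tokens, max_tokens) {
        (Some(n), _) if n > 0 => Some(n),
        (_, Some(n)) if n > 0 => Some(n),
        _ => None,
    }
}

fn main() {
    assert_eq!(effective_limit(Some(256), Some(64)), Some(256)); // explicit cap wins
    assert_eq!(effective_limit(Some(0), Some(64)), Some(64));    // 0 falls back to max_tokens
    assert_eq!(effective_limit(None, None), None);               // nothing sent
    println!("ok");
}
```

Because the chosen value lands in a single field and `base.max_tokens` is then cleared, the serialized request can never carry two conflicting limits.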

This description was created by Ellipsis for aed8234. You can customize this summary. It will automatically update as commits are pushed.


Summary by CodeRabbit

  • New Features

    • Added an optional cap for completion tokens that, when provided (>0), takes precedence for chat responses.
  • Refactor

    • Ensured only a single effective token limit is sent to the model to avoid conflicting limits.
    • Moved reasoning to the provider’s top-level format for improved compatibility.
    • No changes to public APIs; integrations remain compatible.

@coderabbitai (Contributor) bot commented Sep 7, 2025

Walkthrough

Converts ChatCompletionRequest into OpenAIChatCompletionRequest by propagating completion token limits into max_completion_tokens (prefer base.max_completion_tokens then base.max_tokens), clearing base.max_tokens and base.reasoning, and placing reasoning into top-level reasoning_effort. No public signatures changed.

Changes

Cohort / File(s) Summary
OpenAI provider request mapping
src/providers/openai/provider.rs
In From<ChatCompletionRequest> for OpenAIChatCompletionRequest: compute and set max_completion_tokens (use base.max_completion_tokens if > 0, else base.max_tokens if > 0), set base.max_tokens = None, clear base.reasoning = None, and map reasoning into top-level reasoning_effort. No exported signatures changed.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant ChatReq as ChatCompletionRequest
  participant Mapper as From<ChatCompletionRequest>
  participant OpenAI as OpenAIChatCompletionRequest

  Client->>Mapper: provide ChatCompletionRequest
  alt base.max_completion_tokens > 0
    Mapper->>OpenAI: set max_completion_tokens = base.max_completion_tokens
  else base.max_tokens > 0
    Mapper->>OpenAI: set max_completion_tokens = base.max_tokens
  else
    Mapper->>OpenAI: max_completion_tokens = None
  end
  Mapper->>ChatReq: set base.max_tokens = None
  Mapper->>ChatReq: set base.reasoning = None
  alt reasoning present
    Mapper->>OpenAI: set reasoning_effort = reasoning
  end
  Mapper-->>Client: return OpenAIChatCompletionRequest

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested reviewers

  • nirga

Poem

I hop and nudge the request just right,
Tokens find their proper place tonight,
Reasoning rises to the top with grace,
Base fields cleared—no trailing trace.
Rabbit patch applied, into the race! 🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 25539ca and aed8234.

📒 Files selected for processing (1)
  • src/providers/openai/provider.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/providers/openai/provider.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build
  • GitHub Check: Analyze (rust)

@ellipsis-dev (Contributor) bot left a comment

Important

Looks good to me! 👍

Reviewed everything up to b24aa11 in 53 seconds.
  • Reviewed 45 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. src/providers/openai/provider.rs:30
  • Draft comment:
    Ensure the logic for selecting max_completion_tokens is intentional. If a value of 0 is provided, it falls back to max_tokens. Confirm that treating 0 as absent is desired.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
2. src/providers/openai/provider.rs:37
  • Draft comment:
    Clearing both max_tokens and max_completion_tokens from base may have side effects if the base request is used elsewhere; ensure this mutation is safe.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.

Workflow ID: wflow_kQohFylEu5CN06Bs

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@ellipsis-dev (Contributor) bot left a comment

Important

Looks good to me! 👍

Reviewed 423420b in 37 seconds.
  • Reviewed 26 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
1. src/providers/openai/provider.rs:32
  • Draft comment:
    Max completion tokens conversion is clear: using max_completion_tokens if >0 and falling back to max_tokens, then clearing both fields to avoid duplication. Confirm that this behavior is intended.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
2. src/providers/openai/provider.rs:97
  • Draft comment:
    Extraneous whitespace removed for clarity. Ensure this aligns with the project's formatting standards.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_d9Ucibn5dzaDH5lw


@coderabbitai (Contributor) bot left a comment

Actionable comments posted: 2

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 09f5221 and 423420b.

📒 Files selected for processing (1)
  • src/providers/openai/provider.rs (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/providers/openai/provider.rs (2)
src/providers/azure/provider.rs (1)
  • from (26-36)
src/providers/anthropic/models.rs (3)
  • from (85-166)
  • from (170-206)
  • from (210-231)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build
  • GitHub Check: Analyze (rust)
🔇 Additional comments (2)
src/providers/openai/provider.rs (2)

47-47: Wiring looks correct.
Field is plumbed into the payload as intended.


37-40: No action needed—max_tokens and max_completion_tokens are correctly skipped when None

Comment on lines +22 to +23
#[serde(skip_serializing_if = "Option::is_none")]
max_completion_tokens: Option<u32>,

💡 Verification agent

🧩 Analysis chain

Confirm OpenAI model compatibility for max_completion_tokens.

Some non-reasoning OpenAI chat models historically expect max_tokens. Since we now only send max_completion_tokens (and drop both from base), please confirm all target models accept this field to avoid 400s on older models. If not, consider model-conditional mapping.


🌐 Web query:

As of September 2025, which OpenAI Chat Completions models accept the `max_completion_tokens` field vs. `max_tokens`? Is `max_completion_tokens=0` valid, or should it be omitted?

💡 Result:

Short answer

  • Newer "reasoning"/chat models use max_completion_tokens (or max_output_tokens for Responses API); older GPT-3.5/GPT‑4 chat models still use max_tokens. Use max_completion_tokens for o‑series (o1/o3/o4‑mini), the o‑reasoning family and newer gpt‑4.1 / gpt‑4o variants; use max_tokens for legacy gpt‑3.5 / gpt‑4 (turbo) chat endpoints. [1][2][3]

  • Setting max_completion_tokens = 0 is technically accepted by some schemas but is not a meaningful way to ask for “no output.” In practice some platforms/clients may reject an unrecognized param or treat 0 as invalid. Omit the field if you want default behavior (recommended), or set a positive integer if you want to cap generation. [1][4][5]

References

  • OpenAI Help Center — Controlling the length of OpenAI model responses (notes which models use max_completion_tokens vs max_tokens). [1]
  • OpenAI announcement / Responses API docs (max_output_tokens vs chat params). [2]
  • Azure/other provider reports showing errors when max_completion_tokens is not supported for certain deployments (illustrates that support varies by model/deployment). [4]
  • OpenAPI / client docs showing parameter meanings and that some clients treat max_tokens as deprecated in favor of max_completion_tokens. [3]

If you want, tell me which exact model string (e.g., gpt-4o, gpt-4o-2024-08-06, gpt-3.5-turbo-1106, an Azure deployment name) you plan to call and I’ll confirm the exact parameter to use and whether that deployment accepts a zero value.


Implement model-specific token parameter mapping. For legacy gpt-3.5-turbo* and gpt-4-turbo* endpoints send max_tokens; for o-series (o1/o3/o4-mini), o-reasoning, and newer gpt-4.1/gpt-4o variants send max_completion_tokens. Omit the field when the limit is zero or None.
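That suggestion could be sketched as below. The prefix list and helper name are illustrative assumptions drawn from the web-query summary above, not an authoritative OpenAI compatibility table; verify against the current model list before relying on it:

```rust
// Illustrative model-conditional mapping: decide which token-limit field to
// send based on the model name. Prefixes are assumptions, not a definitive
// compatibility matrix.
#[derive(Debug, PartialEq)]
enum TokenParam {
    MaxTokens,           // legacy gpt-3.5-turbo* / gpt-4-turbo* endpoints
    MaxCompletionTokens, // o-series and newer gpt-4o / gpt-4.1 variants
}

fn token_param_for(model: &str) -> TokenParam {
    const LEGACY_PREFIXES: [&str; 2] = ["gpt-3.5-turbo", "gpt-4-turbo"];
    if LEGACY_PREFIXES.iter().any(|p| model.starts_with(p)) {
        TokenParam::MaxTokens
    } else {
        TokenParam::MaxCompletionTokens
    }
}

fn main() {
    assert_eq!(token_param_for("gpt-3.5-turbo-1106"), TokenParam::MaxTokens);
    assert_eq!(token_param_for("o4-mini"), TokenParam::MaxCompletionTokens);
    assert_eq!(token_param_for("gpt-4o"), TokenParam::MaxCompletionTokens);
    println!("ok");
}
```

Combined with omitting the field when the limit is zero or `None`, this would avoid 400s on older deployments that reject `max_completion_tokens`.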

@ellipsis-dev (Contributor) bot left a comment

Important

Looks good to me! 👍

Reviewed 25539ca in 54 seconds.
  • Reviewed 38 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 draft comments. View those below.
1. src/providers/openai/provider.rs:17
  • Draft comment:
    Removal of the separate max_completion_tokens field clarifies that the effective token limit is now maintained in the base request. Ensure that ChatCompletionRequest properly serializes this field as required by the API.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. src/providers/openai/provider.rs:30
  • Draft comment:
    Directly assigning the computed token limit to base.max_completion_tokens avoids duplication. Verify that clearing base.max_tokens (line 35) fully prevents sending conflicting token limits.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
3. src/providers/openai/provider.rs:40
  • Draft comment:
    Preserving the computed max_completion_tokens in the flattened base ensures the payload includes the proper token cap. Double-check that this aligns with the external API’s expectations and that no unintended null fields are serialized.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% The comment is asking the PR author to double-check alignment with external API expectations and to ensure no unintended null fields are serialized. This violates the rule against asking the author to confirm or ensure behavior.

Workflow ID: wflow_fOxw5kykV1yLKdgB


@ellipsis-dev (Contributor) bot left a comment

Important

Looks good to me! 👍

Reviewed aed8234 in 41 seconds.
  • Reviewed 27 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
1. src/providers/openai/provider.rs:28
  • Draft comment:
    The new tuple matching for max_completion_tokens prioritizes valid (>0) values. Confirm that negative or zero values are intended to fall back to None.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
2. src/providers/openai/provider.rs:41
  • Draft comment:
    Added trailing comma for reasoning_effort in the struct initialization improves readability and future modifications.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_GysNvMLVVMu29tfq


@galzilber galzilber merged commit 87c8971 into main Sep 7, 2025
6 checks passed