[Bug]: Fix max_tokens parameter for newer OpenAI models (gpt-5.x, o1, o3) #54

@stevechoi0222

Do you need to file an issue?

  • I have searched the existing issues and this bug is not already filed.
  • I believe this is a legitimate bug, not just a question or feature request.

Describe the bug

Agent base classes use the max_tokens parameter, which is not supported by newer OpenAI models (gpt-5.x, o1, o3, gpt-4o). These models require max_completion_tokens instead.

Steps to reproduce

  1. Set LLM_BINDING = openai
  2. Set LLM_MODEL = gpt-5.x
  3. Upload knowledge base
  4. Run solver

Expected Behavior

Agent calls should succeed with newer OpenAI models (gpt-5.x, o1, o3, gpt-4o) by sending max_completion_tokens instead of max_tokens.

🐛 Bug Description

Agent base classes use the max_tokens parameter, which is not supported by newer OpenAI models (gpt-5.x, o1, o3, gpt-4o). These models require max_completion_tokens instead, so API calls fail with a 400 error.

🔴 Error Message

Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}

Stack trace location:

  • src/agents/solve/base_agent.py line 188: kwargs["max_tokens"] = max_tokens
  • src/agents/solve/main_solver.py line 264: result = await self._run_dual_loop_pipeline(question, output_dir)

📋 Affected Models

The following OpenAI models require max_completion_tokens instead of max_tokens:

  • gpt-5.x and later (e.g., gpt-5.2)
  • gpt-4o series
  • o1, o3 series (reasoning models)

🔍 Root Cause

A utility function _get_token_limit_kwargs() already exists in src/api/routers/system.py (lines 46-59) that correctly handles both cases, but it's:

  1. Private (prefixed with _) and scoped to system.py only
  2. Only used in the /test/llm endpoint
  3. Not applied to any agent base classes

📁 Affected Files

All agent base classes hardcode max_tokens without checking the model type:

  1. src/agents/solve/base_agent.py (line 188)

    if max_tokens:
        kwargs["max_tokens"] = max_tokens  # ❌ Fails for gpt-5.x
  2. src/agents/research/agents/base_agent.py (line 134)

    if max_tokens:
        kwargs["max_tokens"] = max_tokens  # ❌ Fails for gpt-5.x
  3. src/agents/guide/agents/base_guide_agent.py (line 131)

    "max_tokens": max_tokens,  # ❌ Fails for gpt-5.x
  4. src/agents/ideagen/base_idea_agent.py (line 120)

    "max_tokens": max_tokens,  # ❌ Fails for gpt-5.x

✅ Existing Solution (Not Used)

The correct implementation already exists but is unused:

src/api/routers/system.py (lines 19-59):

def _uses_max_completion_tokens(model: str) -> bool:
    """Check if model requires max_completion_tokens"""
    patterns = [
        r"^o[13]",      # o1, o3 models
        r"^gpt-4o",     # gpt-4o models
        r"^gpt-[5-9]",  # gpt-5.x and later
        r"^gpt-\d{2,}", # gpt-10+ (future proofing)
    ]
    # ... checks model name against patterns

def _get_token_limit_kwargs(model: str, max_tokens: int) -> dict:
    """Get appropriate token limit parameter"""
    if _uses_max_completion_tokens(model):
        return {"max_completion_tokens": max_tokens}
    return {"max_tokens": max_tokens}

Used correctly in system.py line 154:

token_kwargs = _get_token_limit_kwargs(model, max_tokens=20)
response = await openai_complete_if_cache(..., **token_kwargs)
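
The body of _uses_max_completion_tokens() is elided in the excerpt above. As a minimal sketch of how the check could work, assuming each pattern is tried with re.match against a lowercased model name (the normalization step and the public, underscore-free names are assumptions for illustration, not the actual code in system.py):

```python
import re

# Patterns copied from the excerpt above.
_MAX_COMPLETION_TOKENS_PATTERNS = [
    r"^o[13]",       # o1, o3 models
    r"^gpt-4o",      # gpt-4o models
    r"^gpt-[5-9]",   # gpt-5.x and later
    r"^gpt-\d{2,}",  # gpt-10+ (future proofing)
]


def uses_max_completion_tokens(model: str) -> bool:
    """Return True if the model name matches any of the patterns above."""
    # Assumption: names are compared case-insensitively with whitespace trimmed.
    name = model.strip().lower()
    return any(re.match(p, name) for p in _MAX_COMPLETION_TOKENS_PATTERNS)


def get_token_limit_kwargs(model: str, max_tokens: int) -> dict:
    """Return the token-limit keyword argument appropriate for the model."""
    if uses_max_completion_tokens(model):
        return {"max_completion_tokens": max_tokens}
    return {"max_tokens": max_tokens}
```

Because the patterns are anchored at the start of the name, a model identifier with a provider prefix would not match and would fall back to max_tokens.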

🛠️ Proposed Solution

  1. Move utility functions to shared location:

    • Move _uses_max_completion_tokens() → uses_max_completion_tokens() (make public)
    • Move _get_token_limit_kwargs() → get_token_limit_kwargs() (make public)
    • Location: src/core/core.py (alongside other LLM configuration utilities)
  2. Update all agent base classes:

    • Import get_token_limit_kwargs from src.core.core
    • Replace direct kwargs["max_tokens"] = max_tokens with:
      if max_tokens:
          kwargs.update(get_token_limit_kwargs(model, max_tokens))
  3. Update system.py:

    • Import from src.core.core instead of local definition
    • Remove duplicate code
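
The affected files build their request arguments in two styles: incremental assignment (solve, research) and a dict literal (guide, ideagen). A rough sketch of how each call site might look after the change, using a simplified local stand-in for the shared helper so the example runs on its own. The builder function names and the prompt argument are hypothetical and are not taken from the repository:

```python
import re
from typing import Any, Dict, Optional


def get_token_limit_kwargs(model: str, max_tokens: int) -> Dict[str, int]:
    """Simplified stand-in for the helper proposed for src/core/core.py."""
    if re.match(r"^(o[13]|gpt-4o|gpt-[5-9]|gpt-\d{2,})", model):
        return {"max_completion_tokens": max_tokens}
    return {"max_tokens": max_tokens}


def build_kwargs_incremental(
    model: str, prompt: str, max_tokens: Optional[int] = None
) -> Dict[str, Any]:
    """Style used where kwargs["max_tokens"] is assigned conditionally."""
    kwargs: Dict[str, Any] = {"model": model, "prompt": prompt}
    if max_tokens:
        kwargs.update(get_token_limit_kwargs(model, max_tokens))
    return kwargs


def build_kwargs_literal(model: str, prompt: str, max_tokens: int) -> Dict[str, Any]:
    """Style used where "max_tokens" appears inside a dict literal."""
    return {
        "model": model,
        "prompt": prompt,
        # Unpack the helper's result in place of the hardcoded key.
        **get_token_limit_kwargs(model, max_tokens),
    }
```

In both styles exactly one of the two token-limit keys ends up in the request, so the 400 error from sending max_tokens to a newer model should not occur.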

📝 Expected Behavior

After fix:

  • Older models (gpt-3.5, gpt-4) continue using max_tokens
  • Newer models (gpt-5.x, o1, o3, gpt-4o) automatically use max_completion_tokens
  • No breaking changes for existing functionality ✅

🔄 Steps to Reproduce

  1. Set LLM_MODEL=gpt-5.2 in .env
  2. Attempt to use Solve module (or any agent module)
  3. Observe 400 error with message about max_tokens being unsupported

💡 Impact

  • Severity: High - Blocks usage of newer OpenAI models
  • Affected modules: Solve, Research, Guide, IdeaGen (and potentially others)
  • User impact: Cannot use gpt-5.x, o1, o3, or gpt-4o models in production

🔗 Related

The same problem applies to:

  • Any code path that calls LLM APIs through agent base classes
  • Future models (gpt-6.x, etc.), which will hit the same error as long as the agents hardcode max_tokens

📌 Additional Notes

  • The fix is straightforward and low-risk (utility function already exists and works)
  • No breaking changes expected
  • Backward compatible with existing models

Related Module

Smart Solver

Configuration Used

No response

Logs and screenshots

[Backend] [Solver] ○ ============================================================
[Backend] [Solver] ○ Dual-Loop Solver Initializing
[Backend] [Solver] ○ ============================================================
[Backend] [Solver] ○ Knowledge Base: RAG101
[Backend] [Solver] → Initializing agents...
[Backend] [Solver] ○ InvestigateAgent initialized
[Backend] [Solver] ○ NoteAgent initialized
[Backend] [Solver] ○ Solve Loop agents (lazy init)
[Backend] [Solver] ✓ Solver ready
[Backend] [SolveAPI] ○ [solve_20260106_170244_9c630be7] Solving: Calculate the linear convolution of x=[1,2,3] and ...
[Backend] [SolveAPI] → [solve_20260106_170244_9c630be7] Solving started
[Backend] [Solver] ○ ============================================================
[Backend] [Solver] ○ Problem Solving Started
[Backend] [Solver] ○ ============================================================
[Backend] [Solver] ○ Question: Calculate the linear convolution of x=[1,2,3] and h=[4,5]
[Backend] [Solver] ○ Output: C:\Users\kimseon\Documents\GitHub\DeepTutor\data\user\solve\solve_20260106_170244
[Backend] [Solver] ○ Pipeline: Analysis Loop → Solve Loop
[Backend] [Solver] ▶ Analysis Loop started | Understanding the question
[Backend] [Solver] ▶ AnalysisLoop started | max_iterations=3
[Backend] [Solver] ○ AnalysisLoop running | round=1
[Backend] ERROR: OpenAI API Call Failed,
[Backend] Model: gpt-5.2,
[Backend] Params: {'temperature': 0.3, 'response_format': {'type': 'json_object'}, 'max_tokens': 8192}, Got: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
[Backend] [Solver] ✗ Solving failed: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
[Backend] [Solver] ✗ Traceback (most recent call last):
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor\src\agents\solve\main_solver.py", line 264, in solve
[Backend] result = await self.run_dual_loop_pipeline(question, output_dir)
[Backend] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor\src\agents\solve\main_solver.py", line 342, in run_dual_loop_pipeline
[Backend] investigate_result = await self.investigate_agent.process(
[Backend] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor\src\agents\solve\analysis_loop\investigate_agent.py", line 85, in process
[Backend] response = await self.call_llm(
[Backend] ^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor\src\agents\solve\base_agent.py", line 225, in call_llm
[Backend] response = await openai_complete_if_cache(**kwargs)
[Backend] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor\.venv\Lib\site-packages\tenacity\asyncio\__init__.py", line 189, in async_wrapped
[Backend] return await copy(fn, *args, **kwargs)
[Backend] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor\.venv\Lib\site-packages\tenacity\asyncio\__init__.py", line 111, in __call__
[Backend] do = await self.iter(retry_state=retry_state)
[Backend] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor\.venv\Lib\site-packages\tenacity\asyncio\__init__.py", line 153, in iter
[Backend] result = await action(retry_state)
[Backend] ^^^^^^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor\.venv\Lib\site-packages\tenacity\_utils.py", line 99, in inner
[Backend] return call(*args, **kwargs)
[Backend] ^^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor\.venv\Lib\site-packages\tenacity\__init__.py", line 400, in <lambda>
[Backend] self._add_action_func(lambda rs: rs.outcome.result())
[Backend] ^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\AppData\Roaming\uv\python\cpython-3.12.11-windows-x86_64-none\Lib\concurrent\futures\_base.py", line 449, in result
[Backend] return self.__get_result()
[Backend] ^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\AppData\Roaming\uv\python\cpython-3.12.11-windows-x86_64-none\Lib\concurrent\futures\_base.py", line 401, in __get_result
[Backend] raise self._exception
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor\.venv\Lib\site-packages\tenacity\asyncio\__init__.py", line 114, in __call__
[Backend] result = await fn(*args, **kwargs)
[Backend] ^^^^^^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor\.venv\Lib\site-packages\lightrag\llm\openai.py", line 230, in openai_complete_if_cache
[Backend] response = await openai_async_client.beta.chat.completions.parse(
[Backend] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor\.venv\Lib\site-packages\openai\resources\chat\completions\completions.py", line 1670, in parse
[Backend] return await self._post(
[Backend] ^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor\.venv\Lib\site-packages\openai\_base_client.py", line 1797, in post
[Backend] return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
[Backend] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor\.venv\Lib\site-packages\openai\_base_client.py", line 1597, in request
[Backend] raise self._make_status_error_from_response(err.response) from None
[Backend] openai.BadRequestError: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
[Backend]
[Backend] [SolveAPI] ✗ [solve_20260106_170244_9c630be7] Solving failed: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
[Backend] INFO: connection closed

Additional Information

  • AI-Tutor Version: opentutor-web@0.2.0 dev
  • Operating System: Windows NT 10.0; Win64; x64
  • Python Version: 3.12.11
  • Node.js Version: node.js v22.17.0
  • Browser (if applicable):
  • Related Issues:
