Do you need to file an issue?
Describe the bug
Agent base classes use the max_tokens parameter, which is not supported by newer OpenAI models (gpt-5.x, o1, o3, gpt-4o). These models require max_completion_tokens instead.
Steps to reproduce
- Set LLM_BINDING = openai
- Set LLM_MODEL = gpt-5.x
- Upload knowledge base
- Run solver
Expected Behavior
Fix max_tokens parameter for newer OpenAI models (gpt-5.x, o1, o3, gpt-4o)
🐛 Bug Description
Agent base classes use the max_tokens parameter, which is not supported by newer OpenAI models (gpt-5.x, o1, o3, gpt-4o). These models require max_completion_tokens instead, so API calls fail with a 400 error.
🔴 Error Message
Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
Stack trace location:
src/agents/solve/base_agent.py line 188: kwargs["max_tokens"] = max_tokens
src/agents/solve/main_solver.py line 264: result = await self._run_dual_loop_pipeline(question, output_dir)
📋 Affected Models
The following OpenAI models require max_completion_tokens instead of max_tokens:
gpt-5.x and later (e.g., gpt-5.2)
gpt-4o series
o1, o3 series (reasoning models)
🔍 Root Cause
A utility function _get_token_limit_kwargs() already exists in src/api/routers/system.py (lines 46-59) that correctly handles both cases, but it is:
- Private (prefixed with _) and scoped to system.py only
- Only used in the /test/llm endpoint
- Not applied to any agent base classes
📁 Affected Files
All agent base classes hardcode max_tokens without checking the model type:
- src/agents/solve/base_agent.py (line 188)
    if max_tokens:
        kwargs["max_tokens"] = max_tokens  # ❌ Fails for gpt-5.x
- src/agents/research/agents/base_agent.py (line 134)
    if max_tokens:
        kwargs["max_tokens"] = max_tokens  # ❌ Fails for gpt-5.x
- src/agents/guide/agents/base_guide_agent.py (line 131)
    "max_tokens": max_tokens,  # ❌ Fails for gpt-5.x
- src/agents/ideagen/base_idea_agent.py (line 120)
    "max_tokens": max_tokens,  # ❌ Fails for gpt-5.x
✅ Existing Solution (Not Used)
The correct implementation already exists but is not used by the agent base classes:
src/api/routers/system.py (lines 19-59):
    def _uses_max_completion_tokens(model: str) -> bool:
        """Check if model requires max_completion_tokens"""
        patterns = [
            r"^o[13]",       # o1, o3 models
            r"^gpt-4o",      # gpt-4o models
            r"^gpt-[5-9]",   # gpt-5.x and later
            r"^gpt-\d{2,}",  # gpt-10+ (future proofing)
        ]
        # ... checks model name against patterns

    def _get_token_limit_kwargs(model: str, max_tokens: int) -> dict:
        """Get appropriate token limit parameter"""
        if _uses_max_completion_tokens(model):
            return {"max_completion_tokens": max_tokens}
        return {"max_tokens": max_tokens}
Used correctly in system.py line 154:
token_kwargs = _get_token_limit_kwargs(model, max_tokens=20)
response = await openai_complete_if_cache(..., **token_kwargs)
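For reference, here is a self-contained sketch of how the elided pattern check might be completed. The body of the real function is not shown in this issue, so the use of `re.match` on a lowercased model name is an assumption, as are the public (underscore-free) names.

```python
import re

# Patterns copied from the snippet above; a model matching any of them
# is treated as requiring max_completion_tokens.
_PATTERNS = [
    r"^o[13]",       # o1, o3 models
    r"^gpt-4o",      # gpt-4o models
    r"^gpt-[5-9]",   # gpt-5.x and later
    r"^gpt-\d{2,}",  # gpt-10+ (future proofing)
]


def uses_max_completion_tokens(model: str) -> bool:
    """Return True if the model name matches one of the patterns above."""
    # Assumption: matching is case-insensitive and anchored at the start.
    name = model.lower()
    return any(re.match(pattern, name) for pattern in _PATTERNS)


def get_token_limit_kwargs(model: str, max_tokens: int) -> dict:
    """Return the token-limit keyword argument appropriate for the model."""
    if uses_max_completion_tokens(model):
        return {"max_completion_tokens": max_tokens}
    return {"max_tokens": max_tokens}


print(get_token_limit_kwargs("gpt-5.2", 8192))  # {'max_completion_tokens': 8192}
print(get_token_limit_kwargs("gpt-4", 8192))    # {'max_tokens': 8192}
```

One caveat with these patterns: a name such as `gpt-35-turbo` (the Azure-style spelling) would also match `^gpt-\d{2,}`. Whether that matters depends on the deployment names in use.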
🛠️ Proposed Solution
- Move utility functions to shared location:
  - Move _uses_max_completion_tokens() → uses_max_completion_tokens() (make public)
  - Move _get_token_limit_kwargs() → get_token_limit_kwargs() (make public)
  - Location: src/core/core.py (alongside other LLM configuration utilities)
- Update all agent base classes:
  - Import get_token_limit_kwargs from src.core.core
  - Replace direct kwargs["max_tokens"] = max_tokens with:
      if max_tokens:
          kwargs.update(get_token_limit_kwargs(model, max_tokens))
- Update system.py:
  - Import from src.core.core instead of local definition
  - Remove duplicate code
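The affected files set the limit in two different ways (assignment into `kwargs`, and a key inside a dict literal), so the call-site change could take two shapes. The sketch below illustrates both. `build_llm_kwargs` and `build_request_params` are hypothetical helpers written for this example, not functions from the codebase, and `get_token_limit_kwargs` is stubbed with a simplified prefix check so the snippet runs on its own.

```python
from typing import Any, Optional


def get_token_limit_kwargs(model: str, max_tokens: int) -> dict:
    # Stand-in for the shared utility (assumption: simplified prefix check).
    newer = model.startswith(("o1", "o3", "gpt-4o", "gpt-5"))
    key = "max_completion_tokens" if newer else "max_tokens"
    return {key: max_tokens}


def build_llm_kwargs(
    model: str,
    prompt: str,
    temperature: float = 0.3,
    max_tokens: Optional[int] = None,
) -> dict[str, Any]:
    """Hypothetical helper: the kwargs-assignment style (solve/research agents)."""
    kwargs: dict[str, Any] = {
        "model": model,
        "prompt": prompt,
        "temperature": temperature,
    }
    if max_tokens:
        # Previously: kwargs["max_tokens"] = max_tokens
        kwargs.update(get_token_limit_kwargs(model, max_tokens))
    return kwargs


def build_request_params(model: str, prompt: str, max_tokens: int) -> dict[str, Any]:
    """Hypothetical helper: the dict-literal style (guide/ideagen agents)."""
    return {
        "model": model,
        "prompt": prompt,
        # Previously: "max_tokens": max_tokens,
        **get_token_limit_kwargs(model, max_tokens),
    }
```

In both shapes the key name is chosen by the helper, so the agent classes no longer need to know which models use which parameter.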
📝 Expected Behavior
After fix:
- Older models (gpt-3.5, gpt-4) continue using max_tokens ✅
- Newer models (gpt-5.x, o1, o3, gpt-4o) automatically use max_completion_tokens ✅
- No breaking changes for existing functionality ✅
🔄 Steps to Reproduce
- Set LLM_MODEL=gpt-5.2 in .env
- Attempt to use Solve module (or any agent module)
- Observe 400 error with message about max_tokens being unsupported
💡 Impact
- Severity: High - Blocks usage of newer OpenAI models
- Affected modules: Solve, Research, Guide, IdeaGen (and potentially others)
- User impact: Cannot use gpt-5.x, o1, o3, or gpt-4o models in production
🔗 Related
This issue affects the same problem domain as:
- Any code path that calls LLM APIs through agent base classes
- Future model support (gpt-6.x, etc.) will have the same issue
📌 Additional Notes
- The fix is straightforward and low-risk (utility function already exists and works)
- No breaking changes expected
- Backward compatible with existing models
Related Module
Smart Solver
Configuration Used
No response
Logs and screenshots
[Backend] [Solver] ○ ============================================================
[Backend] [Solver] ○ Dual-Loop Solver Initializing
[Backend] [Solver] ○ ============================================================
[Backend] [Solver] ○ Knowledge Base: RAG101
[Backend] [Solver] → Initializing agents...
[Backend] [Solver] ○ InvestigateAgent initialized
[Backend] [Solver] ○ NoteAgent initialized
[Backend] [Solver] ○ Solve Loop agents (lazy init)
[Backend] [Solver] ✓ Solver ready
[Backend] [SolveAPI] ○ [solve_20260106_170244_9c630be7] Solving: Calculate the linear convolution of x=[1,2,3] and ...
[Backend] [SolveAPI] → [solve_20260106_170244_9c630be7] Solving started
[Backend] [Solver] ○ ============================================================
[Backend] [Solver] ○ Problem Solving Started
[Backend] [Solver] ○ ============================================================
[Backend] [Solver] ○ Question: Calculate the linear convolution of x=[1,2,3] and h=[4,5]
[Backend] [Solver] ○ Output: C:\Users\kimseon\Documents\GitHub\DeepTutor\data\user\solve\solve_20260106_170244
[Backend] [Solver] ○ Pipeline: Analysis Loop → Solve Loop
[Backend] [Solver] ▶ Analysis Loop started | Understanding the question
[Backend] [Solver] ▶ AnalysisLoop started | max_iterations=3
[Backend] [Solver] ○ AnalysisLoop running | round=1
[Backend] ERROR: OpenAI API Call Failed,
[Backend] Model: gpt-5.2,
[Backend] Params: {'temperature': 0.3, 'response_format': {'type': 'json_object'}, 'max_tokens': 8192}, Got: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
[Backend] [Solver] ✗ Solving failed: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
[Backend] [Solver] ✗ Traceback (most recent call last):
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor\src\agents\solve\main_solver.py", line 264, in solve
[Backend] result = await self.run_dual_loop_pipeline(question, output_dir)
[Backend] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor\src\agents\solve\main_solver.py", line 342, in run_dual_loop_pipeline
[Backend] investigate_result = await self.investigate_agent.process(
[Backend] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor\src\agents\solve\analysis_loop\investigate_agent.py", line 85, in process
[Backend] response = await self.call_llm(
[Backend] ^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor\src\agents\solve\base_agent.py", line 225, in call_llm
[Backend] response = await openai_complete_if_cache(**kwargs)
[Backend] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor.venv\Lib\site-packages\tenacity\asyncio_init.py", line 189, in async_wrapped
[Backend] return await copy(fn, *args, **kwargs)
[Backend] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor.venv\Lib\site-packages\tenacity\asyncio_init.py", line 111, in call
[Backend] do = await self.iter(retry_state=retry_state)
[Backend] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor.venv\Lib\site-packages\tenacity\asyncio_init_.py", line 153, in iter
[Backend] result = await action(retry_state)
[Backend] ^^^^^^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor.venv\Lib\site-packages\tenacity_utils.py", line 99, in inner
[Backend] return call(*args, **kwargs)
[Backend] ^^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor.venv\Lib\site-packages\tenacity_init_.py", line 400, in
[Backend] self._add_action_func(lambda rs: rs.outcome.result())
[Backend] ^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\AppData\Roaming\uv\python\cpython-3.12.11-windows-x86_64-none\Lib\concurrent\futures_base.py", line 449, in result
[Backend] return self.__get_result()
[Backend] ^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\AppData\Roaming\uv\python\cpython-3.12.11-windows-x86_64-none\Lib\concurrent\futures_base.py", line 401, in __get_result
[Backend] raise self.exception
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor.venv\Lib\site-packages\tenacity\asyncio_init.py", line 114, in call
[Backend] result = await fn(*args, **kwargs)
[Backend] ^^^^^^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor.venv\Lib\site-packages\lightrag\llm\openai.py", line 230, in openai_complete_if_cache
[Backend] response = await openai_async_client.beta.chat.completions.parse(
[Backend] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor.venv\Lib\site-packages\openai\resources\chat\completions\completions.py", line 1670, in parse
[Backend] return await self._post(
[Backend] ^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor.venv\Lib\site-packages\openai_base_client.py", line 1797, in post
[Backend] return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
[Backend] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Backend] File "C:\Users\kimseon\Documents\GitHub\DeepTutor.venv\Lib\site-packages\openai_base_client.py", line 1597, in request
[Backend] raise self._make_status_error_from_response(err.response) from None
[Backend] openai.BadRequestError: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
[Backend]
[Backend] [SolveAPI] ✗ [solve_20260106_170244_9c630be7] Solving failed: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
[Backend] INFO: connection closed
Additional Information
- AI-Tutor Version: opentutor-web@0.2.0 dev
- Operating System: Windows NT 10.0; Win64; x64
- Python Version: 3.12.11
- Node.js Version: node.js v22.17.0
- Browser (if applicable):
- Related Issues: