Bug Description
Bug: Session title leaks model's thinking/reasoning tokens when using thinking models (e.g. MiniMax-M2.7)
Summary
When auxiliary.title_generation is configured to use a thinking model (e.g. MiniMax-M2.7, DeepSeek-R1, Qwen-QwQ), the auto-generated session title contains the model's internal reasoning/thinking output instead of the actual title. This results in titles like:
<tool_call>The user is asking me to generate a title for a conversation. Let me a...
Root Cause
In agent/title_generator.py line 64:
title = (response.choices[0].message.content or "").strip()
The code reads message.content directly without stripping inline thinking/reasoning blocks. Thinking models (MiniMax-M2.7, DeepSeek-R1, etc.) embed their reasoning process in the content, typically wrapped in <think>...</think> tags or as raw text. The title generator takes this raw content, truncates it to 80 chars, and saves it as the session title.
However, agent/auxiliary_client.py already provides extract_content_or_reasoning() (line 3561) that properly handles this:
def extract_content_or_reasoning(response) -> str:
    """Extract content from an LLM response, falling back to reasoning fields.
    Resolution order:
    1. message.content — strip inline think/reasoning blocks
    2. message.reasoning / message.reasoning_content — structured fields
    3. message.reasoning_details — OpenRouter unified array format
    """
This function is already used by session_search_tool.py, web_tools.py, vision_tools.py, etc. — but not by title_generator.py.
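For readers without the source at hand, the resolution order in that docstring can be sketched roughly as below. This is not the actual implementation in agent/auxiliary_client.py: the helper names strip_think_blocks and extract_text, the set of recognised tags, and the shape of reasoning_details entries (dicts with a "text" key) are all assumptions made for illustration.

```python
import re
from types import SimpleNamespace

# Assumed tag names; the real function may recognise a different set.
_THINK_BLOCK = re.compile(
    r"<(think|thinking|reasoning)>.*?</\1>", re.DOTALL | re.IGNORECASE
)


def strip_think_blocks(text: str) -> str:
    """Remove closed inline reasoning blocks and trim whitespace."""
    return _THINK_BLOCK.sub("", text).strip()


def extract_text(response) -> str:
    """Hypothetical stand-in that mirrors the documented resolution order."""
    message = response.choices[0].message

    # 1. message.content with inline reasoning blocks removed
    content = strip_think_blocks(getattr(message, "content", None) or "")
    if content:
        return content

    # 2. Structured reasoning fields
    for field in ("reasoning", "reasoning_content"):
        value = getattr(message, field, None)
        if isinstance(value, str) and value.strip():
            return value.strip()

    # 3. reasoning_details array (entry shape assumed: {"text": ...})
    details = getattr(message, "reasoning_details", None) or []
    parts = [d.get("text", "") for d in details if isinstance(d, dict)]
    return "\n".join(p for p in parts if p).strip()


# Fabricated response object, for demonstration only.
fake = SimpleNamespace(choices=[SimpleNamespace(message=SimpleNamespace(
    content="<think>The user wants a title.</think>\nDebugging Title Generation"
))])
print(extract_text(fake))
```

Note that with this ordering, a response whose content is empty after stripping falls through to the reasoning fields, so a title generator that uses it may still want to validate the result before saving it.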
Evidence
Config
auxiliary:
  title_generation:
    provider: custom
    model: MiniMax-M2.7
    base_url: https://api.minimax.chat/v1
Database
SELECT id, title FROM sessions WHERE title IS NOT NULL;
| session_id | title |
| 20260502_011322_75688e | 🕐The user is asking me to generate a title for a conversation. Let me a... |
| 20260502_005938_26e5ff | 🕐The user is asking me to check the kagi tools to see if they can searc... |
Logs
2026-05-02 01:13:57 INFO agent.auxiliary_client: Auxiliary title_generation: using custom (MiniMax-M2.7) at https://api.minimax.chat/v1/
No error is logged — the title generation "succeeds", it just saves thinking output as the title.
Proposed Fix
In agent/title_generator.py, replace the direct message.content access with extract_content_or_reasoning():
# Before (line 64):
title = (response.choices[0].message.content or "").strip()
# After:
from agent.auxiliary_client import extract_content_or_reasoning
title = extract_content_or_reasoning(response)
This is a minimal change (one added import and one replaced line) that reuses the existing, well-tested stripping logic.
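A before/after comparison along the following lines could double as a regression test. It is only a sketch: the rest of title_generator.py is not shown in this report, so title_before and title_after are hypothetical reductions of the flow described above, stand_in_extract is a placeholder for extract_content_or_reasoning(), and the response object is fabricated.

```python
import re
from types import SimpleNamespace


def make_response(content):
    """Build a minimal object shaped like a chat completion response."""
    message = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])


def title_before(response, limit=80):
    # Mirrors the line quoted above: raw content, then truncation.
    title = (response.choices[0].message.content or "").strip()
    return title[:limit]


def title_after(response, extract, limit=80):
    # Same flow, but the content passes through an extraction helper first.
    return extract(response).strip()[:limit]


def stand_in_extract(response):
    # Placeholder for extract_content_or_reasoning(); handles closed
    # <think> blocks only.
    content = response.choices[0].message.content or ""
    return re.sub(r"<think>.*?</think>", "", content, flags=re.DOTALL)


leaky = make_response(
    "<think>The user is asking me to generate a title.</think>\n\nFixing Title Leaks"
)
assert title_before(leaky).startswith("<think>")
assert title_after(leaky, stand_in_extract) == "Fixing Title Leaks"
```

In the real code base the test would presumably import extract_content_or_reasoning from agent.auxiliary_client instead of using the stand-in.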
Additional Observations
- max_tokens=500 is excessive for a 3-7 word title. Reducing to 50-100 would save tokens and reduce the chance of thinking output overwhelming the actual title.
- Title prompt is too vague for thinking models. The prompt says "Return ONLY the title text, nothing else" but thinking models may still emit reasoning before the title. A more explicit prompt, such as adding "Do NOT include any reasoning, thinking, or explanation.", could help, but the real fix is stripping at the code level.
- 3 of 5 sessions have (null) titles, suggesting title generation silently fails for some provider configurations. The failure_callback mechanism exists but failures may not be surfaced to the user effectively.
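One edge case related to the max_tokens observation: if the token budget runs out while the model is still reasoning, the closing tag may never arrive, and a pattern that only matches closed blocks would leave the reasoning in place. Whether extract_content_or_reasoning() already covers this is not verified here. A defensive sketch, assuming plain <think> tags and a hypothetical helper name strip_reasoning, could look like this:

```python
import re

# Closed blocks: <think> ... </think>
_CLOSED = re.compile(r"<think>.*?</think>", re.DOTALL)
# Unclosed block: <think> with no closing tag before the end of the text,
# e.g. when generation stopped at the token limit.
_UNCLOSED = re.compile(r"<think>.*\Z", re.DOTALL)


def strip_reasoning(text: str) -> str:
    """Drop closed reasoning blocks, then any trailing unclosed one."""
    text = _CLOSED.sub("", text)
    text = _UNCLOSED.sub("", text)
    return text.strip()


# An empty result signals that no usable title was produced, so the
# caller can leave the title unset instead of saving reasoning text.
assert strip_reasoning("<think>The user is asking me to generate a") == ""
assert strip_reasoning("<think>short plan</think>Fixing Title Leaks") == "Fixing Title Leaks"
```

Returning an empty string rather than the partial reasoning lets the caller route the case through its existing failure handling.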
Environment
- Hermes Agent: latest (installed via pip in venv)
- Main model: glm-5.1 (Volcengine Ark)
- Title generation model: MiniMax-M2.7 (custom provider)
- OS: WSL (Debian)
Steps to Reproduce
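Configure auxiliary.title_generation to use a thinking model (e.g. MiniMax-M2.7), start a new session, and let the session title be auto-generated.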
Expected Behavior
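The session title contains only the generated title, with all thinking/reasoning blocks (e.g. <think>...</think>) stripped.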
Actual Behavior
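The session title contains the model's thinking output, including the leading think tag, truncated to 80 characters.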
Affected Component
Tools (terminal, file ops, web, code execution, etc.)
Messaging Platform (if gateway-related)
No response
Debug Report
⚠️ This will upload the following to a public paste service:
• System info (OS, Python version, Hermes version, provider, which API keys
are configured — NOT the actual keys)
• Recent log lines (agent.log, errors.log, gateway.log — may contain
conversation fragments and file paths)
• Full agent.log and gateway.log (up to 512 KB each — likely contains
conversation content, tool outputs, and file paths)
Pastes auto-delete after 6 hours.
Collecting debug report...
Uploading...
Debug report uploaded:
Report https://paste.rs/nNzI3
agent.log https://paste.rs/ymel2
gateway.log https://paste.rs/IEi2h
⏱ Pastes will auto-delete in 6 hours.
To delete now: hermes debug delete <url>
Share these links with the Hermes team for support.
Operating System
Debian
Python Version
No response
Hermes Version
No response
Additional Logs / Traceback (optional)
Root Cause Analysis (optional)
No response
Proposed Fix (optional)
No response
Are you willing to submit a PR for this?