When a model includes <think>...</think> reasoning blocks in its response, the raw XML tags are displayed to the end user instead of being stripped. This affects both the CLI and all messaging gateway platforms (Telegram, Discord, Slack, WhatsApp).
Root cause:
AIAgent._strip_think_blocks() exists (run_agent.py:490) but is never called on the final response in the normal completion path.
It's only used for:
- Quiet-mode intermediate previews (line ~2333) — cosmetic logging only
- The max-iterations summary in _handle_max_iterations() (line ~1683) — an edge case
The main path at the end of _run_agent_loop() assigns final_response = assistant_message.content or "" directly (line ~2358) and returns it without stripping. The fallback path (_last_content_with_tools, line ~2401) has the same issue.
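To make the symptom concrete, here is a minimal sketch of the difference between the current assignment and a stripped one. The regex-based strip_think_blocks() below is a hypothetical stand-in for AIAgent._strip_think_blocks(); the real implementation in run_agent.py is not reproduced here and may differ.

```python
import re

# Hypothetical stand-in for AIAgent._strip_think_blocks(); the actual
# method in run_agent.py may handle more cases than this regex does.
_THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_think_blocks(content: str) -> str:
    """Remove every <think>...</think> block, including multi-line ones."""
    return _THINK_RE.sub("", content)


# Example model output (made up for illustration).
raw_content = "<think>User wants a greeting.</think>\nHello there!"

# Current behaviour: the raw content is returned as-is, tags included.
unstripped = raw_content or ""

# Stripped behaviour: only the user-facing text remains.
stripped = strip_think_blocks(raw_content or "").strip()

print(repr(unstripped))
print(repr(stripped))
```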
Fix:
Apply _strip_think_blocks() to final_response in two places in _run_agent_loop():
- The normal completion path (after line 2358):
    final_response = assistant_message.content or ""
    final_response = self._strip_think_blocks(final_response).strip()
- The fallback path (line ~2401):
    final_response = self._strip_think_blocks(fallback).strip()
Note: the _has_content_after_think_block check on the following line must use the raw assistant_message.content (not the already-stripped final_response) to correctly detect think-only responses:
    if not self._has_content_after_think_block(assistant_message.content or ""):
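The ordering described above can be sketched as a small standalone function: strip the text that will be shown to the user, but run the think-only check against the raw content. Everything here is an illustration under assumptions. finalize(), strip_think_blocks(), and has_content_after_think_block() are hypothetical stand-ins, not the actual methods in run_agent.py, whose logic may differ.

```python
import re
from typing import Optional, Tuple

# Hypothetical stand-ins for the AIAgent helpers named in this description.
_THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_think_blocks(content: str) -> str:
    """Remove every <think>...</think> block from the content."""
    return _THINK_RE.sub("", content)


def has_content_after_think_block(content: str) -> bool:
    """Return True if any visible text remains once think blocks are removed."""
    return bool(strip_think_blocks(content).strip())


def finalize(raw_content: Optional[str]) -> Tuple[str, bool]:
    """Return (final_response, has_visible_content) for one assistant message."""
    raw = raw_content or ""
    # The user-facing text is stripped of think blocks.
    final_response = strip_think_blocks(raw).strip()
    # The think-only check is given the raw content, not final_response.
    has_visible_content = has_content_after_think_block(raw)
    return final_response, has_visible_content
```

A think-only message then yields an empty final_response together with a False flag, which the caller can use to take the think-only branch.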
Reasoning is unaffected: _build_assistant_message() extracts it separately via _extract_reasoning() from the raw API response object and stores it in the message's "reasoning" field for trajectory export.