Skip to content

fix(compression): use extract_content_or_reasoning for reasoning model summaries#4603

Open
airudotsh wants to merge 1 commit into
NousResearch:mainfrom
airudotsh:fix/compressor-reasoning-extract
Open

fix(compression): use extract_content_or_reasoning for reasoning model summaries#4603
airudotsh wants to merge 1 commit into
NousResearch:mainfrom
airudotsh:fix/compressor-reasoning-extract

Conversation

@airudotsh

Copy link
Copy Markdown

What changed

Use extract_content_or_reasoning() (from auxiliary_client) instead of raw response.choices[0].message.content in the context compressor's _generate_summary().

Why

Reasoning models (DeepSeek-R1, Qwen-QwQ, glm-5-turbo) sometimes put all output inside think/reasoning blocks with an empty content field. The compressor was reading raw content directly, getting an empty string, and silently dropping middle conversation turns without a meaningful summary — causing context continuity loss.

How it fixes it

extract_content_or_reasoning() already handles:

  1. Empty content + structured reasoning field fallback — reads message.reasoning / message.reasoning_content when content is empty
  2. XML-style think tag stripping — removes <think/>, <thinking/>, <reasoning/> blocks from content

Also normalizes dict content (llama.cpp style responses) before extraction to prevent type errors.

Related

Complements #4243 (summary fallback on provider failure) — these fix different bugs and do not conflict. This PR fixes the case where the summary call succeeds but the model puts output in reasoning-only mode.

Tests

3 new test cases in TestReasoningOnlyExtraction:

  • Reasoning field extracted when content is empty
  • Normal content passed through without modification
  • XML think tags stripped from content

All 37 tests pass.

…l summaries

Reasoning models (DeepSeek-R1, Qwen-QwQ, glm-5-turbo) sometimes put
all output inside think/reasoning blocks with an empty content field.
The compressor was reading raw response.choices[0].message.content
directly, getting an empty string, and silently dropping middle turns
without a meaningful summary.

Use the existing extract_content_or_reasoning() helper (from
auxiliary_client) which already handles:
- Empty content + structured reasoning field fallback
- XML-style think/thinking/reasoning tag stripping

Also normalize dict content (llama.cpp) before extraction to prevent
type errors.

Tests: 3 new cases covering reasoning-only, think-tag, and normal
content extraction paths.
@alt-glitch alt-glitch added type/bug Something isn't working P1 High — major feature broken, no workaround comp/agent Core agent loop, run_agent.py, prompt builder duplicate This issue or pull request already exists labels May 1, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Likely duplicate of #14847 — same fix: use extract_content_or_reasoning() in context compressor _generate_summary() for reasoning-only model responses.

1 similar comment
@alt-glitch

Copy link
Copy Markdown
Collaborator

Likely duplicate of #14847 — same fix: use extract_content_or_reasoning() in context compressor _generate_summary() for reasoning-only model responses.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder duplicate This issue or pull request already exists P1 High — major feature broken, no workaround type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants