fix: fallback content to reasoning_content when DeepSeek returns empty content field #428
Merged
pancacake merged 8 commits on May 4, 2026
…y content field

DeepSeek models (v4-flash, reasoner, etc.) return the actual response in the reasoning_content field while leaving content empty when thinking mode is enabled. The _parse() method only fell back to m.reasoning (DashScope-style), causing "LLM returned empty response" errors.

Also fix idea_agent to handle LLM responses that are JSON arrays instead of objects with an "ideas" key.
Contributor
Pull request overview
Fixes DeepSeek “thinking mode” responses and improves idea generation robustness by handling alternate response shapes across OpenAI-compatible providers and agents.
Changes:
- Update OpenAI-compat `_parse()` (services provider_core + tutorbot) to fall back to `reasoning_content` when `content` is empty.
- Update `IdeaAgent` to accept JSON array payloads by normalizing them into an `{"ideas": ...}` object.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| deeptutor/tutorbot/providers/openai_compat_provider.py | Adds reasoning_content → content fallback during response parsing. |
| deeptutor/services/llm/provider_core/openai_compat_provider.py | Adds the same reasoning_content → content fallback in the services-layer provider. |
| deeptutor/agents/question/agents/idea_agent.py | Wraps array-style JSON responses into an object with an "ideas" key to prevent crashes. |
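The fallback summarized in the table can be sketched roughly as follows. This is only an illustration, not the project's actual `_parse()`: the helper name `pick_content` and the dict-shaped message are assumptions made for the example, and only the field names `content` and `reasoning_content` come from the PR.

```python
from typing import Any, Mapping, Optional


def pick_content(message: Mapping[str, Any]) -> Optional[str]:
    """Return the visible text of a chat message.

    Hypothetical helper: the real _parse() operates on provider response
    objects, not plain dicts.
    """
    content = message.get("content")
    if content:
        # Normal case: the provider filled the content field.
        return content
    # DeepSeek thinking mode: content is empty and the answer sits in
    # reasoning_content, so use that instead of reporting an empty response.
    return message.get("reasoning_content") or None


if __name__ == "__main__":
    print(pick_content({"content": "", "reasoning_content": "The answer is 4."}))
```

A message with non-empty `content` is returned unchanged, so providers that already behave normally are unaffected by this kind of fallback.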
…dence

- Swap reasoning_content/reasoning fallback order in _parse() to match the precedence used when extracting reasoning_content (per Copilot review)
- Add reasoning_content → content fallback in _parse_chunks() for both provider_core and tutorbot providers
- Fix streaming path in factory.py: when only reasoning chunks are emitted (no direct content), fall back to response.content so downstream consumers receive non-empty responses
- Harden idea_agent against non-dict JSON payloads
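A rough sketch of the precedence and chunk-level fallback this commit describes, under stated assumptions: the function names `extract_reasoning` and `parse_chunks` and the dict-shaped deltas are hypothetical, and the real `_parse_chunks()` may be structured quite differently.

```python
from typing import Any, Iterable, Mapping, Optional, Tuple


def extract_reasoning(delta: Mapping[str, Any]) -> Optional[str]:
    # reasoning_content (DeepSeek) is checked before reasoning (DashScope-style).
    return delta.get("reasoning_content") or delta.get("reasoning") or None


def parse_chunks(chunks: Iterable[Mapping[str, Any]]) -> Tuple[str, Optional[str]]:
    """Aggregate streamed deltas into (content, reasoning).

    Hypothetical stand-in for the provider's chunk parser.
    """
    content_parts = []
    reasoning_parts = []
    for delta in chunks:
        if delta.get("content"):
            content_parts.append(delta["content"])
        reasoning = extract_reasoning(delta)
        if reasoning:
            reasoning_parts.append(reasoning)
    content = "".join(content_parts)
    reasoning_text = "".join(reasoning_parts) or None
    if not content and reasoning_text:
        # Only reasoning chunks arrived: reuse them as the visible content
        # so downstream consumers do not receive an empty response.
        content = reasoning_text
    return content, reasoning_text
```

Keeping the same field order in both the streaming and non-streaming paths avoids the two paths disagreeing when a provider happens to send both fields.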
- Read LLM_REASONING_EFFORT from environment in config.py
- Pass config.reasoning_effort to chat_with_retry/chat_stream_with_retry in both complete() and stream() (was previously dropped)
- Set to "low"/"medium"/"high" to enable thinking, leave empty to use automatic detection based on reasoning_model_patterns
DeepSeek only supports high/max for reasoning_effort; minimal is a DeepTutor convention that maps to thinking.type=disabled. Update comments to be accurate for each provider.
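The mapping described in this commit could look something like the sketch below. It is an assumption-laden illustration: the function name `deepseek_thinking_params`, the shape of the returned request parameters, and the choice to raise on unknown values are all hypothetical; only the `high`/`max` values, the `minimal` convention, and `thinking.type=disabled` come from the PR.

```python
from typing import Any, Dict, Optional


def deepseek_thinking_params(reasoning_effort: Optional[str]) -> Dict[str, Any]:
    """Translate a DeepTutor-style effort value into extra request parameters.

    Hypothetical helper for illustration only.
    """
    effort = (reasoning_effort or "").strip().lower()
    if not effort:
        # Empty: send nothing extra and leave auto-detection to the caller.
        return {}
    if effort == "minimal":
        # DeepTutor convention: "minimal" turns thinking off.
        return {"thinking": {"type": "disabled"}}
    if effort in {"high", "max"}:
        # The only effort values the PR says DeepSeek supports.
        return {"reasoning_effort": effort}
    # Assumption: reject anything else rather than guess a mapping.
    raise ValueError(f"unsupported reasoning effort for DeepSeek: {effort!r}")
```

Raising on unsupported values is one possible design. Silently falling back to auto-detection would be another, and the PR does not say which the project uses.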
Replace fragile json.loads() with parse_json_response() in code, flash_cards, timeline, and deep_dive generators. Handles LLM responses with markdown fences, preamble text, and malformed JSON.
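To illustrate why a bare `json.loads()` is fragile here, the following standard-library sketch tolerates markdown fences and preamble text. It is not the project's `parse_json_response()`: the name `parse_json_lenient` is hypothetical, and the repair step for malformed JSON is deliberately left out.

```python
import json
import re
from typing import Any

FENCE = "`" * 3  # a markdown code fence, built this way to keep the example readable
_FENCED_BLOCK = re.compile(
    FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE
)


def parse_json_lenient(text: str) -> Any:
    """Extract the first JSON value from an LLM reply.

    Hypothetical helper: handles fenced blocks and leading prose only.
    """
    match = _FENCED_BLOCK.search(text)
    candidate = match.group(1) if match else text
    decoder = json.JSONDecoder()
    for index, char in enumerate(candidate):
        if char in "{[":
            try:
                # raw_decode parses one JSON value starting at index and
                # ignores whatever text follows it.
                value, _end = decoder.raw_decode(candidate, index)
                return value
            except json.JSONDecodeError:
                continue  # not valid JSON from here; try the next bracket
    raise ValueError("No JSON object found")
```

A reply such as `Sure, here it is:` followed by a fenced JSON block parses cleanly with this approach, whereas `json.loads()` on the raw reply would raise.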
get_llm_config() prefers the resolver path over .env, so the env var was silently ignored. Apply it as an override regardless of which path produced the config.
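One way to apply the override regardless of which path built the config is to run it as a final step on the returned object. The sketch below assumes a simplified `LLMConfig` dataclass and an `apply_env_override` helper, both hypothetical; only the `LLM_REASONING_EFFORT` variable name and the `reasoning_effort` field come from the PR.

```python
import os
from dataclasses import dataclass, replace
from typing import Mapping, Optional


@dataclass(frozen=True)
class LLMConfig:
    """Hypothetical stand-in for the project's config object."""

    model: str
    reasoning_effort: Optional[str] = None


def apply_env_override(
    config: LLMConfig, env: Mapping[str, str] = os.environ
) -> LLMConfig:
    """Apply LLM_REASONING_EFFORT on top of an already-built config."""
    value = env.get("LLM_REASONING_EFFORT", "").strip()
    if not value:
        # Unset or blank: keep whatever the resolver or .env path produced.
        return config
    return replace(config, reasoning_effort=value)
```

Because the override runs after the config is resolved, it no longer matters whether the resolver path or the `.env` path produced the base values.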
…-content-fallback

# Conflicts:
#	deeptutor/book/blocks/code.py
#	deeptutor/book/blocks/deep_dive.py
#	deeptutor/book/blocks/flash_cards.py
#	deeptutor/book/blocks/timeline.py
Collaborator
Thanks for your contribution!
Problem
DeepSeek v4-flash and v4-pro default to thinking mode enabled (per API docs), returning responses in `reasoning_content` while leaving `content` empty. This caused cascade failures across multiple layers:

- `LLM returned empty response`: `_parse()`/`_parse_chunks()` only fell back to `m.reasoning` (DashScope), not `m.reasoning_content` (DeepSeek)
- `No JSON object found` in book blocks: `CodeGenerator`/`FlashCardsGenerator`/`TimelineGenerator`/`DeepDiveGenerator` used fragile `json.loads()` directly
- `factory.py` wrapped reasoning in `<think>` tags stripped by `clean_thinking_tags()`
- `'list' object has no attribute 'get'`: `IdeaAgent` didn't guard against array-type JSON responses
- `LLM_REASONING_EFFORT` env var silently ignored: `get_llm_config()` prefers the resolver path over `.env`; env override not applied

Fix
1. Content fallback + precedence alignment

- `_parse()`/`_parse_chunks()` in both `provider_core` and `tutorbot`: fall back `reasoning_content` → `content`, check `reasoning_content` before `reasoning`

2. Streaming path

- `factory.py` `_runner`: when only reasoning chunks are emitted (no `_on_content_delta`), emit `response.content` as fallback

3. Book block JSON parsing

- `code.py`, `flash_cards.py`, `timeline.py`, `deep_dive.py`: replace raw `json.loads()` → `parse_json_response()`, which handles markdown fences, preamble, and `json_repair` fallback

4. `LLM_REASONING_EFFORT` env var

- `config.py`: read from env and apply as override regardless of resolver path
- `factory.py`: pass `config.reasoning_effort` to `chat_with_retry`/`chat_stream_with_retry`
- `high`/`max` enables thinking, `minimal` sends `thinking: {type: "disabled"}`, empty → auto-detect
- `.env.example` / `.env.example_CN`: documented

5. idea_agent robustness

- `idea_agent.py`: wrap array-style JSON responses into an `{"ideas": ...}` object and guard against non-dict payloads
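For the idea_agent hardening, a minimal normalization sketch might look like this. The helper name `normalize_ideas_payload` is hypothetical, and mapping unexpected payload types to an empty ideas list is an assumption. The PR only states that arrays are wrapped under an `"ideas"` key and that non-dict payloads are guarded against.

```python
from typing import Any, Dict


def normalize_ideas_payload(payload: Any) -> Dict[str, Any]:
    """Coerce a parsed LLM reply into a dict with an "ideas" key.

    Hypothetical helper, not the actual IdeaAgent code.
    """
    if isinstance(payload, list):
        # Array-style reply: wrap it so later .get("ideas") calls work.
        return {"ideas": payload}
    if isinstance(payload, dict):
        return payload
    # Assumption: any other type (string, number, None) yields no ideas.
    return {"ideas": []}
```

With this in place, a reply such as `[{"title": "A"}]` no longer triggers `'list' object has no attribute 'get'`, because downstream code always receives a dict.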
Scope note
This PR is a bug fix: it prevents crashes and empty responses when thinking mode is active. Full thinking mode support (leveraging `reasoning_content` to improve generation quality, token budget allocation, animation pipeline adaptation) is tracked in #430.

Also tracked in #430: book generation UX improvements (force regenerate, failure diagnostics, prompt leakage prevention, i18n).
Test plan
- `deepseek-v4-flash` + `LLM_REASONING_EFFORT=minimal`: all book blocks generate without errors
- `deepseek-v4-flash` + `LLM_REASONING_EFFORT=high`: `reasoning_content` falls back correctly, no empty responses
- `deepseek-chat` / `gpt-4o` regression: normal operation unchanged