fix: set empty reasoning_content for DeepSeek when tool_calls present#14941
fix: set empty reasoning_content for DeepSeek when tool_calls present#14941tseek wants to merge 1 commit into
Conversation
When the model returns tool_calls alongside reasoning_content, some DeepSeek models (e.g., deepseek-v4-flash) include non-empty reasoning_content which can cause downstream issues. Apply the same empty-string normalization already done for Kimi models.
|
Fixes #14938 (DeepSeek V4 Flash tool call failures with reasoning_content error). |
|
I hit the same deepseek-v4-pro 400 error today, and monkey-patched it locally. Your PR sets empty reasoning_content for DeepSeek when tool_calls are present in the message. My session broke differently: an assistant reply with no reasoning and no tool_calls — just a plain text response sitting between two tool-call turns. DeepSeek still rejected the request because that assistant message lacks reasoning_content entirely. The fix I ended up with adds an _is_deepseek_endpoint() check in _copy_reasoning_content_for_api() that applies to all assistant messages, not just ones with tool_calls: This also covered two edge cases your PR doesn't touch:
These are edge cases (iteration budget exhaustion + error recovery + resume) but they'd hit the same 400. Your fix + #14973 (extract from model_extra) + this broader replay guard would be the complete set. I am not a coder, I am just telling you what my hermes agent told me it had to do to get deepseek-v4-pro working without breaking. |
|
Confirming this PR resolves the issue from #14938 / #14933. Tested on Hermes main (commit e5d41f0) with the patch applied locally. Repro that previously failedConfig ( model:
base_url: https://api.deepseek.com
default: deepseek-v4-flash
provider: deepseekEnvironment:
Error before patch (in Happened on the first tool-call turn of an interactive After applying this patchSame session, same config, multiple tool-calling turns over ~45 min — zero Notes on the fix
LGTM 👍 |
|
Thanks for the fix, @tseek — this was a real and impactful bug. This automated hermes-sweeper review found that the same fix has already landed on
Your analysis was correct and your patch was sound. The fix you contributed triggered the right fix to land — closing as implemented. |
When the model returns
tool_callsalongsidereasoning_content, some DeepSeek models (e.g., deepseek-v4-flash) include non-emptyreasoning_contentwhich can cause downstream issues.This PR applies the same empty-string normalization already done for Kimi models, by checking if the provider is DeepSeek (or custom pointing to api.deepseek.com) and setting
reasoning_contentto empty string whentool_callsare present.Tested and verified working with deepseek-v4-flash.