fix(kilo): workaround DeepSeek thinking-mode 400 on Kilo gateway (v4-pro/v4-flash/reasoner) by asin76-svg · Pull Request #15323 · NousResearch/hermes-agent

asin76-svg · 2026-04-24T20:11:22Z

Summary

This PR fixes HTTP 400 errors ("The reasoning_content in the thinking mode must be passed back to the API.") for DeepSeek V3.2+ thinking models (deepseek-v4-pro, deepseek-v4-flash, deepseek-reasoner) accessed via the Kilo Code Gateway (https://api.kilo.ai/api/gateway).

Why this is distinct from #14941, #15228, #15237

Existing open/closed DeepSeek reasoning_content PRs target direct DeepSeek API access (api.deepseek.com) and all assume that setting reasoning_content = "" is sufficient. That assumption does not hold through the Kilo gateway:

1. Kilo strips reasoning / reasoning_content / reasoning_details fields on forward: see Kilo-Org/cloud:apps/web/src/lib/ai-gateway/providers/openrouter/request-helpers.ts, function removeChatCompletionsReasoning, and also injectReasoningIntoContent, which falls back to `if (reasoning)`, a truthy check that drops empty strings. So reasoning_content = "" reaches the upstream DeepSeek provider as absent.
2. DeepSeek V3.2+ thinking mode additionally rejects conversations shaped as tool → user, i.e. any user turn after the last tool result causes the 400 even when reasoning_content is present. I reproduced this directly against api.kilo.ai with every combination of reasoning / reasoning_content / reasoning_details (empty, whitespace, non-empty, native and unified naming). The only shapes that succeed are:
- No trailing user/assistant after the last tool, or
- Such trailing text merged into the last tool message's content.

This is the same workaround Roo Code adopted in kilocode-legacy/src/api/transform/r1-format.ts (convertToR1Format(..., { mergeToolResultText: true })). Their comment:

"environment_details text after tool_results would create user messages that cause DeepSeek to drop all previous reasoning_content."

This PR ports that insight into Hermes.
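
The failing tool → user shape described above can be checked mechanically. A minimal sketch, assuming OpenAI-style message dicts with a `role` key; the helper name `has_text_turn_after_last_tool` is hypothetical and not part of this PR:

```python
from typing import Any

Message = dict[str, Any]


def has_text_turn_after_last_tool(messages: list[Message]) -> bool:
    """Return True if a user/assistant message follows the last tool result.

    Hypothetical diagnostic helper: it only detects the shape that the PR
    reports as failing; it does not fix anything.
    """
    # Index of the last message with role == "tool", or -1 if there is none.
    last_tool = max(
        (i for i, m in enumerate(messages) if m.get("role") == "tool"),
        default=-1,
    )
    if last_tool == -1:
        return False
    tail = messages[last_tool + 1:]
    return any(m.get("role") in ("user", "assistant") for m in tail)
```

A history that ends on a tool result returns False; appending a user message to it returns True, which is the case the merge workaround targets.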

Changes

All in run_agent.py, plus a new test file.

1. `_needs_deepseek_thinking_tool_merge()` (new)

Returns True when base_url is api.kilo.ai and the model contains "deepseek" but isn't deepseek-chat (which is non-thinking and unaffected).
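
As a rough illustration of the predicate, here is a minimal standalone sketch. The free-function form, the parameter names, and the handling of a vendor prefix such as `deepseek/` are assumptions; the real method lives in run_agent.py and may read these values from the agent instance instead.

```python
def needs_deepseek_thinking_tool_merge(base_url: str, model: str) -> bool:
    """Sketch: True only for DeepSeek thinking models routed through Kilo."""
    base = (base_url or "").lower()
    name = (model or "").lower()
    if "api.kilo.ai" not in base:
        return False  # other gateways and direct DeepSeek access are out of scope
    if "deepseek" not in name:
        return False
    # Assumption: the model may carry a vendor prefix ("deepseek/deepseek-chat"),
    # so compare only the last path segment. deepseek-chat is non-thinking.
    return name.rsplit("/", 1)[-1] != "deepseek-chat"
```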

2. `_copy_reasoning_content_for_api()` extended

New terminal branch: when the above predicate is true and source_msg has tool_calls but no reasoning_content/reasoning, inject reasoning_content = ".".

Why "." and not "": empirically verified against api.kilo.ai/api/gateway/chat/completions with deepseek-v4-pro:

reasoning_content = "" → HTTP 400 (falsy, gateway drops it).
reasoning_content = " " (single space) → HTTP 200.
reasoning_content = "." → HTTP 200.

Minimum-cost placeholder that survives the gateway's truthy filter.
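
The placeholder rule can be sketched as follows. This is an illustration of the new branch only, not the full `_copy_reasoning_content_for_api()`; the function name, the `kilo_deepseek` flag, and the `survives_truthy_filter` helper (a Python stand-in for the gateway's `if (reasoning)` check) are hypothetical.

```python
from typing import Any

PLACEHOLDER = "."  # smallest value observed to pass the gateway's truthy check


def survives_truthy_filter(value: str) -> bool:
    """Python stand-in for a truthy check: empty strings are dropped."""
    return bool(value)


def copy_reasoning_content(
    source_msg: dict[str, Any], api_msg: dict[str, Any], kilo_deepseek: bool
) -> None:
    """Sketch of the Kilo + DeepSeek branch; mutates api_msg in place."""
    existing = source_msg.get("reasoning_content") or source_msg.get("reasoning")
    if existing:
        # Preserve explicit reasoning, or convert the unified `reasoning` field.
        api_msg["reasoning_content"] = existing
        return
    if (
        kilo_deepseek
        and source_msg.get("role") == "assistant"
        and source_msg.get("tool_calls")
    ):
        # No usable reasoning on a tool-calling assistant turn: pad it.
        api_msg["reasoning_content"] = PLACEHOLDER
```

Note that in this sketch an explicit but empty reasoning_content is treated the same as a missing one, since the gateway would drop it anyway.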

3. `_merge_post_tool_text_into_tool()` (new, static)

Walks api_messages, finds the last tool message, and if only user/assistant-text messages (no further tool_calls) follow it, merges their text into the tool's content and truncates the list there. Does not mutate the input; returns a new list. Bails out if another tool-call round follows the last tool result, to preserve structure.
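
A minimal sketch of the merge described above, written as a free function. It assumes string `content` values and joins merged text with a blank line; both are assumptions, and the PR's actual separator and content handling may differ.

```python
from typing import Any

Message = dict[str, Any]


def merge_post_tool_text_into_tool(api_messages: list[Message]) -> list[Message]:
    """Fold trailing user/assistant text into the last tool message.

    Returns a new list and never mutates the input messages.
    """
    last_tool = -1
    for i, msg in enumerate(api_messages):
        if msg.get("role") == "tool":
            last_tool = i
    tail = api_messages[last_tool + 1:] if last_tool != -1 else []
    if not tail:
        return list(api_messages)  # no tool message, or nothing trails it

    trailing_text: list[str] = []
    for msg in tail:
        is_text_turn = msg.get("role") in ("user", "assistant")
        if not is_text_turn or msg.get("tool_calls"):
            return list(api_messages)  # another tool round follows: bail out
        content = msg.get("content")
        if content is not None and not isinstance(content, str):
            return list(api_messages)  # assumption: only plain strings are merged
        if content:
            trailing_text.append(content)

    merged_tool = dict(api_messages[last_tool])  # copy, do not mutate the input
    base = merged_tool.get("content") or ""
    merged_tool["content"] = "\n\n".join(p for p in [base, *trailing_text] if p)
    return [*api_messages[:last_tool], merged_tool]
```

The returned list ends on the tool message, so the request no longer contains a tool → user transition.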

4. Invoked at both `api_messages` assembly sites

Main agent loop (after the per-message copy+sanitize pass).
flush_memories().

Testing

Unit tests (new file `tests/run_agent/test_kilo_deepseek_thinking.py`)

20 tests covering:

_needs_deepseek_thinking_tool_merge() matrix: kilo × [v4-pro, v4-flash, deepseek-reasoner, deepseek-chat, non-deepseek] + openrouter-with-deepseek (must be False).
_merge_post_tool_text_into_tool(): empty list, no tool, no trailing, single trailing user, multiple trailing users, mixed asst-text + user trailing, trailing asst-with-tool-calls (must bail), multi-round with merge only on last, input immutability.
_copy_reasoning_content_for_api() kilo+deepseek branch: "." injection, preservation of explicit reasoning_content, conversion from reasoning, non-injection on non-kilo, non-injection without tool_calls.

All 20 pass. Existing tests/run_agent/test_provider_parity.py (76 tests) still passes.

End-to-end verification

I replayed a real failing request_dump_*.json captured from a user-session (DeepSeek v4-pro via kilo, mid-agent-loop with 7 trailing user messages after the last tool result — caused by plugin context injection). Before fix: HTTP 400. After fix: HTTP 200 with proper content and reasoning fields returned by the model.

Interaction with #14941, #15228

Compatible. This PR only fires when base_url is api.kilo.ai; #15228 paths run for api.deepseek.com. They do not conflict at runtime. If #15228 merges first, a small rebase of this PR removes any duplicated condition around _copy_reasoning_content_for_api. If this merges first, #15228 can follow up unchanged.

References

DeepSeek Thinking Mode Docs
Kilo gateway request-helpers (field stripping)
Roo Code's mergeToolResultText workaround
Related issues in other agents: SillyTavern #4857, Continue.dev #8989, Roo-Code #10171.

When forwarding to upstream DeepSeek, the Kilo gateway (https://api.kilo.ai/api/gateway) strips reasoning/reasoning_content/reasoning_details (see Kilo-Org/cloud:apps/web/src/lib/ai-gateway/providers/openrouter/request-helpers.ts, removeChatCompletionsReasoning). As a result, DeepSeek v3.2+ thinking models (deepseek-v4-pro, v4-flash, deepseek-reasoner) return 400 'The reasoning_content in the thinking mode must be passed back to the API' in two cases:

1) An assistant message with tool_calls lacks a non-empty reasoning_content (typical when a session is carried over from the Codex API, where reasoning is stored in encrypted codex_reasoning_items and cannot be forwarded);
2) A user message (or a chain of user messages from plugin context injection) follows the last tool result.

The fix follows Roo Code (convertToR1Format with mergeToolResultText: true):

- _needs_deepseek_thinking_tool_merge() detects the combination of api.kilo.ai and a model containing 'deepseek' (except deepseek-chat, which is non-thinking).
- _copy_reasoning_content_for_api is extended: for Kilo + DeepSeek thinking, an assistant message with tool_calls and no reasoning gets reasoning_content='.' (the gateway drops an empty string as falsy, so at least one character is needed).
- _merge_post_tool_text_into_tool() merges user/assistant text after the last tool message directly into its content and truncates the tail, so the structure contains no tool -> user.

Calls are added at both api_messages assembly sites: the main agent loop and flush_memories.

References:
- https://api-docs.deepseek.com/guides/thinking_mode
- RooCodeInc/Roo-Code#10171
- SillyTavern/SillyTavern#4857

Tests: tests/run_agent/test_kilo_deepseek_thinking.py (20 tests). The real failing request_dump returns HTTP 200 after the fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

alt-glitch · 2026-04-24T20:25:08Z

Related to #15228 and #15250 but distinct — this addresses the Kilo gateway specifically stripping reasoning_content fields and rejecting tool→user message shapes, requiring a different workaround than direct DeepSeek API fixes.

E2E tests through api.kilo.ai showed that the first version of the fix (PR NousResearch#15323) covers only a subset of cases: DeepSeek thinking models behind the Kilo gateway need more.

What was added:

1. reasoning_content="." is now injected into ANY assistant message (not only an assistant message with tool_calls). If the history contains at least one tool result, a subsequent plain-text assistant message without rc also triggers the 400.
2. _merge_post_tool_text_into_tool is reworked: it now merges EVERY user message that appears in the history after a tool result into the nearest preceding tool message. Previously only trailing user messages after the last tool were merged. The new version covers the real Hermes scenario in which the user enters a new request between two tool cycles (the old version left this case untouched, and Kilo returned 400 even with rc=".").
3. Assistant text between a tool message and a user message stays in place (it does not break the structure), but the user message is still merged into the tool message: e2e runs showed that assistant text between tool and user does not protect against the 400.

Tests updated (test_kilo_deepseek_thinking.py): added test_user_between_tool_cycles_merged_into_prev_tool, test_asst_text_between_tool_and_user_preserved, test_first_user_before_any_tool_preserved, test_dot_injected_on_assistant_text_without_tool_calls, test_no_injection_on_non_assistant_role.
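
The reworked behaviour from this second commit can be sketched as below. The function names, the blank-line separator, and the string-only content handling are assumptions for illustration; the actual code in the PR branch may differ.

```python
from typing import Any, Optional

Message = dict[str, Any]


def merge_users_into_prev_tool(api_messages: list[Message]) -> list[Message]:
    """Fold every user message that follows a tool result into the nearest
    preceding tool message. Assistant text stays where it is, and user
    messages before the first tool result are left untouched."""
    out: list[Message] = []
    last_tool_idx: Optional[int] = None  # position in `out` of the latest tool message
    for msg in api_messages:
        role = msg.get("role")
        text = msg.get("content")
        if role == "user" and last_tool_idx is not None:
            prev = out[last_tool_idx].get("content") or ""
            if isinstance(text, str) and isinstance(prev, str):
                tool = dict(out[last_tool_idx])  # copy so the input is not mutated
                tool["content"] = "\n\n".join(p for p in (prev, text) if p)
                out[last_tool_idx] = tool
                continue  # the user message is absorbed, not emitted
        out.append(msg)
        if role == "tool":
            last_tool_idx = len(out) - 1
    return out


def pad_assistant_reasoning(msg: Message) -> Message:
    """Give any assistant message without reasoning_content the "." placeholder."""
    if msg.get("role") != "assistant" or msg.get("reasoning_content"):
        return msg
    return {**msg, "reasoning_content": "."}
```

With a history of user, assistant (tool_calls), tool, user, assistant (tool_calls), tool, the second user message ends up inside the first tool message and the two tool cycles become adjacent.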

teknium1 · 2026-04-27T10:39:03Z

Closing as redundant — the DeepSeek reasoning_content thinking-mode 400 and cross-provider leak chain of issues is now fully covered on main:

Direct provider detection + same-provider tool-call pad: commits 93a2d6b, d58b305, ad0ac89, 5ae6081
All-assistant-messages pad rule: commit ad0ac89
Ordering / cross-provider isolation: commit 9daa062 + regression guard 63bf7a2
Cross-provider leak (MiniMax reasoning → DeepSeek): PR #16500, "fix(agent): block cross-provider reasoning leak to DeepSeek/Kimi (#15748)" (merging shortly)

21 regression tests in tests/run_agent/test_deepseek_reasoning_content_echo.py + 2 new tests for the cross-provider scenario exercise every known path. Thanks for the submission — appreciate the digging on this area.

alt-glitch added the labels type/bug (Something isn't working), P1 (High: major feature broken, no workaround), comp/agent (Core agent loop, run_agent.py, prompt builder), and provider/kilo (Kilo Code) on Apr 24, 2026

alt-glitch mentioned this pull request on Apr 24, 2026 in "fix(deepseek): inject empty reasoning_content on replay for OpenRouter DeepSeek" #15325 (closed)

teknium1 closed this Apr 27, 2026

asin76-svg wants to merge 2 commits into NousResearch:main from asin76-svg:fix/kilo-deepseek-thinking-mode
