fix(subagents): use streaming LLM path so reasoning content isn't dropped #651
Merged
The non-streaming GetResponseAsync call drops reasoning content for providers that emit it separately (e.g., Qwen <think> blocks surface as TextReasoningContent only in streaming mode), leaving the subagent's assistant message with no TextContent. ExtractText then returned the literal string "(no response)" and Complete reported success=True with 13 characters of output, so the parent session received an empty tool result and commonly hallucinated "subagent still processing" from it.

Switch InvokeLlmAsync to GetStreamingResponseAsync + updates.ToChatResponse() to match LlmSessionActor's streaming path without emitting deltas to the parent, and surface empty final responses as failure instead of fake success so parent sessions see a real error.

Update the SubAgents FakeChatClient to delegate GetStreamingResponseAsync through GetResponseAsync + ToChatResponseUpdates so existing test cases keep working against the streaming path.

Verified live against Qwen3.5-27B-UD: research-assistant now produces real cited findings (1065 chars, 5 iterations, 50s) where the broken path previously returned 13 chars and iterations=0.
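The control flow described above can be modeled in a few lines. The sketch below is written in Python purely as a language-neutral illustration; the actual change is C# and none of these names come from the project. `TextContent`, `ReasoningContent`, `Result`, `fake_stream`, `collect`, `extract_text`, and `complete` are all hypothetical stand-ins. It assumes a provider stream that yields reasoning chunks and answer chunks separately, drains that stream without forwarding deltas, and treats the "(no response)" marker as a failure rather than a 13-character success.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List, Union

# Literal marker named in the PR text; everything else here is hypothetical.
EMPTY_MARKER = "(no response)"


@dataclass
class TextContent:
    text: str


@dataclass
class ReasoningContent:
    text: str


Content = Union[TextContent, ReasoningContent]


@dataclass
class Result:
    success: bool
    output: str


async def fake_stream() -> AsyncIterator[Content]:
    """Hypothetical provider stream: reasoning chunks first, then answer chunks."""
    yield ReasoningContent("plan the search")
    yield TextContent("Finding 1 ")
    yield TextContent("[source A]")


async def collect(stream: AsyncIterator[Content]) -> List[Content]:
    """Drain the stream without forwarding deltas, merging adjacent chunks of the same type."""
    merged: List[Content] = []
    async for chunk in stream:
        if merged and type(merged[-1]) is type(chunk):
            merged[-1] = type(chunk)(merged[-1].text + chunk.text)
        else:
            merged.append(chunk)
    return merged


def extract_text(contents: List[Content]) -> str:
    """Join only the answer text; fall back to the marker when there is none."""
    text = "".join(c.text for c in contents if isinstance(c, TextContent))
    return text if text.strip() else EMPTY_MARKER


def complete(contents: List[Content]) -> Result:
    """Report failure when only the empty-response marker is available."""
    output = extract_text(contents)
    if output == EMPTY_MARKER:
        return Result(False, "subagent produced no text content")
    return Result(True, output)


if __name__ == "__main__":
    contents = asyncio.run(collect(fake_stream()))
    print(complete(contents))
    print(complete([ReasoningContent("only thinking")]))
```

The design point is the last function: a response containing only reasoning content no longer reaches the caller as a successful 13-character string, so the parent sees an explicit error it can act on.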
Summary
- Switch SubAgentActor.InvokeLlmAsync from GetResponseAsync to GetStreamingResponseAsync + updates.ToChatResponse(), matching the streaming path LlmSessionActor already uses via SessionLlmInvoker.StreamAsync. Non-streaming was dropping TextReasoningContent for providers that emit thinking blocks (e.g., Qwen), leaving the subagent's final assistant message with no TextContent and causing ExtractText to return the literal string "(no response)" while Complete reported success=True with 13 chars of output.
- When ExtractText returns the empty-response marker, surface it as Complete(false, ...) with a loud warning instead of fake success. Parent sessions used to receive (no response) as a tool result and hallucinate "subagent still processing" from it.
- Update the SubAgents FakeChatClient to delegate GetStreamingResponseAsync through GetResponseAsync + ToChatResponseUpdates() so existing test cases exercise the new streaming path.

Why
Pre-fix daemon logs and live smoke testing both reproduced the symptom:
research-assistant and code-analyst subagent spawns frequently returned 13-character outputs (the exact length of "(no response)") while the actor reported success and iterations=0. From the parent LLM's perspective, spawn_agent appeared to return nothing, leading the frontline model to fabricate "the subagent is still processing" responses and eventually to stop reaching for the tool. This is the root cause of the "launch pad crash" and "0% success rate" symptoms reported in #264.

Test plan
- dotnet test src/Netclaw.Actors.Tests --filter "FullyQualifiedName~SubAgent": 25/25 green
- … success=True, output=1065 chars, iterations=5, duration=49958ms with real cited findings, vs. the broken path's success=True, output=13 chars, iterations=0
- … web_search/web_fetch/file_read work before terminating naturally on a second successful run

Related follow-ups (not closed by this PR)
- … AGENTS.md (adoption gap, orthogonal to this streaming fix)

Closes #264