fix(subagents): use streaming LLM path so reasoning content isn't dropped#651

Merged
Aaronontheweb merged 2 commits into dev from claude-wt-subagents
Apr 14, 2026


@Aaronontheweb Aaronontheweb commented Apr 14, 2026

Collaborator

Summary

  • Switch SubAgentActor.InvokeLlmAsync from GetResponseAsync to GetStreamingResponseAsync + updates.ToChatResponse(), matching the streaming path LlmSessionActor already uses via SessionLlmInvoker.StreamAsync. Non-streaming was dropping TextReasoningContent for providers that emit thinking blocks (e.g., Qwen), leaving the subagent's final assistant message with no TextContent and causing ExtractText to return the literal string "(no response)" while Complete reported success=True with 13 chars of output.
  • Guard against future empty-response regressions: if ExtractText returns the empty-response marker, surface as Complete(false, ...) with a loud warning instead of fake success. Parent sessions used to receive (no response) as a tool result and hallucinate "subagent still processing" from it.
  • Update SubAgents FakeChatClient to delegate GetStreamingResponseAsync through GetResponseAsync + ToChatResponseUpdates() so existing test cases exercise the new streaming path.
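
The empty-response guard in the second bullet can be sketched roughly as below. This is a minimal, dependency-free illustration and not the actual `SubAgentActor` code: the `ExtractText` and `Complete` functions here are hypothetical stand-ins with simplified signatures, and the only detail taken from this PR is the literal `"(no response)"` marker and the intent to report failure rather than success when it appears.

```csharp
using System;

// Assumption: the marker matches the literal "(no response)" quoted in this PR.
const string EmptyResponseMarker = "(no response)";

// Hypothetical stand-in for ExtractText: falls back to the marker when the
// assistant message carries no usable text.
string ExtractText(string? assistantText) =>
    string.IsNullOrWhiteSpace(assistantText) ? EmptyResponseMarker : assistantText;

// Hypothetical stand-in for Complete(success, output): the guard maps the
// marker to a failure instead of reporting success with marker-only output.
(bool Success, string Output) Complete(string extracted)
{
    if (extracted == EmptyResponseMarker)
    {
        // Warn loudly so the regression is visible in logs.
        Console.Error.WriteLine("WARN: subagent produced an empty final response");
        return (false, "subagent returned an empty final response");
    }
    return (true, extracted);
}

// Usage: an empty final message now fails; a real one still succeeds.
var empty = Complete(ExtractText(null));
var real = Complete(ExtractText("cited findings..."));
Console.WriteLine($"empty: success={empty.Success}");
Console.WriteLine($"real: success={real.Success}, chars={real.Output.Length}");
```

Comparing against a single shared constant keeps the guard and the fallback in sync, so the parent session sees an explicit error rather than a 13-character placeholder.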

Why

Pre-fix daemon logs and live smoke testing both reproduced the symptom: research-assistant and code-analyst subagent spawns frequently returned 13-character outputs (exact length of "(no response)") while the actor reported success and iterations=0. From the parent LLM's perspective, spawn_agent appeared to return nothing, leading the frontline model to fabricate "the subagent is still processing" responses and to eventually stop reaching for the tool. This is the root cause of the "launch pad crash" and "0% success rate" symptoms reported in #264.

Test plan

  • dotnet test src/Netclaw.Actors.Tests --filter "FullyQualifiedName~SubAgent" — 25/25 green
  • Live smoke test (Phase 0.1): explicit-delegation prompt against Qwen3.5-27B-UD produced success=True, output=1065 chars, iterations=5, duration=49958ms with real cited findings, vs. the broken path's success=True, output=13 chars, iterations=0
  • Live daemon deployment via local binary swap; sub-agent performed 10+ tool iterations of real web_search / web_fetch / file_read work before terminating naturally on a second successful run
  • Phase 0.2 organic-delegation test already run: confirmed the streaming fix is orthogonal to the adoption gap; Qwen still does the research in-process instead of delegating. Adoption work is tracked in #522 (Expand subagent delegation guidance) and in the new sub-agent format work filed as #647 (feat(subagents): single-file markdown format + contextual prompt + project-scoped discovery).

Related follow-ups (not closed by this PR)

Closes #264

…pped

The non-streaming GetResponseAsync call drops reasoning content for providers
that emit it separately (e.g., Qwen <think> blocks surface as
TextReasoningContent only in streaming mode), leaving the subagent's assistant
message with no TextContent. ExtractText then returned the literal string
"(no response)" and Complete reported success=True with 13 characters of
output, so the parent session received an empty tool result and commonly
hallucinated "subagent still processing" from it.

Switch InvokeLlmAsync to GetStreamingResponseAsync + updates.ToChatResponse()
to match LlmSessionActor's streaming path without emitting deltas to the
parent, and surface empty final responses as failure instead of fake success
so parent sessions see a real error.

Update the SubAgents FakeChatClient to delegate GetStreamingResponseAsync
through GetResponseAsync + ToChatResponseUpdates so existing test cases keep
working against the streaming path.

Verified live against Qwen3.5-27B-UD: research-assistant now produces real
cited findings (1065 chars, 5 iterations, 50s) where the broken path
previously returned 13 chars and iterations=0.
@Aaronontheweb Aaronontheweb added the bug (Something isn't working) and sessions (LLM session actor, turn lifecycle, pipelines) labels Apr 14, 2026
@Aaronontheweb Aaronontheweb merged commit a543b52 into dev Apr 14, 2026
4 of 5 checks passed
@Aaronontheweb Aaronontheweb deleted the claude-wt-subagents branch April 14, 2026 00:30
@Aaronontheweb Aaronontheweb added the subagents (spawn_agent, SubAgentActor, definition loader, discovery context layer, and related features) label Apr 14, 2026

Development

Successfully merging this pull request may close these issues.

🐛 Subagent spawn_agent never executes - 0% success rate