fix(sessions): stop tool-loaded images from being dropped before the next LLM call #1265
Merged
Aaronontheweb merged 1 commit on Jun 1, 2026
…next LLM call

The streaming tool-completion path (LlmSessionActor.ApplyToolCallRecorded / CompleteToolBatch) handed its mutable _pendingModelInputMediaReferences accumulator to the model-input media nudge and then Clear()ed that same list instance. SerializableChatMessage stored the reference without copying, so the Clear() emptied the nudge's media references before the next LLM call hydrated them. The model was told "Image loaded for model-visible inspection" but never received the image bytes, and confabulated the contents.

Fix: defensively snapshot the caller's media list ([.. mediaReferences]) in SessionState.BuildNudgeMessage and AddUserMessage so the immutable message owns its own copy regardless of caller behavior.

Tests:
- unit regression tests (caller reuses/clears the list) in SessionStateTests
- an actor-level integration test that drives a streaming file_read image load and asserts image DataContent reaches the chat client on the next LLM call; it fails without the snapshot

Verified end-to-end against a vision model: with the fix the image bytes reach the model and it describes the image correctly instead of hallucinating.

Fixes netclaw-dev#1264
Summary
Multimodal file_read shipped in v0.22.0, but on the main streaming session path a tool-loaded image silently never reached the model: the agent was told "Image loaded for model-visible inspection" and then confabulated the contents (e.g. shown the akka.net logo, it described an invented "NetClaw dashboard wireframe").

Fixes #1264.
Root cause
The streaming tool-completion path hands its mutable accumulator to the media nudge and immediately clears it:
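A minimal sketch of that hand-off pattern, written with hypothetical stand-in types rather than the actual Netclaw code. SketchMessage, AliasingSketch, and the media/logo.png reference are assumptions made for illustration; only the shape of the hazard (store the caller's list, then clear it) comes from this PR.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a persisted chat message; not the real SerializableChatMessage.
public sealed record SketchMessage
{
    public string Text { get; init; } = "";

    // An init-only property blocks reassignment, but not mutation of the list it points to.
    public IReadOnlyList<string> MediaReferences { get; init; } = Array.Empty<string>();
}

public static class AliasingSketch
{
    // Builds a nudge from a mutable accumulator, clears the accumulator,
    // and reports how many media references the nudge still carries.
    public static int CountAfterClear()
    {
        var pending = new List<string> { "media/logo.png" }; // mutable accumulator

        // No defensive copy: the message and the caller share one List instance.
        var nudge = new SketchMessage
        {
            Text = "Image loaded for model-visible inspection",
            MediaReferences = pending
        };

        pending.Clear(); // caller resets its accumulator for the next batch

        return nudge.MediaReferences.Count; // 0: the message lost its media reference
    }
}
```

In this sketch the message text survives while the media reference does not, which matches the symptom reported here: the nudge says an image was loaded, but nothing is left to hydrate.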
SessionState.BuildNudgeMessage / AddUserMessage stored that list by reference into SerializableChatMessage.MediaReferences (an init property with no defensive copy). The .Clear() then emptied the nudge's media references before FireLlmCall → SessionMessageAssembler.Assemble → ChatMessageConverter.ToAiMessage hydrated them into DataContent. Net effect: text-only message, no image, hallucination.

Why it was sneaky:
… DataContent immediately at nudge-creation; the main session defers hydration to assembly time (for prefix-cache stability), leaving a window for the Clear() to corrupt the aliased list.

Fix
Defensively snapshot the caller's list ([.. mediaReferences]) at the two SessionState message constructors, so the immutable persistence message owns its own copy regardless of caller behavior. Two-line production change, with comments documenting the aliasing/clear hazard.

Verification
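The property being verified (a message keeps its media references after the caller clears its list) can be sketched in isolation. This is an illustration under assumptions, not the Netclaw test code: SnapshotMessage, SnapshotSketch, BuildNudge, and the media/logo.png reference are hypothetical, and only the [.. mediaReferences] spread is taken from the fix. It assumes C# 12 or later for collection expressions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a persisted chat message; not the real SerializableChatMessage.
public sealed record SnapshotMessage
{
    public string Text { get; init; } = "";
    public IReadOnlyList<string> MediaReferences { get; init; } = Array.Empty<string>();
}

public static class SnapshotSketch
{
    // Hypothetical builder; the real change is in the SessionState message constructors.
    public static SnapshotMessage BuildNudge(string text, IReadOnlyList<string> mediaReferences) =>
        new SnapshotMessage
        {
            Text = text,
            // The spread copies the elements into a new collection owned by the message.
            MediaReferences = [.. mediaReferences]
        };

    // Regression-style check: the caller clears its list after the hand-off,
    // and the message must still carry the original reference.
    public static bool SurvivesCallerClear()
    {
        var pending = new List<string> { "media/logo.png" };
        var nudge = BuildNudge("Image loaded for model-visible inspection", pending);

        pending.Clear();

        return nudge.MediaReferences.Count == 1
            && nudge.MediaReferences[0] == "media/logo.png";
    }
}
```

Copying at the point where the message is built, rather than asking every caller to stop reusing its list, keeps the guarantee in one place, which is the design choice this PR describes.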
- Unit regression tests (SessionStateTests): build a List, hand it to AddSystemNudge / AddUserMessage, .Clear() it, assert the message still carries the media reference. Fail without the snapshot.
- Integration test (LlmSessionImageDeliveryTests): drives a streaming file_read image load through LlmSessionActor and asserts image DataContent reaches the chat client on the next LLM call. Confirmed it fails without the fix (exact message: "Tool-loaded image never reached the model…") and passes with it; this was the missing coverage that let the bug ship.
- End-to-end against a vision model: with the fix, the image bytes reach the model, with DataContent mediaType=image/png on the wire.
- Netclaw.Actors.Tests suite: 2175 passing.
- dotnet slopwatch analyze: 0 issues.
- Copyright headers verified.

Files
- src/Netclaw.Actors/Sessions/SessionState.cs: the fix
- src/Netclaw.Actors.Tests/Sessions/SessionStateTests.cs: unit regression tests
- src/Netclaw.Actors.Tests/Sessions/LlmSessionImageDeliveryTests.cs: new integration test

Follow-up
Audio/video model input is out of scope for this fix (images only today) and is tracked separately in #1266.