Bug Description
In streaming mode, the final assistant response can be silently dropped — the response is generated and stored in the database, but never delivered to the user. The user sees tool progress messages but the final answer never arrives.
Root Cause
Two locations in stream_consumer.py set final_response_sent = True without verifying the final response was actually delivered:
- Line 325: self._final_response_sent = self._already_sent after chunked send. If earlier tool-progress messages set _already_sent = True but the final content chunk was not sent (e.g. empty accumulator after segment break), this incorrectly marks the final response as delivered.
- Lines 417-418: on CancelledError (e.g. the 5-second stream_task timeout in gateway/run.py), the consumer sets _final_response_sent = True whenever _already_sent is truthy. This conflates "some content was streamed earlier" with "the final response was delivered."
Then in gateway/run.py:4085-4092, the gateway checks already_sent and returns None (skipping independent delivery), trusting that the stream consumer already delivered the response.
# gateway/run.py:4085
if agent_result.get("already_sent") and not agent_result.get("failed"):
    if response:
        await self._deliver_media_from_response(response, event, _media_adapter)
    return None  # <-- final response silently dropped
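To make the failure mode concrete, here is a minimal standalone model of the check quoted above. It is only a sketch: gateway_decision is a hypothetical helper, not a function in gateway/run.py, and media delivery is left out.

```python
from typing import Optional


def gateway_decision(agent_result: dict, response: str) -> Optional[str]:
    """Hypothetical stand-in for the gateway check quoted in this issue.

    Returns the text the gateway would deliver itself, or None when it
    skips independent delivery.
    """
    if agent_result.get("already_sent") and not agent_result.get("failed"):
        return None  # gateway trusts the stream consumer
    return response


# Tool-progress edits set already_sent, but the final text never went out.
agent_result = {"already_sent": True, "failed": False}
delivered = gateway_decision(agent_result, "final answer")
```

With these inputs, delivered is None, which matches the observed behavior: nothing in the check distinguishes earlier progress output from the final response.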
Steps to Reproduce
- Enable streaming mode (streaming: true in config.yaml)
- Send a message that triggers multiple tool calls followed by a text response
- Observe that tool progress messages appear, but the final assistant response is missing
- Check the database: the complete response is stored with finish_reason=stop
Evidence
Live reproduction: message id=6882 contains a complete 349-char response stored in the database but never delivered to the Telegram user. The stream consumer had already_sent=True from earlier tool-progress edits, causing the gateway to skip independent delivery.
Expected Behavior
The final assistant response should always be delivered to the user, regardless of whether earlier streaming content was sent.
Proposed Fix
Add a validation gate in gateway/run.py before return None: only skip independent delivery when final_response_sent is explicitly True (meaning the stream consumer confirmed it delivered the final response content, not just earlier progress messages).
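A rough sketch of what such a gate could look like, assuming the stream consumer's flag is exposed to the gateway under a final_response_sent key in agent_result (that key name and the helper name are assumptions for illustration, not existing code):

```python
def should_skip_independent_delivery(agent_result: dict) -> bool:
    """Hypothetical gate: skip gateway delivery only on explicit confirmation.

    Assumes agent_result carries a "final_response_sent" key populated by
    the stream consumer; "already_sent" alone is deliberately ignored.
    """
    if agent_result.get("failed"):
        return False
    # Require the literal True, so a missing or merely truthy value
    # falls through to independent delivery.
    return agent_result.get("final_response_sent") is True
```

The existing already_sent check would then be replaced by (or combined with) this predicate before return None, so that progress-only streams fall through to normal delivery.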
Additionally, stream_consumer.py should only set _final_response_sent = True when it can confirm the final accumulated text was actually sent (non-empty accumulator successfully delivered), not merely because _already_sent was set by earlier messages.
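The consumer-side rule might be sketched as follows. This is not the real stream_consumer.py: FinalDeliveryTracker, feed, send_progress, and flush_final are hypothetical names, and the send callback returning a bool on success is an assumption.

```python
import asyncio
from typing import Awaitable, Callable

Send = Callable[[str], Awaitable[bool]]  # assumed: resolves to True on success


class FinalDeliveryTracker:
    """Hypothetical model of the two flags discussed in this issue."""

    def __init__(self) -> None:
        self._already_sent = False
        self._final_response_sent = False
        self._accumulated = ""

    def feed(self, delta: str) -> None:
        self._accumulated += delta

    async def send_progress(self, send: Send, text: str) -> None:
        # Progress output only ever touches _already_sent.
        if await send(text):
            self._already_sent = True

    async def flush_final(self, send: Send) -> bool:
        text = self._accumulated.strip()
        if not text:
            return False  # nothing to deliver, so nothing to confirm
        # If send raises (including CancelledError), the flag stays False.
        if await send(text):
            self._already_sent = True
            self._final_response_sent = True
        return self._final_response_sent


async def demo():
    sent = []

    async def send(text: str) -> bool:
        sent.append(text)
        return True

    tracker = FinalDeliveryTracker()
    await tracker.send_progress(send, "running tool...")
    empty_flush = await tracker.flush_final(send)  # empty accumulator
    tracker.feed("final answer")
    full_flush = await tracker.flush_final(send)
    return empty_flush, full_flush, sent


result = asyncio.run(demo())
```

The point of the design is that _final_response_sent is written in exactly one place, after a confirmed send of non-empty text, so neither the chunked-send path nor the CancelledError path can infer it from _already_sent.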
Environment
- OS: Ubuntu 24.04 (x86_64)
- Python: 3.11
- Hermes: v0.9.0
- Platform: Telegram
- Streaming: enabled
- Config: streaming: true in config.yaml