fix(gateway): streaming mode silently drops final response when already_sent is true #10748

@wpsl5168

Bug Description

In streaming mode, the final assistant response can be silently dropped — the response is generated and stored in the database, but never delivered to the user. The user sees tool progress messages but the final answer never arrives.

Root Cause

Two locations in stream_consumer.py set _final_response_sent = True without verifying that the final response was actually delivered:

  1. Line 325: self._final_response_sent = self._already_sent after the chunked send. If earlier tool-progress messages set _already_sent = True but the final content chunk was not sent (e.g. empty accumulator after a segment break), this incorrectly marks the final response as delivered.

  2. Lines 417-418: On CancelledError (e.g. the 5-second stream_task timeout in gateway/run.py), the consumer sets _final_response_sent = True whenever _already_sent is truthy. This conflates "some content was streamed earlier" with "the final response was delivered."

Then in gateway/run.py:4085-4092, the gateway checks already_sent and returns None (skipping independent delivery), trusting that the stream consumer already delivered the response.

# gateway/run.py:4085
if agent_result.get("already_sent") and not agent_result.get("failed"):
    if response:
        await self._deliver_media_from_response(response, event, _media_adapter)
    return None  # <-- final response silently dropped
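
The interaction between the two flags can be modelled in isolation. The sketch below is a simplified toy, not the actual Hermes code: ToyConsumer, send_progress, segment_break, finish and gateway_should_skip_delivery are hypothetical names, and the accumulator reset on a segment break is an assumption. Only the flag assignment and the gateway condition mirror what is quoted in this issue.

```python
class ToyConsumer:
    """Hypothetical stand-in for the stream consumer; only the flag logic matters."""

    def __init__(self):
        self.delivered = []               # messages that actually reached the user
        self._accumulated = ""            # text buffered for the final response
        self._already_sent = False
        self._final_response_sent = False

    def send_progress(self, text):
        # Tool-progress messages mark the stream as "already sent".
        self.delivered.append(text)
        self._already_sent = True

    def segment_break(self):
        # Assumption: a segment break leaves the accumulator empty.
        self._accumulated = ""

    def finish(self):
        if self._accumulated:
            self.delivered.append(self._accumulated)
        # Mirrors the reported assignment: delivery is inferred from _already_sent.
        self._final_response_sent = self._already_sent


def gateway_should_skip_delivery(result):
    # Mirrors the condition quoted from gateway/run.py.
    return bool(result.get("already_sent") and not result.get("failed"))


consumer = ToyConsumer()
consumer.send_progress("running tool: search")
consumer.segment_break()
consumer.finish()

result = {"already_sent": consumer._already_sent, "failed": False}
final_text = "Here is the answer."
```

In this toy run the consumer reports the final response as sent and the gateway condition chooses to skip delivery, even though final_text never appears in consumer.delivered.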

Steps to Reproduce

  1. Enable streaming mode (streaming: true in config.yaml)
  2. Send a message that triggers multiple tool calls followed by a text response
  3. Observe that tool progress messages appear, but the final assistant response is missing
  4. Check the database — the complete response is stored with finish_reason=stop

Evidence

Live reproduction: message id=6882 contains a complete 349-char response stored in the database but never delivered to the Telegram user. The stream consumer had already_sent=True from earlier tool-progress edits, causing the gateway to skip independent delivery.

Expected Behavior

The final assistant response should always be delivered to the user, regardless of whether earlier streaming content was sent.

Proposed Fix

Add a validation gate in gateway/run.py before return None: only skip independent delivery when final_response_sent is explicitly True (meaning the stream consumer confirmed it delivered the final response content, not just earlier progress messages).
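
A minimal sketch of such a gate, assuming the agent result dict exposes a final_response_sent key next to already_sent and failed. The helper name should_skip_independent_delivery is hypothetical and the real result shape may differ.

```python
def should_skip_independent_delivery(agent_result: dict) -> bool:
    """Return True only when the consumer confirmed the final response went out."""
    if agent_result.get("failed"):
        return False
    # Strict identity check: a missing key, None, or already_sent alone does not count.
    return agent_result.get("final_response_sent") is True


# Earlier progress was streamed, but the final response was never confirmed.
progress_only = {"already_sent": True, "failed": False}

# The consumer explicitly confirmed delivery of the final response.
confirmed = {"already_sent": True, "final_response_sent": True, "failed": False}
```

With a check like this, the progress_only case falls through to independent delivery instead of returning None.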

Additionally, stream_consumer.py should only set _final_response_sent = True when it can confirm the final accumulated text was actually sent (non-empty accumulator successfully delivered), not merely because _already_sent was set by earlier messages.
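
One way the consumer side could look, as a sketch only. FinalDeliveryTracker, append and flush_final are hypothetical names, and the send callable is assumed to be an async function that returns a truthy value on successful delivery. The actual stream_consumer.py internals are not shown in this issue.

```python
import asyncio


class FinalDeliveryTracker:
    """Hypothetical sketch: mark the final response as sent only after a confirmed send."""

    def __init__(self, send):
        self._send = send                 # assumed: async callable, truthy on success
        self._accumulated = ""
        self._already_sent = False        # may be set by earlier progress messages
        self._final_response_sent = False

    def append(self, text: str) -> None:
        self._accumulated += text

    async def flush_final(self) -> bool:
        text = self._accumulated.strip()
        if not text:
            # Nothing to deliver, so nothing can be confirmed; the flag stays False
            # regardless of what _already_sent says.
            return False
        ok = bool(await self._send(text))
        if ok:
            self._already_sent = True
            self._final_response_sent = True
        return ok
```

The same rule would apply on the CancelledError path: leave the flag untouched unless a final send was actually confirmed.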

Environment

  • OS: Ubuntu 24.04 (x86_64)
  • Python: 3.11
  • Hermes: v0.9.0
  • Platform: Telegram
  • Streaming: enabled
  • Config: streaming: true in config.yaml

Metadata

Labels: P1 (High: major feature broken, no workaround), comp/gateway (Gateway runner, session dispatch, delivery), type/bug (Something isn't working)
