[Bug]: Streaming /v1/responses cannot be recovered after client disconnect because partial/in-progress responses are not persisted #15026

@zhboner

Bug Description

When using the OpenAI-compatible POST /v1/responses endpoint with:

{
  "stream": true,
  "store": true
}

the server currently only persists the response to ResponseStore after the streaming run completes successfully.

If the client disconnects mid-stream (for example, the app is killed, the network is interrupted, the app moves between foreground and background, or the SSE connection drops), Hermes interrupts and cancels the agent task. The partially generated response is not stored, and a subsequent:

GET /v1/responses/{response_id}

returns 404 Response not found.

This makes it impossible for clients to implement ChatGPT/Claude-like recovery after stream interruption, even though the client already received a response.created event with a valid response_id.
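For context, a client that wants to recover later has to capture the id from response.created as soon as it arrives. A minimal sketch of that step, assuming the SSE framing shown in the reproduction steps below (the helper name extract_response_id is hypothetical, not part of Hermes):

```python
import json
from typing import Iterable, Optional


def extract_response_id(sse_lines: Iterable[str]) -> Optional[str]:
    """Return the id carried by the first response.created event, if any."""
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip "event:" lines and blank separators
        try:
            payload = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue  # ignore data lines that are not JSON
        if isinstance(payload, dict) and payload.get("type") == "response.created":
            return payload.get("response", {}).get("id")
    return None


# Example input shaped like the first frames of a stream.
stream = [
    "event: response.created",
    'data: {"type":"response.created","response":{"id":"resp_xxx"}}',
    "",
]
saved_id = extract_response_id(stream)  # persist this before reading deltas
```

With the current behavior, the saved id is only useful if the same stream later reaches response.completed.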

Steps to Reproduce

  1. Start a streaming response:
curl http://127.0.0.1:8642/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "hermes-agent",
    "input": "Do something that takes a while or calls tools.",
    "stream": true,
    "store": true
  }'
  2. Wait until the client receives:
event: response.created
data: {"type":"response.created","response":{"id":"resp_xxx", ...}}
  3. Optionally wait until some text deltas or tool events arrive:
response.output_text.delta
response.output_item.added
response.output_item.done
  4. Disconnect the client before response.completed.

  5. Try to retrieve the response:

curl http://127.0.0.1:8642/v1/responses/resp_xxx \
  -H "Authorization: Bearer $API_KEY"

Expected Behavior

When store=true, after the server emits response.created, the response should be queryable via:

GET /v1/responses/{response_id}

At minimum, the store should contain a recoverable snapshot with a meaningful status:

{
  "id": "resp_xxx",
  "object": "response",
  "status": "in_progress",
  "output": []
}

During streaming, the stored response should be updated as output arrives:

  • response.output_text.delta
  • response.output_text.done
  • response.output_item.added
  • response.output_item.done
  • response.failed
  • response.completed
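To illustrate what "updated as output arrives" could look like, here is a rough sketch of folding stream events into the stored snapshot. It is not how Hermes builds responses today. The function names are hypothetical, and the event fields (output_index, content_index, item, delta, text) are assumed to follow the OpenAI Responses streaming event shapes:

```python
import copy
from typing import Any, Dict


def _text_part(snap: Dict[str, Any], event: Dict[str, Any]) -> Dict[str, Any]:
    """Find (or create) the text part that a text event refers to."""
    item = snap["output"][event["output_index"]]
    parts = item.setdefault("content", [])
    index = event.get("content_index", 0)
    while len(parts) <= index:
        parts.append({"type": "output_text", "text": ""})
    return parts[index]


def apply_event(snapshot: Dict[str, Any], event: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new snapshot with one stream event applied."""
    snap = copy.deepcopy(snapshot)
    kind = event.get("type")
    if kind in ("response.output_item.added", "response.output_item.done"):
        index = event["output_index"]
        while len(snap["output"]) <= index:
            snap["output"].append(None)  # pad so the item lands at its index
        snap["output"][index] = event["item"]
    elif kind == "response.output_text.delta":
        _text_part(snap, event)["text"] += event["delta"]
    elif kind == "response.output_text.done":
        _text_part(snap, event)["text"] = event["text"]
    elif kind == "response.completed":
        snap["status"] = "completed"
    elif kind == "response.failed":
        snap["status"] = "failed"
    return snap


snapshot = {"id": "resp_xxx", "object": "response", "status": "in_progress", "output": []}
events = [
    {"type": "response.output_item.added", "output_index": 0,
     "item": {"type": "message", "id": "msg_1", "content": []}},
    {"type": "response.output_text.delta", "output_index": 0, "content_index": 0, "delta": "Hel"},
    {"type": "response.output_text.delta", "output_index": 0, "content_index": 0, "delta": "lo"},
    {"type": "response.completed"},
]
current = snapshot
for ev in events:
    current = apply_event(current, ev)  # a real store would persist each result
```

Persisting after every delta may be too chatty for some stores. Writing on item boundaries (output_item.done) plus a final write would still make tool calls and completed messages recoverable.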

If the client disconnects before completion, the server should keep the agent task running in the background.

For store=true, the server should not cancel the agent task when the SSE client disconnects. It should let the response run to completion and persist the final response to ResponseStore.

This gives ChatGPT/Claude-like recovery:

  1. client disconnects
  2. server continues generation
  3. client relaunches
  4. client calls GET /v1/responses/{response_id}
  5. final response is available once completed
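The client side of that flow could be a simple poll loop. This is only a sketch: recover_response and http_fetch are hypothetical helpers, the base URL is the one from the reproduction steps, and the set of terminal statuses is taken from the proposed fix further down.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Any, Callable, Dict, Optional

TERMINAL_STATUSES = {"completed", "failed", "incomplete"}


def http_fetch(response_id: str, api_key: str = "",
               base_url: str = "http://127.0.0.1:8642") -> Optional[Dict[str, Any]]:
    """GET /v1/responses/{response_id}; return None on 404."""
    request = urllib.request.Request(
        f"{base_url}/v1/responses/{response_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    try:
        with urllib.request.urlopen(request) as reply:
            return json.loads(reply.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def recover_response(
    fetch: Callable[[str], Optional[Dict[str, Any]]],
    response_id: str,
    attempts: int = 10,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Dict[str, Any]]:
    """Poll until the stored response reaches a terminal status or attempts run out."""
    for attempt in range(attempts):
        body = fetch(response_id)
        if body is not None and body.get("status") in TERMINAL_STATUSES:
            return body
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return None
```

On relaunch the client would call recover_response(http_fetch, saved_id) with the id it stored from response.created. Today this always ends in None after a disconnect, because every GET returns 404.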

Actual Behavior

GET /v1/responses/{response_id} returns:

{
  "error": {
    "message": "Response not found: resp_xxx",
    "type": "invalid_request_error",
    "param": null,
    "code": null
  }
}

The client cannot recover:

  • partial assistant text
  • completed tool calls
  • tool outputs
  • final assistant message

The response id previously emitted by response.created is therefore not useful for recovery unless the stream reaches response.completed.

Affected Component

Gateway (Telegram/Discord/Slack/WhatsApp)

Messaging Platform (if gateway-related)

No response

Debug Report

  Report       https://paste.rs/veGtu
  agent.log    https://paste.rs/3eRBA
  gateway.log  https://paste.rs/BJmsi

Operating System

Ubuntu 26.04

Python Version

3.11.15

Hermes Version

0.11.0

Additional Logs / Traceback (optional)

Root Cause Analysis (optional)

No response

Proposed Fix (optional)

For store=true, decouple the agent task from the SSE connection:

  • SSE connection is only a live transport
  • response execution continues independently
  • ResponseStore is the source of truth
  • GET /v1/responses/{response_id} can return:
    • queued
    • in_progress
    • completed
    • failed
    • incomplete

Client disconnect should not necessarily cancel response execution if the user requested store=true.
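A rough sketch of the decoupling with plain asyncio, to show the idea rather than the actual gateway code. STORE is an in-memory stand-in for ResponseStore, and run_agent, start_response and sse_handler are hypothetical names. The key assumption is that the handler awaits the agent task through asyncio.shield, so cancelling the handler on disconnect does not cancel the task:

```python
import asyncio
from typing import Dict, List

STORE: Dict[str, dict] = {}  # stand-in for ResponseStore (assumption)
_BACKGROUND: set = set()     # keeps strong references to running tasks


async def run_agent(response_id: str, chunks: List[str]) -> None:
    """Produce output independently of any client connection."""
    record = STORE[response_id]
    try:
        for chunk in chunks:
            await asyncio.sleep(0)           # yield, as a real agent step would
            record["output_text"] += chunk   # persist partial output as it arrives
        record["status"] = "completed"
    except Exception:
        record["status"] = "failed"
        raise


def start_response(response_id: str, chunks: List[str]) -> asyncio.Task:
    """Persist an in_progress snapshot, then launch the agent task."""
    STORE[response_id] = {"id": response_id, "status": "in_progress", "output_text": ""}
    task = asyncio.create_task(run_agent(response_id, chunks))
    _BACKGROUND.add(task)
    task.add_done_callback(_BACKGROUND.discard)
    return task


async def sse_handler(response_id: str, chunks: List[str]) -> None:
    """Live transport only: cancelling this handler leaves the agent running."""
    task = start_response(response_id, chunks)
    await asyncio.shield(task)


async def demo() -> dict:
    handler = asyncio.create_task(sse_handler("resp_demo", ["partial ", "answer"]))
    await asyncio.sleep(0)   # let the handler start the agent
    handler.cancel()         # simulate the client disconnecting
    try:
        await handler
    except asyncio.CancelledError:
        pass
    while STORE["resp_demo"]["status"] == "in_progress":
        await asyncio.sleep(0)  # the agent keeps going without the client
    return STORE["resp_demo"]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

For store=false the handler could still cancel the task on disconnect, which would keep the current behavior for clients that never intend to come back.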

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR

Metadata

Assignees

No one assigned

Labels

  • P2: Medium, degraded but workaround exists
  • comp/gateway: Gateway runner, session dispatch, delivery
  • type/bug: Something isn't working
