[Bug]: chat completions SSE does not emit tool completion progress #16588

@loulin

Bug Description

Streaming /v1/chat/completions emits a custom SSE event (event: hermes.tool.progress) when a tool starts, but it never emits a matching completed event carrying the same tool call id.

This leaves API-server clients that render tool lifecycle UI stuck in a running state until the full assistant response ends or until the client guesses completion locally. The issue is easy to miss on IM-style platforms because many adapters render transient progress messages differently, but API-server SSE consumers need explicit lifecycle events to update tool cards correctly.

Steps to Reproduce

  1. Start the API server gateway.
  2. Send a streaming /v1/chat/completions request that causes a tool call.
  3. Inspect the SSE stream.
  4. Observe event: hermes.tool.progress for the tool start, then text deltas/final response, but no status: completed event tied to the same toolCallId.

A minimal regression test can patch _run_agent to call tool_start_callback("call_terminal_1", "terminal", {...}), then tool_complete_callback("call_terminal_1", "terminal", {...}, result), and assert both custom SSE events are emitted.
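Steps 3 and 4 can also be checked mechanically on a captured stream. The sketch below is only an illustration: the helper names parse_sse and unfinished_tool_calls are hypothetical, the sample stream is made up, and the payload fields (toolCallId, tool, status) follow the shape proposed in this issue rather than anything confirmed from the server. Start events without a toolCallId cannot be correlated and are skipped.

```python
import json


def parse_sse(raw: str) -> list[tuple[str, str]]:
    """Split an SSE body into (event_name, data) pairs.

    Events without an explicit "event:" line get the default name "message".
    """
    events = []
    for block in raw.replace("\r\n", "\n").split("\n\n"):
        name, data_lines = "message", []
        for line in block.split("\n"):
            if line.startswith("event:"):
                name = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data_lines.append(line[len("data:"):].strip())
        if data_lines:
            events.append((name, "\n".join(data_lines)))
    return events


def unfinished_tool_calls(raw: str) -> set[str]:
    """Return toolCallIds reported as running but never reported as completed."""
    running, completed = set(), set()
    for name, data in parse_sse(raw):
        if name != "hermes.tool.progress":
            continue  # ordinary chat chunks and "[DONE]" are ignored
        payload = json.loads(data)
        call_id = payload.get("toolCallId")
        if call_id is None:
            continue  # no id, so no correlation is possible
        if payload.get("status") == "completed":
            completed.add(call_id)
        elif payload.get("status") == "running":
            running.add(call_id)
    return running - completed


# Made-up stream mirroring the reported behavior: a start event, text, no completion.
SAMPLE = (
    'event: hermes.tool.progress\n'
    'data: {"toolCallId": "call_terminal_1", "tool": "terminal", "status": "running"}\n\n'
    'data: {"choices": [{"delta": {"content": "done"}}]}\n\n'
    'data: [DONE]\n\n'
)

print(unfinished_tool_calls(SAMPLE))  # {'call_terminal_1'}
```

With a fixed server, the same check should return an empty set for every stream that ran to completion.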

Expected Behavior

For each non-internal tool call in chat-completions streaming, the API server should emit lifecycle progress events with stable correlation:

  • event: hermes.tool.progress with status: running and toolCallId
  • event: hermes.tool.progress with status: completed and the same toolCallId

The events should remain outside normal delta.content so clients do not persist progress markers into conversation history.
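To make the expected wire shape concrete, here is a rough sketch of how the two lifecycle frames could be serialized. The helper names sse_frame and tool_progress_frame are hypothetical, and the tool, input and output field names are assumptions taken from the fix proposed in this issue. Only the event name, status and toolCallId are stated requirements.

```python
import json


def sse_frame(event: str, payload: dict) -> str:
    """Serialize one named SSE event with a single-line JSON data field."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def tool_progress_frame(tool_call_id, name, status, args=None, result=None) -> str:
    """Build a hermes.tool.progress frame (field names are assumptions)."""
    payload = {"toolCallId": tool_call_id, "tool": name, "status": status}
    if args is not None:
        payload["input"] = args
    # Output only makes sense once the tool has finished.
    if status == "completed" and result is not None:
        payload["output"] = result
    return sse_frame("hermes.tool.progress", payload)


running = tool_progress_frame(
    "call_terminal_1", "terminal", "running", args={"command": "ls"}
)
completed = tool_progress_frame(
    "call_terminal_1", "terminal", "completed",
    args={"command": "ls"}, result="README.md",
)
print(running, completed, sep="")
```

Because both frames carry an explicit event name, a client that only reads unnamed data chunks for delta.content never sees them, which is what keeps progress markers out of conversation history.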

Actual Behavior

The chat-completions streaming branch only wires the legacy tool_progress_callback, which receives start-style events without exact call-id correlation. It does not pass tool_start_callback / tool_complete_callback into _run_agent, even though the Responses API streaming branch already uses those callbacks for structured function call lifecycle output.

Root Cause Analysis

In gateway/platforms/api_server.py, the /v1/chat/completions streaming branch defines _on_tool_progress() and passes only tool_progress_callback=_on_tool_progress to _run_agent().

AIAgent already emits exact lifecycle callbacks through tool_start_callback(tool_call_id, name, args) and tool_complete_callback(tool_call_id, name, args, result) in the tool execution paths. Those callbacks are available in the API server adapter and are used by the /v1/responses streaming path, but chat completions does not currently wire them.

Proposed Fix

Wire chat-completions streaming to tool_start_callback and tool_complete_callback, and queue custom hermes.tool.progress SSE payloads that include:

  • toolCallId
  • tool / name
  • status: running | completed
  • input / args
  • output / result for completed events

Keep the legacy tool_progress_callback as a start-only fallback for producers that do not provide exact call ids, and continue filtering internal tool names beginning with _.
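A minimal sketch of the wiring, under stated assumptions: the callback signatures are the ones quoted above, but make_lifecycle_callbacks, fake_run_agent and the use of a plain queue.Queue are hypothetical stand-ins, not the actual api_server.py code. The legacy tool_progress_callback fallback is left out because its signature is not given in this issue.

```python
import queue

EVENT_NAME = "hermes.tool.progress"


def make_lifecycle_callbacks(events: queue.Queue):
    """Return (on_tool_start, on_tool_complete) that enqueue progress payloads."""

    def _emit(tool_call_id, name, status, args, result=None):
        # Internal tools (names beginning with "_") stay filtered out.
        if name.startswith("_"):
            return
        payload = {
            "toolCallId": tool_call_id,
            "tool": name,
            "status": status,
            "input": args,
        }
        if status == "completed":
            payload["output"] = result
        events.put((EVENT_NAME, payload))

    def on_tool_start(tool_call_id, name, args):
        _emit(tool_call_id, name, "running", args)

    def on_tool_complete(tool_call_id, name, args, result):
        _emit(tool_call_id, name, "completed", args, result)

    return on_tool_start, on_tool_complete


def fake_run_agent(tool_start_callback, tool_complete_callback):
    """Hypothetical stand-in for _run_agent: one visible and one internal tool."""
    tool_start_callback("call_terminal_1", "terminal", {"command": "ls"})
    tool_complete_callback("call_terminal_1", "terminal", {"command": "ls"}, "README.md")
    tool_start_callback("call_internal_1", "_internal", {})
    tool_complete_callback("call_internal_1", "_internal", {}, "")


events = queue.Queue()
start_cb, complete_cb = make_lifecycle_callbacks(events)
fake_run_agent(start_cb, complete_cb)

drained = []
while not events.empty():
    drained.append(events.get_nowait())

for name, payload in drained:
    print(name, payload["toolCallId"], payload["status"])
```

Routing both callbacks through one emitter keeps the running and completed payloads structurally identical apart from status and output, so a client can match them on toolCallId alone.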

Verification

A PR with regression coverage will run:

scripts/run_tests.sh tests/gateway/test_api_server.py -k 'tool_progress or tool_completed_progress'

Metadata

Assignees

No one assigned

    Labels

    P2 (Medium: degraded but workaround exists), comp/gateway (Gateway runner, session dispatch, delivery), type/bug (Something isn't working)
