Bug Description
/v1/chat/completions streaming emits custom hermes.tool.progress SSE events when a tool starts, but it never emits a matching completed progress event carrying the exact tool call id.
This leaves API-server clients that render tool lifecycle UI stuck in a running state until the full assistant response ends or until the client guesses completion locally. The issue is easy to miss on IM-style platforms because many adapters render transient progress messages differently, but API-server SSE consumers need explicit lifecycle events to update tool cards correctly.
Steps to Reproduce
- Start the API server gateway.
- Send a streaming /v1/chat/completions request that causes a tool call.
- Inspect the SSE stream.
- Observe event: hermes.tool.progress for the tool start, then text deltas and the final response, but no status: completed event tied to the same toolCallId.
A minimal regression test can patch _run_agent to call tool_start_callback("call_terminal_1", "terminal", {...}), then tool_complete_callback("call_terminal_1", "terminal", {...}, result), and assert both custom SSE events are emitted.
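The shape of that regression test can be sketched in isolation. This is a minimal sketch, not the project's actual test: fake_run_agent stands in for the patched _run_agent, collect_progress is a hypothetical stand-in for the streaming branch, and the args dictionary is a placeholder for the elided tool arguments.

```python
def fake_run_agent(tool_start_callback, tool_complete_callback):
    """Stand-in for the patched _run_agent: fires one tool lifecycle."""
    args = {"command": "echo hi"}  # placeholder arguments, not from the issue
    tool_start_callback("call_terminal_1", "terminal", args)
    tool_complete_callback("call_terminal_1", "terminal", args, "hi")


def collect_progress(run_agent):
    """Hypothetical stand-in for the streaming branch: records progress payloads."""
    events = []

    def on_start(tool_call_id, name, args):
        events.append({"toolCallId": tool_call_id, "tool": name, "status": "running"})

    def on_complete(tool_call_id, name, args, result):
        events.append({"toolCallId": tool_call_id, "tool": name, "status": "completed"})

    run_agent(tool_start_callback=on_start, tool_complete_callback=on_complete)
    return events


def test_tool_completed_progress():
    events = collect_progress(fake_run_agent)
    statuses = [(e["toolCallId"], e["status"]) for e in events]
    # Both lifecycle events must carry the same call id, in order.
    assert statuses == [
        ("call_terminal_1", "running"),
        ("call_terminal_1", "completed"),
    ]
```

The real test would assert on the SSE frames written by the API server rather than on an in-memory list, but the correlation check on toolCallId is the same.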
Expected Behavior
For each non-internal tool call in chat-completions streaming, the API server should emit lifecycle progress events with stable correlation:
- event: hermes.tool.progress with status: running and toolCallId
- event: hermes.tool.progress with status: completed and the same toolCallId
The events should remain outside normal delta.content so clients do not persist progress markers into conversation history.
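To illustrate why the correlation matters on the client side, here is a rough sketch of an SSE consumer that updates tool cards from the custom events and builds conversation text only from delta.content. The sample stream and the exact payload fields (tool, status) are assumptions made for illustration; parse_sse and apply_events are hypothetical helper names.

```python
import json

# Assumed sample stream; real payloads may carry more fields.
SAMPLE_STREAM = (
    'event: hermes.tool.progress\n'
    'data: {"toolCallId": "call_terminal_1", "tool": "terminal", "status": "running"}\n\n'
    'event: hermes.tool.progress\n'
    'data: {"toolCallId": "call_terminal_1", "tool": "terminal", "status": "completed"}\n\n'
    'data: {"choices": [{"delta": {"content": "Done."}}]}\n\n'
    'data: [DONE]\n\n'
)


def parse_sse(stream_text):
    """Split raw SSE text into (event_name, data) pairs."""
    events = []
    for frame in stream_text.split("\n\n"):
        name, data_lines = "message", []  # "message" is the SSE default event type
        for line in frame.splitlines():
            if line.startswith("event:"):
                name = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data_lines.append(line[len("data:"):].strip())
        if data_lines:
            events.append((name, "\n".join(data_lines)))
    return events


def apply_events(events):
    """Track tool card status by toolCallId; keep progress out of the text."""
    cards, text_parts = {}, []
    for name, data in events:
        if data == "[DONE]":
            continue
        payload = json.loads(data)
        if name == "hermes.tool.progress":
            cards[payload["toolCallId"]] = payload["status"]
        else:
            delta = payload["choices"][0]["delta"]
            text_parts.append(delta.get("content") or "")
    return cards, "".join(text_parts)
```

With the completed event present, the card for call_terminal_1 ends in the completed state without the client having to guess, and the persisted text contains no progress markers.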
Actual Behavior
The chat-completions streaming branch only wires the legacy tool_progress_callback, which receives start-style events without exact call-id correlation. It does not pass tool_start_callback / tool_complete_callback into _run_agent, even though the Responses API streaming branch already uses those callbacks for structured function call lifecycle output.
Root Cause Analysis
In gateway/platforms/api_server.py, the /v1/chat/completions streaming branch defines _on_tool_progress() and passes only tool_progress_callback=_on_tool_progress to _run_agent().
AIAgent already emits exact lifecycle callbacks through tool_start_callback(tool_call_id, name, args) and tool_complete_callback(tool_call_id, name, args, result) in the tool execution paths. Those callbacks are available in the API server adapter and are used by the /v1/responses streaming path, but chat completions does not currently wire them.
Proposed Fix
Wire chat-completions streaming to tool_start_callback and tool_complete_callback, and queue custom hermes.tool.progress SSE payloads that include:
- toolCallId
- tool / name
- status: running | completed
- input / args
- output / result for completed events
Keep the legacy tool_progress_callback as a start-only fallback for producers that do not provide exact call ids, and continue filtering internal tool names beginning with _.
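A minimal sketch of the callback pair, assuming the streaming branch hands frames to the response writer through a queue. The factory name make_progress_callbacks and the choice of field names tool, input, and output (over name, args, result) are illustrative assumptions; the callback signatures follow the ones described above. The legacy tool_progress_callback fallback is left out because its signature is not shown in this issue.

```python
import json
import queue


def make_progress_callbacks(event_queue):
    """Build start/complete callbacks that enqueue hermes.tool.progress frames."""

    def _emit(payload):
        # Queue an (event name, JSON data) pair for the SSE writer to send.
        event_queue.put(("hermes.tool.progress", json.dumps(payload)))

    def on_start(tool_call_id, name, args):
        if name.startswith("_"):  # internal tools stay hidden from clients
            return
        _emit({
            "toolCallId": tool_call_id,
            "tool": name,
            "status": "running",
            "input": args,
        })

    def on_complete(tool_call_id, name, args, result):
        if name.startswith("_"):
            return
        _emit({
            "toolCallId": tool_call_id,
            "tool": name,
            "status": "completed",
            "input": args,
            "output": result,
        })

    return on_start, on_complete
```

The two callbacks would then be passed to _run_agent as tool_start_callback and tool_complete_callback alongside the existing tool_progress_callback.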
Verification
A PR with regression coverage will run:
scripts/run_tests.sh tests/gateway/test_api_server.py -k 'tool_progress or tool_completed_progress'