fix(api-server): emit tool.completed lifecycle SSE for chat completions (#16588)#17505
Merged
Conversation
Address Copilot review on PR #16666: 1. **Duplicate event on every tool start** — both ``tool_progress_callback`` and ``tool_start_callback`` fire side-by-side in ``run_agent.py``, so wiring both into chat completions emitted *two* ``hermes.tool.progress`` events per real tool call. Drop the legacy ``_on_tool_progress`` emit entirely; ``_on_tool_start`` now produces a single unified event that carries the legacy ``tool``/``emoji``/``label`` fields plus the new ``toolCallId``/``status`` correlation fields. Label is computed inline via ``build_tool_preview`` so callers do not need to pre-format it. 2. **Weak per-event correlation in the regression test** — the previous assertion checked that a ``toolCallId`` appeared *somewhere* in the aggregate, which would have passed even if ``running`` lacked the id. Collect ``(status, toolCallId)`` per event and assert each event carries the correct pair, plus exactly two events on the wire (no silent duplication regression). The two existing chat-completions tool-progress tests are updated to fire ``tool_start_callback`` instead of ``tool_progress_callback``, matching production reality where ``run_agent`` always pairs them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
🚨 CRITICAL Supply Chain Risk DetectedThis PR contains a pattern that has been used in real supply chain attacks. A maintainer must review the flagged code carefully before merging. 🚨 CRITICAL: Install-hook file added or modifiedThese files can execute code during package installation or interpreter startup. Files: Scanner only fires on high-signal indicators: .pth files, base64+exec/eval combos, subprocess with encoded commands, or install-hook files. Low-signal warnings were removed intentionally — if you're seeing this comment, the finding is worth inspecting. |
4 tasks
13 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Salvage of #16666 by @briandevans onto current main.
Fixes #16588 —
/v1/chat/completionsstreaming emitsevent: hermes.tool.progresswithstatus: runningfor tool starts, and a matchingstatus: completedevent carrying the sametoolCallId. Frontends rendering tool cards can now mark tools finished without guessing.Why this over #16591
run_agent.pyfires BOTHtool_progress_callback("tool.started", ...)ANDtool_start_callback(id, name, args)for every tool start. #16591 wired both on the chat branch without dedup, producing 3 SSE events per tool (legacyrunningwith null toolCallId + structuredrunning+completed). Verified empirically by cherry-picking #16591 and probing with both callbacks firing — got('running', None), ('running', 'call_x'), ('completed', 'call_x').This PR (from #16666) drops
tool_progress_callbackentirely on the chat branch, wires only the structured pair, and tracks_started_tool_call_idsso orphan completes (internal tools, never-started IDs) are silently dropped instead of reaching the wire.Changes
gateway/platforms/api_server.py: replace_on_tool_progresswith_on_tool_start/_on_tool_completeon the chat stream;tool_progress_callbackintentionally not wired (comment explains why); payload now carriestoolCallId+statuson the existinghermes.tool.progressevent.tests/gateway/test_api_server.py: updated existing tool-progress tests for the new callback shape; addedtest_stream_emits_tool_lifecycle_with_call_id(asserts exactly 2 per-event correlated pairs — would fail under fix(api-server): emit chat tool completion progress #16591's dup-emit) andtest_stream_tool_lifecycle_skips_internal_and_orphan_completes.Validation
scripts/run_tests.sh tests/gateway/test_api_server.py→ 123/123 passing.Closes #16591 (superseded).