Skip to content

fix(agent): stream guardrail_halt explanation to clients (#30770)#30813

Open
xxxigm wants to merge 3 commits into
NousResearch:mainfrom
xxxigm:fix/30770-guardrail-halt-silent
Open

fix(agent): stream guardrail_halt explanation to clients (#30770)#30813
xxxigm wants to merge 3 commits into
NousResearch:mainfrom
xxxigm:fix/30770-guardrail-halt-silent

Conversation

@xxxigm

@xxxigm xxxigm commented May 23, 2026

Copy link
Copy Markdown
Contributor

What does this PR do?

Fixes #30770: when the tool-call loop guardrail fires (repeated_exact_failure_block, same_tool_failure_block, or idempotent_no_progress_block), the agent generates a short user-facing explanation via _toolguard_controlled_halt_response and appends it to the conversation as an assistant turn — but the synthesized text was never fanned out through the streaming callbacks. The Chat Completions SSE writer (Open WebUI, curl, OpenAI SDK) and TUI streaming display saw the role chunk, the finish chunk, and nothing between them; from the user's perspective the conversation died mid-thought. Interim-message platform adapters (Telegram, Discord) had the same gap.

This PR adds a single fan-out helper, wires it into the guardrail_halt branch, and pins the behaviour with a 16-test regression suite.

Related Issue

Closes #30770

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • run_agent.py — add _emit_synthesized_final_response(text) next to the existing _emit_interim_assistant_message / _fire_stream_delta helpers (+77):

    • Fires _fire_stream_delta so the gateway SSE queue and TUI streaming display receive the text exactly as if the model had produced it (re-uses the existing think-block / context-leak scrubbers so no partial-tag artefacts leak).
    • Closes stream_delta_callback with None afterwards so the TUI display closes its open response box before the finish chunk lands (matches the existing flush between tool-call iterations; the gateway SSE consumer filters None out of its queue so this is harmless there).
    • Fans out to interim_assistant_callback for platforms that consume structured messages instead of deltas (Telegram, Discord, Slack), with already_streamed=True so they dedupe correctly.
    • Preserves TTS _stream_callback end-of-stream semantics by NOT firing None on it — None is its sentinel and a later synthesized branch in the same turn would otherwise be muted.
    • Each downstream invocation is wrapped in try/except so a misbehaving consumer cannot prevent the turn from finishing.
  • agent/conversation_loop.py — call the helper inside the guardrail_halt branch (+27):

    • One call right after the existing _emit_status warning, wrapped in try/except so the halt text in final_response / messages is always preserved even if a downstream callback explodes.
    • Helper is intentionally not called on the normal text-response branch (the model already streamed its own deltas there; a re-emit would duplicate the visible answer for clients that don't dedupe).
  • tests/run_agent/test_guardrail_halt_emit_30770.py — 16 regression tests across three classes (+473):

    • TestEmitSynthesizedFinalResponseUnit (8 cases) — direct coverage of the helper: fires both stream + interim callbacks with the expected already_streamed flag, strips whitespace, skips empty / None / whitespace input, no-ops gracefully without callbacks, swallows callback exceptions on both channels, never closes the TTS _stream_callback.
    • TestGuardrailHaltStreamsToClient (6 cases) — end-to-end through run_conversation: synthesized halt is fanned out exactly once, persisted in messages for session resume, names both the tool and the guardrail code so users can correlate with logs, is a clean no-op when no callbacks registered, preserves the pre-existing _emit_status lifecycle warning, survives a helper that raises.
    • TestConversationLoopWiring (2 cases) — pins the call-site so future refactors fail noisily; also asserts the helper is NOT called for normal text responses.

How to Test

  1. Check out this branch and ensure .venv is set up: python3 -m venv .venv && source .venv/bin/activate && pip install -e ".[all,dev]"
  2. Run the new regression tests on their own:
    scripts/run_tests.sh tests/run_agent/test_guardrail_halt_emit_30770.py -v
    
    Expected: 16 passed.
  3. Run the broader guardrail + agent suite to confirm no cross-file regressions:
    scripts/run_tests.sh tests/run_agent/test_tool_call_guardrail_runtime.py \
        tests/run_agent/test_guardrail_halt_emit_30770.py \
        tests/agent/test_tool_guardrails.py \
        tests/run_agent/test_run_agent.py
    
    Expected: 376 passed.
  4. Confirm gateway SSE pipelines still pass (these are the primary consumers of the fan-out):
    scripts/run_tests.sh tests/gateway/test_stream_consumer.py \
        tests/gateway/test_api_server_runs.py \
        tests/gateway/test_api_server.py \
        tests/gateway/test_api_server_multimodal.py
    
    Expected: 283 passed.
  5. End-to-end manual verification of the original symptom:
    # Set up a hard-stop guardrail config:
    cat >> ~/.hermes/config.yaml <<YAML
    tool_loop_guardrails:
      warnings_enabled: true
      hard_stop_enabled: true
      hard_stop_after:
        exact_failure: 2
    YAML
    
    # Send a prompt that traps the model in a failure loop (e.g. "run pytest"
    # in an env without pytest installed) and watch the SSE stream:
    curl -N -X POST http://localhost:8000/v1/chat/completions \
         -H 'Content-Type: application/json' \
         -d '{"model":"hermes","stream":true,"messages":[
               {"role":"user","content":"write two pytest cases and run them"}]}'
    
    # Before the fix: stream ends with role + finish chunk, no content.
    # After the fix: a single delta.content chunk carries the halt explanation
    #   ("I stopped retrying terminal because it hit the tool-call guardrail
    #     (repeated_exact_failure_block) after 2 repeated non-progressing
    #     attempts. …") before the finish chunk lands.

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(agent): …, test(agent): …)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix (no unrelated commits)
  • I've run scripts/run_tests.sh tests/run_agent/test_guardrail_halt_emit_30770.py and all tests pass
  • I've added tests for my changes (16 new test cases across 3 classes)
  • I've tested on my platform: macOS 15.2 (Darwin 24.6.0), Python 3.12

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — N/A; the helper's docstring documents the contract, no user-facing surface changed
  • I've updated cli-config.yaml.example if I added/changed config keys — N/A (no new config keys)
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — pure Python, no platform-specific calls; tests are hermetic (no real network or filesystem)
  • I've updated tool descriptions/schemas if I changed tool behavior — N/A

Screenshots / Logs

$ scripts/run_tests.sh tests/run_agent/test_guardrail_halt_emit_30770.py -v
16 passed in 9.37s

$ scripts/run_tests.sh tests/run_agent/test_tool_call_guardrail_runtime.py \
    tests/run_agent/test_guardrail_halt_emit_30770.py \
    tests/agent/test_tool_guardrails.py tests/run_agent/test_run_agent.py
376 passed in 184.77s (0:03:04)

$ scripts/run_tests.sh tests/gateway/test_stream_consumer.py \
    tests/gateway/test_api_server_runs.py tests/gateway/test_api_server.py \
    tests/gateway/test_api_server_multimodal.py
283 passed in 10.37s

xxxigm added 3 commits May 23, 2026 14:48
…Research#30770)

Hermes synthesizes final assistant text in several turn-exit branches
that the LLM never streamed — guardrail_halt is the visible one, but
budget_exhausted, max_iterations, all_retries_exhausted_no_response,
and a handful of others share the same shape. Until now the
synthesized text landed in ``messages`` and in
``result["final_response"]`` only, which works for non-streaming
callers (CLI prints, ``/v1/chat/completions`` without ``stream=true``)
but starves every streaming consumer: the Chat Completions SSE writer
pulls from a queue populated by ``stream_delta_callback``, so without
an explicit fire it never receives a ``delta.content`` chunk between
the role chunk and the finish chunk — Open WebUI / curl / the OpenAI
SDK see an empty assistant message and the stream closes without
explanation.

Add ``_emit_synthesized_final_response(text)`` next to the existing
``_emit_interim_assistant_message`` and ``_fire_stream_delta`` helpers.
It is the single fan-out point a synthesized branch needs to call:

* fires ``_fire_stream_delta`` so the gateway SSE queue and TUI
  streaming display see the text exactly as if the model had produced
  it (re-using the existing think-block / context-leak scrubbers so
  no partial-tag artefacts leak),
* closes ``stream_delta_callback`` with ``None`` afterwards so the
  TUI display closes its open response box before the finish chunk
  lands (matches the existing flush between tool-call iterations;
  the gateway SSE consumer filters None out of its queue so this is
  harmless there),
* fans out to ``interim_assistant_callback`` for platforms that
  consume structured messages instead of deltas (Telegram, Discord),
  with ``already_streamed=True`` so they dedupe correctly,
* preserves the ``_stream_callback`` (TTS) end-of-stream semantics by
  NOT firing None on it — None is its sentinel and a later
  synthesized branch in the same turn would otherwise be muted.

Every callback invocation is wrapped in try/except so a misbehaving
downstream consumer cannot prevent the turn from finishing; the whole
point of the helper is to deliver SOMETHING when normal streaming has
already broken down.
…h#30770)

The tool-call loop guardrail (repeated_exact_failure_block,
same_tool_failure_block, idempotent_no_progress_block) generates a
short user-facing explanation via
``_toolguard_controlled_halt_response`` and appends it to the
conversation as an assistant turn. Before this change that text was
delivered only through the returned result dict, so streaming clients
got no SSE ``delta.content`` between the role chunk and the finish
chunk — the conversation appeared to die mid-thought from the user's
perspective.

Call the new ``_emit_synthesized_final_response`` helper inside the
``guardrail_halt`` branch so the synthesized explanation flows
through ``stream_delta_callback`` (Chat Completions SSE, TUI
streaming display) and ``interim_assistant_callback`` (Telegram,
Discord, Slack adapters) exactly as if the model itself had produced
it. Wrapped in try/except: the halt text is already safely captured
in ``final_response`` and the ``messages`` history, so even a broken
downstream callback must not turn a controlled guardrail halt back
into a silent failure.

The existing ``_emit_status`` lifecycle warning is intentionally
preserved — it still drives the gateway dashboard / TUI warning band
output that operators rely on for triage.
…rch#30770)

16 tests in three classes:

* ``TestEmitSynthesizedFinalResponseUnit`` — direct coverage of the
  helper: it fires both ``stream_delta_callback`` and
  ``interim_assistant_callback`` with the expected ``already_streamed``
  flag, strips whitespace, skips empty / whitespace / None inputs,
  no-ops gracefully when no callbacks are registered, and swallows
  callback exceptions on both channels. Pins that the ``None``
  end-of-stream flush goes to ``stream_delta_callback`` only — NOT
  the TTS ``_stream_callback`` (where ``None`` is its sentinel).

* ``TestGuardrailHaltStreamsToClient`` — end-to-end via
  ``run_conversation`` with a hard-stop guardrail config: the
  synthesized halt text is fanned out through the helper exactly
  once, persisted in the assistant turn of ``messages`` for session
  resume, names both the tool and the guardrail code so users can
  correlate with logs, is a clean no-op when no streaming callbacks
  are registered, leaves the pre-existing ``_emit_status`` lifecycle
  warning untouched, and survives a fan-out helper that itself
  raises (the result dict + messages history are still well-formed).

* ``TestConversationLoopWiring`` — pins the conversation-loop call
  site so future refactors that split or rename the guardrail_halt
  branch fail noisily instead of silently re-introducing the silent-
  stream symptom; also asserts the helper is NOT called for normal
  text responses (where the model already streamed its own deltas
  and a re-emit would duplicate the visible answer for clients that
  don't dedupe).
@alt-glitch alt-glitch added type/bug Something isn't working comp/agent Core agent loop, run_agent.py, prompt builder comp/gateway Gateway runner, session dispatch, delivery P1 High — major feature broken, no workaround labels May 23, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Competing fix: PRs #30807 and #30808 also address #30770. This is the most comprehensive of the three — covers all consumer channels (stream delta, flush, interim message) with a reusable _emit_synthesized_final_response helper + 16-test regression suite.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder comp/gateway Gateway runner, session dispatch, delivery P1 High — major feature broken, no workaround type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

guardrail_halt exits silently — no final assistant message delivered to client

2 participants