fix(agent,gateway): voice interrupts + cascading interrupt hang by kristianvast · Pull Request #6600 · NousResearch/hermes-agent

kristianvast · 2026-04-09T12:41:47Z

Summary

Two related bug fixes for voice-message and streaming-interrupt handling. They're complementary — together they make mid-task voice messages behave identically to mid-task text messages, and eliminate a multi-minute hang that could follow rapid consecutive interrupts.

Problem 1: Voice messages during active agent runs are silently dropped

When a voice message arrives while the agent is busy on the same session, it hits the interrupt path with event.text == "" because STT only happens at _enrich_message_with_transcription after the running-agent guard. The voice is effectively lost: the agent sees an empty interrupt, and nothing gets echoed back to the user. This makes mid-task voice messaging unusable in the gateway.

Related upstream context: #6548 (Discord transcription), 2508098 (Discord placeholder strip), 6e02fa7 (Discord empty placeholder) — all improve fresh-voice paths, but none touch the running-agent interrupt path.

Fix

Two touch points in gateway/run.py:

1. _enrich_message_with_transcription now returns a (text, transcripts) tuple so callers can echo raw transcripts back to the user before feeding them to the agent. The fresh-message dispatch at _handle_message echoes each transcript as 🎙️ "..." immediately, giving the user visible confirmation of STT quality in real time (matches how Heimdal already feels on fresh messages for users who had this as a local patch).
2. Interrupt path: the async monitor_for_interrupt task now transcribes audio media before calling agent.interrupt(), and a new _dequeue_pending_with_transcription helper drives the post-agent drain the same way. Result: voice interrupts reach the running agent with the real transcript, not a placeholder or empty string.

Same 🎙️ echo format for both fresh and interrupt paths — voice interrupts now feel identical to text interrupts from the user's side.
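The ordering described above (transcribe first, echo, then signal the agent) can be sketched as follows. This is a minimal illustration, not the gateway's code: `Event`, `fake_stt`, `enrich_with_transcription`, and `handle_interrupt` are hypothetical stand-ins, and only the `(text, transcripts)` return shape and the 🎙️ echo format come from the description.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Event:
    """Hypothetical stand-in for a gateway message event."""
    text: str = ""
    audio_paths: list[str] = field(default_factory=list)


async def fake_stt(path: str) -> str:
    """Placeholder transcriber; a real gateway would call its STT backend."""
    await asyncio.sleep(0)
    return f"transcript of {path}"


async def enrich_with_transcription(event: Event) -> tuple[str, list[str]]:
    """Return (text, transcripts) so the caller can echo each transcript."""
    transcripts = [await fake_stt(p) for p in event.audio_paths]
    parts = [event.text, *transcripts]
    text = "\n".join(p for p in parts if p)
    return text, transcripts


async def handle_interrupt(event: Event, send, interrupt) -> None:
    """Transcribe first, echo to the user, then signal the running agent."""
    text, transcripts = await enrich_with_transcription(event)
    for transcript in transcripts:
        await send(f'🎙️ "{transcript}"')
    interrupt(text)  # the agent receives real text, never an empty string


def demo() -> tuple[list[str], list[str]]:
    echoed: list[str] = []
    interrupts: list[str] = []

    async def send(msg: str) -> None:
        echoed.append(msg)

    event = Event(audio_paths=["a.ogg"])  # voice message with no typed text
    asyncio.run(handle_interrupt(event, send, interrupts.append))
    return echoed, interrupts
```

Calling `demo()` shows the interrupt carrying the transcript rather than an empty string, with one echo line sent back to the user.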

Problem 2: Cascading interrupt hang (7+ min "musing…" after rapid interrupts)

When agent.interrupt() fires during an active LLM streaming call, the main thread intentionally force-closes the worker-local httpx client to stop token generation. The comment at _interruptible_streaming_api_call spells this out: "Force-close the in-flight worker-local HTTP connection to stop token generation without poisoning the shared client used to seed future retries." The resulting RemoteProtocolError on the daemon worker thread is the expected consequence, not a network bug.

But the streaming retry loop inside _call() treated this as a transient connection error and retried it, with each retry stalling for the full HERMES_STREAM_STALE_TIMEOUT (180s base, up to 300s for large contexts per the scaling at line ~4650). With 2 retries, that's ~6 minutes of "Reconnecting… (attempt N/M)" status spam, then a fallback to the non-streaming path that also fails the same way, then eventually a legitimate InterruptedError delivered to the caller.
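The "~6 minutes" figure follows from simple arithmetic on the numbers quoted above. The helper names below are hypothetical, and the sketch counts only the streaming retries, not the non-streaming fallback that follows them.

```python
def worst_case_stall_seconds(retries: int, stale_timeout_s: int) -> int:
    """Each doomed retry waits out the full stale-stream timeout."""
    return retries * stale_timeout_s


def as_minutes(seconds: int) -> float:
    return seconds / 60


base_case = worst_case_stall_seconds(2, 180)   # base timeout
large_case = worst_case_stall_seconds(2, 300)  # scaled timeout for large contexts
```

With the base timeout this gives 360 seconds (6 minutes); with the scaled 300s timeout it gives 600 seconds (10 minutes).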

The cached-agent twist: the gateway caches AIAgent instances per session (_agent_cache at gateway/run.py line ~6327). When the main thread raises InterruptedError and the turn ends, the daemon worker from the interrupted turn is not joined, so it keeps running in the background. When the next turn starts on the same cached agent, the stale worker is still retrying, still emitting "Reconnecting…" status, still touching shared client state, and races the new turn.

Observed in production on 2026-04-09: a voice interrupt at 13:00:12, then "musing…" for 7 minutes 42 seconds before another user-supplied interrupt at 13:07:54 finally unstuck the agent. Cascading rapid interrupts reproduced the pattern reliably.

Fix — request-local cancellation token

Add a _request_cancelled = {"value": False} dict scoped to each call of _interruptible_api_call and _interruptible_streaming_api_call. The outer poll loop sets it to True before force-closing the httpx client on interrupt. The worker's retry loop checks it at four decision points and exits cleanly if set:

1. Top of the retry loop, before each new _stream_attempt, so rapid cascading interrupts don't waste a fresh request.
2. Inside the except Exception as e: handler, before classifying the error as transient, so the forced RemoteProtocolError is recognized as a cancel, not retried.
3. Before emitting "Reconnecting…" status and retrying, which prevents user-facing noise that implies a provider outage.
4. Before falling back to the non-streaming path, which prevents a second doomed request.

Same pattern applied to the non-streaming _interruptible_api_call: the worker's exception handler checks the cancel token and returns without surfacing the caught error, so the main thread's InterruptedError is the only thing callers see.
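The pattern can be sketched with plain threading, as below. This is an illustration under stated assumptions rather than the code in run_agent.py: `interruptible_call`, `blocking_attempt`, `FakeTransportError`, and the `stats` dict are hypothetical, a `threading.Event` stands in for force-closing the httpx client, and only the `{"value": False}` token and the four check points come from the description.

```python
import threading
import time


class FakeTransportError(Exception):
    """Stand-in for the transport error raised when the client is closed."""


def interruptible_call(attempt, interrupt, stats, max_retries=2, poll_s=0.01):
    """Run `attempt` on a daemon worker; stop retrying once this call is cancelled."""
    request_cancelled = {"value": False}  # token scoped to this one call
    closed = threading.Event()            # models the force-closed client
    outcome = {"response": None, "error": None}

    def worker():
        for attempt_no in range(max_retries + 1):
            if request_cancelled["value"]:      # check 1: top of the retry loop
                return
            stats["attempts"] += 1
            try:
                outcome["response"] = attempt(closed)
                outcome["error"] = None
                return
            except Exception as exc:
                if request_cancelled["value"]:  # check 2: our own close, not transient
                    return
                outcome["error"] = exc
            if attempt_no < max_retries:
                if request_cancelled["value"]:  # check 3: no reconnect status
                    return
                stats["reconnect_notices"] += 1
        if request_cancelled["value"]:          # check 4: no fallback request
            return
        # A non-streaming fallback would be attempted here.

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    while thread.is_alive():
        if interrupt.is_set():
            request_cancelled["value"] = True   # set before the force-close
            closed.set()
            raise InterruptedError("request cancelled by interrupt")
        time.sleep(poll_s)
    if outcome["error"] is not None:
        raise outcome["error"]
    return outcome["response"]


def blocking_attempt(closed):
    """Simulated streaming call that fails only when the client is closed."""
    closed.wait(timeout=5)
    raise FakeTransportError("connection closed")


def demo():
    interrupt = threading.Event()
    stats = {"attempts": 0, "reconnect_notices": 0}
    threading.Timer(0.05, interrupt.set).start()
    try:
        interruptible_call(blocking_attempt, interrupt, stats)
        result = "completed"
    except InterruptedError:
        result = "interrupted"
    time.sleep(0.1)  # give the stale worker time to (not) retry
    return result, stats
```

In `demo()`, the worker is not joined after the interrupt, mirroring the stale-worker situation. It still makes exactly one attempt and emits no reconnect notice, because the token was set before the simulated close.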

Why a request-local token instead of `self._interrupt_requested`

self._interrupt_requested is cleared at run_conversation() turn boundaries (clear_interrupt() calls at line ~6899 and ~8883). A stale daemon worker from the previous turn can't reliably observe it — by the time the worker checks, the flag may already be False for the next turn. A token scoped to the specific request survives the turn boundary and unambiguously marks this request as cancelled regardless of what happens to the agent's global state.
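A small sequential simulation makes the difference concrete. The `Agent` class and `simulate_turn_boundary` are hypothetical; the sketch only assumes what the paragraph above states, namely that the shared flag is reset at the turn boundary while the per-request token is never reset.

```python
class Agent:
    """Hypothetical cached agent with a shared interrupt flag."""

    def __init__(self):
        self.interrupt_requested = False

    def interrupt(self):
        self.interrupt_requested = True

    def clear_interrupt(self):
        self.interrupt_requested = False


def simulate_turn_boundary():
    agent = Agent()
    request_cancelled = {"value": False}  # belongs to turn A's request only

    # Turn A is interrupted: both signals are raised.
    agent.interrupt()
    request_cancelled["value"] = True

    # Turn B begins on the same cached agent and resets the shared flag.
    agent.clear_interrupt()

    # A stale worker from turn A wakes up only now and inspects both signals.
    return agent.interrupt_requested, request_cancelled["value"]
```

The stale worker sees the shared flag as False but its own token as True, so only the token tells it that its request was cancelled.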

Explicitly not done (deliberate)

- Not adding self.clear_interrupt() to the InterruptedError handler at line ~7529. It's already cleared at run_conversation entry/exit, and clearing it early risks wiping _interrupt_message before result assembly at lines ~8878-8880.
- Not calling self._replace_primary_openai_client() in any interrupt path. That would break the worker-local client isolation design intent ("without poisoning the shared client used to seed future retries") and add churn to cross-turn races on the cached agent.
- Not adding a max wall-clock timeout to the retry loop. HERMES_AGENT_TIMEOUT already bounds the whole turn; adding a narrower retry budget deserves its own PR if desired.

Design validated by oracle consultation before implementation.

How to test

Automated

New regression tests in tests/test_cascading_interrupt.py with 5 scenarios:

- test_interrupt_during_stream_does_not_retry: interrupt mid-stream; assert create() called exactly once, no "Reconnecting…" status emitted, elapsed < 2s.
- test_cached_agent_after_interrupt_second_turn_clean: interrupt turn A, immediately start turn B on the same agent instance; assert turn B succeeds quickly with no stale-worker contamination.
- test_interrupt_during_non_streaming_does_not_leak_error: same guarantee for the non-streaming path.
- test_logged_as_cancellation_not_reconnect: verify the cancel-path debug log fires and no "Streaming attempt N/M failed" log appears after the interrupt.
- test_normal_transient_error_still_retries: regression guard; a genuine RemoteProtocolError (no interrupt) still triggers the retry path so real network errors continue to recover.
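The shape of the first and last scenarios might look roughly like the sketch below. It does not reproduce tests/test_cascading_interrupt.py: `FlakyClient`, `run_with_retries`, and `FakeTransportError` are hypothetical stand-ins, and the interrupt is simulated by setting the token from a failure hook instead of from a second thread.

```python
class FakeTransportError(Exception):
    """Stand-in for a dropped-connection error."""


class FlakyClient:
    """Fails the first `failures` calls; optionally runs a hook on each failure."""

    def __init__(self, failures, on_fail=None):
        self.failures = failures
        self.on_fail = on_fail
        self.calls = 0

    def create(self):
        self.calls += 1
        if self.calls <= self.failures:
            if self.on_fail is not None:
                self.on_fail()
            raise FakeTransportError("connection dropped")
        return "ok"


def run_with_retries(create, cancelled, max_retries=2):
    """Retry transient errors unless the request-local token is set."""
    statuses = []
    for attempt_no in range(max_retries + 1):
        if cancelled["value"]:
            break
        try:
            return create(), statuses
        except FakeTransportError:
            if cancelled["value"]:
                break
            if attempt_no < max_retries:
                statuses.append(
                    f"Reconnecting… (attempt {attempt_no + 1}/{max_retries})"
                )
    return None, statuses


def test_interrupt_does_not_retry():
    cancelled = {"value": False}
    # The hook simulates the poll loop setting the token just before the close.
    client = FlakyClient(failures=3, on_fail=lambda: cancelled.update(value=True))
    response, statuses = run_with_retries(client.create, cancelled)
    assert response is None
    assert client.calls == 1
    assert statuses == []


def test_transient_error_still_retries():
    cancelled = {"value": False}
    client = FlakyClient(failures=1)
    response, statuses = run_with_retries(client.create, cancelled)
    assert response == "ok"
    assert client.calls == 2
    assert len(statuses) == 1
```

The second test is the important guard: the cancellation check must not swallow genuine transport errors, which should still be retried.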

Run:

pytest tests/test_cascading_interrupt.py -v

All 5 pass in ~3s. The broader related suite (tests/test_interrupt_propagation.py, tests/test_interactive_interrupt.py, tests/test_streaming.py, tests/test_run_agent.py, tests/tools/test_interrupt.py, tests/gateway/test_stt_config.py) — 104 tests — also all pass with no regressions after this change.

Manual (gateway)

1. Start a long-running task: "count to 300 slowly" or similar.
2. While the agent is mid-response, send a voice message on the same platform (Telegram, Discord). Expect a 🎙️ "…" echo within ~1 second, followed by the agent pivoting to the transcribed instruction.
3. Send a second voice message while the agent is mid-response to the first interrupt. Expect the same 🎙️ echo and a clean handoff: no "Reconnecting… (attempt N/M)" messages, no multi-minute hang.
4. Verify the gateway logs contain "Force-closing httpx client due to interrupt (not a network error)" when interrupts fire (diagnostic log, debug level).

Platforms tested

Linux (Arch, kernel 6.18), Python 3.11.15
Telegram gateway adapter (primary reproduction)
Not tested on Windows / macOS (no platform-specific code changed; all modifications are in Python stdlib threading + httpx logic)

File surface

| File | Kind | Lines |
|---|---|---|
| `gateway/run.py` | Voice-interrupt transcribe + echo paths | +143 |
| `run_agent.py` | Request-local cancel token + checks in both API call paths | +82 |
| `tests/test_cascading_interrupt.py` | New, 5 regression tests | +279 |
| `tests/gateway/test_stt_config.py` | Update callers for new tuple return type | +6 / -4 |

All changes are additive where possible. No refactoring, no reformatting, no behavior change to paths that don't involve cancellation.

Related upstream work

- 25080986 fix(gateway): discard empty placeholder when voice transcription succeeds (Discord). Complementary; targets _enrich_message_with_transcription for a different duplication issue on fresh messages.
- ae4a884e fix(agent): disable stale stream timeout for local providers (#6368). Adjacent to my changes in _interruptible_streaming_api_call but outside my modified region. Rebased cleanly.
- 1a3ae6ac feat: structured API error classification for smart failover (#6514). Separate retry-logic improvement; does not overlap.

License

By submitting this PR I agree my contributions are licensed under MIT per CONTRIBUTING.md.

When a voice/audio message arrived while an agent was running, it hit the interrupt path with event.text == "" because STT only happened at line ~2601, *after* the running-agent guard. The voice was silently dropped: the agent saw an empty interrupt and the user never heard back.

Fix in two places:

1. Fresh-message path (_enrich_message_with_transcription, line 5659): now returns a (text, transcripts) tuple so callers can echo raw transcripts back to the user before the agent loop starts. The fresh-message dispatch at line 2614 echoes each transcript as 🎙️ "..." immediately, so the user sees STT quality in real time.
2. Interrupt path (gateway's monitor_for_interrupt async task, line ~6710, and the post-agent drain via the new _dequeue_pending_with_transcription helper at line 5750): transcribe audio media BEFORE calling agent.interrupt(), so the running agent gets the real transcript instead of an empty string or a file-path placeholder. Same 🎙️ echo format as fresh voice messages, so voice interrupts now feel identical to text interrupts.

Tests updated to match the new tuple return type of _enrich_message_with_transcription.

…duced closes

## The bug

When agent.interrupt() fires during an active LLM API call, the main thread intentionally force-closes the worker-local httpx client to stop token generation (comment at _interruptible_api_call line 4128 and _interruptible_streaming_api_call line 4662). This raises RemoteProtocolError on the daemon worker thread's chat.completions.create() call, which is the expected consequence, NOT a network bug.

The streaming retry loop (_call() at line ~4486) treated this as a transient connection error and retried it, logging "⚠️ Connection to provider dropped (RemoteProtocolError). Reconnecting… (attempt N/M)". Each doomed retry stalled for the full HERMES_STREAM_STALE_TIMEOUT (default 180s, scaled up to 300s for large contexts), producing a multi-minute hang of the "musing..." spinner after rapid interrupts.

Worse, because the gateway caches AIAgent instances per session (gateway/run.py:6327-6371), the stale daemon worker outlives the interrupted turn. When the cached agent starts its next turn, the stale worker from the previous turn is still running (retrying, falling back, emitting status messages) and races the new turn on shared client state. This was the root cause of the 7-minute cascading interrupt hang observed in the wild (13:00:12 → 13:07:54 on 2026-04-09).

## The fix

Add a request-local cancellation token (_request_cancelled dict) inside both _interruptible_api_call and _interruptible_streaming_api_call. The main thread sets it to True when it observes self._interrupt_requested and is about to force-close the client. The worker's retry loop checks the token at four decision points and exits cleanly on cancellation:

1. Top of the retry loop (before each new _stream_attempt), so rapid cascading interrupts don't waste a fresh request.
2. Immediately inside the "except Exception as e" block, so the forced RemoteProtocolError is recognized as a cancel, not a transient error.
3. Before emitting the "Reconnecting…" status and retrying, which prevents user-facing noise that implies a real network outage.
4. Before falling back to the non-streaming _interruptible_api_call, which prevents a second doomed request.

Same pattern applied to the non-streaming path: the worker's exception handler checks the cancel token and returns without surfacing the caught error, so the main thread's InterruptedError is the only thing callers see.

Why a request-local token instead of self._interrupt_requested: self._interrupt_requested is cleared at run_conversation() turn boundaries (lines 6899 and 8883), so a stale daemon worker from the previous turn can't reliably observe it; by the time the worker checks, the flag may already be False. A token scoped to the specific request survives the turn boundary and unambiguously marks THIS request as cancelled regardless of what happens to the agent's global state.

Also adds explicit "Force-closing httpx client due to interrupt (not a network error)" debug log so future debuggers don't misread the subsequent RemoteProtocolError cascade as a provider outage.

## Tests

New tests/test_cascading_interrupt.py with 5 regression tests:

- test_interrupt_during_stream_does_not_retry: interrupt mid-stream, assert create() called exactly once, no "Reconnecting…" status, <2s.
- test_cached_agent_after_interrupt_second_turn_clean: interrupt turn A, immediately start turn B on the SAME agent, assert turn B succeeds quickly with no stale-worker contamination.
- test_interrupt_during_non_streaming_does_not_leak_error: same guarantee for the non-streaming path.
- test_logged_as_cancellation_not_reconnect: verify the cancel-path debug log fires and the "Streaming attempt N/M failed" log does NOT.
- test_normal_transient_error_still_retries: regression guard; real RemoteProtocolError (no interrupt) still triggers the retry path.

All 5 pass in 3.08s. No regressions in the existing 257 interrupt / streaming / run_agent / gateway tests.

Design validated by oracle consultation (session ses_28e0015f0ffeDet3wmOv9s550z).

alt-glitch · 2026-04-30T00:05:39Z

Related to #8434 (voice messages interrupting with empty text) — this PR provides a more comprehensive fix covering both the empty-text and cascading hang issues.

alt-glitch · 2026-04-30T00:06:09Z

Related to #8434 — this PR provides a more comprehensive fix covering both the empty-text and cascading hang issues.

@kristianvast

…-interrupt hang)

When agent.interrupt() fires during an active LLM call, the main poll loop force-closes the worker-local httpx client to stop token generation. That raises a transport error (RemoteProtocolError) on the worker thread: the EXPECTED consequence of our own close, not a network bug. The streaming retry loop misclassified it as a transient connection error and retried; each doomed retry stalled for the full stream-stale timeout (up to 300s). Because the gateway caches AIAgent instances per session, the stale worker outlived the interrupted turn and raced the next turn's request on shared client state, which was the root of the multi-minute cascading-interrupt hang reported in the wild.

Fix: a request-local _request_cancelled token set by the poll loop right before the force-close, in both interruptible_api_call (non-streaming) and interruptible_streaming_api_call. The worker's exception handler checks the token and exits cleanly (no retry, no fallback, no 'reconnecting' status) instead of treating the forced error as transient. The token is request-local (not agent._interrupt_requested, which is cleared at turn boundaries) so a stale worker outliving its turn still recognizes its own forced close.

Original diagnosis and fix by @kristianvast (PR #6600), against the then-inline methods in run_agent.py. Those were since extracted into agent/chat_completion_helpers.py, so the fix is reapplied there.

Co-authored-by: Kristian Vastveit <kristianvast@users.noreply.github.com>

teknium1 · 2026-06-08T09:19:33Z

Merged via #41952.

Your diagnosis and fix were spot-on. Since this PR was opened, the _interruptible_api_call / _interruptible_streaming_api_call methods were extracted from run_agent.py into agent/chat_completion_helpers.py, so a clean cherry-pick wasn't possible — the request-local _request_cancelled token fix was reapplied at the new location with the same design, and your authorship is preserved via Co-authored-by in the merged commit.

The voice-transcription-during-active-run portion of this PR (commit 2) was left out of this salvage to keep it focused on the cascading-interrupt fix; that gateway-side change is being tracked separately. Thanks for the thorough root-cause writeup.

@kristianvast

Salvaged from NousResearch#6600 (@kristianvast), re-scoped to the voice half only and rebased onto current main. The cascading-interrupt hang half of the original PR landed independently in dd0d122, so this carries ONLY Problem 1.

When a voice/audio message arrived while the agent was busy on the same session, it hit the interrupt path with empty text because STT only ran after the running-agent guard, so the voice was effectively lost. Now we transcribe audio BEFORE signaling the agent (and on the fresh-message path), echo the raw transcript back to the user (🎙️), and _enrich_message_with_transcription returns (text, transcripts) so callers can echo. A new _dequeue_pending_with_transcription drives the post-agent drain the same way.

Reapplied onto _prepare_inbound_message_text (inbound enrichment was extracted from the inline dispatch block since the original PR).

Co-authored-by: Kristian Vastveit <kristian@agrointel.no>

…ming-worker fix(gateway): transcribe voice messages during active agent runs (salvage #6600, voice half)


… fork consolidation; finish fork-feature ports

Per-cluster restoration with the test suite as the oracle, after comparing the merged tree's failures against a pristine-upstream run in the same environment (14 file-level deltas, now zero):

- gateway/run.py: upstream wholesale (fork's monolith had undone the mixin decomposition; both real fork deltas re-applied: voice_ack_callback **kwargs; the custom-providers context-length fix exists upstream).
- agent/conversation_loop.py + turn_context.py: upstream structure with the fork features regrafted at their new homes: sender_device attribution (#131), preflight token-usage emission + compression-complete status and live-estimate snapshots (#126).
- agent/chat_completion_helpers.py: upstream wholesale (brings the second partial-stream-stub routing site and the NousResearch#6600 cancellation fix).
- agent/tool_executor.py: usage= kwarg on tool start/complete callbacks now falls back to the bare 3-arg form for legacy receivers.
- tools/approval.py: upstream's resolved-HERMES_HOME rewrite + normalize steps restored alongside the fork's self-host kill guard (#128).
- hermes_cli/main.py: desktop install-identity stale-build cluster and the post-subcommand global-flag hoister ported from fork main.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>


@kristianvast

Salvaged from #6600 (@kristianvast) — re-scoped to the voice half only and rebased onto current main. The cascading-interrupt hang half of the original PR landed independently in 6370360, so this carries ONLY Problem 1. When a voice/audio message arrives while the agent is busy on the same session, it hit the interrupt path with empty text because STT only ran after the running-agent guard — the voice was effectively lost. Now we transcribe audio BEFORE signaling the agent (and on the fresh-message path), echo the raw transcript back to the user (🎙️), and _enrich_message_with_transcription returns (text, transcripts) so callers can echo. A new _dequeue_pending_with_transcription drives the post-agent drain the same way. Reapplied onto _prepare_inbound_message_text (inbound enrichment was extracted from the inline dispatch block since the original PR). Co-authored-by: Kristian Vastveit <kristian@agrointel.no>


kristianvast added 2 commits April 9, 2026 14:37

alt-glitch added labels on Apr 30, 2026: type/bug (Something isn't working), P1 (High: major feature broken, no workaround), comp/gateway (Gateway runner, session dispatch, delivery), comp/agent (Core agent loop, run_agent.py, prompt builder), tool/tts (Text-to-speech and transcription)

This was referenced Jun 8, 2026

fix(memory): run end-of-turn sync off the turn thread (agent stuck 'running') #41945

Merged

fix(agent): don't retry interrupt-induced transport errors (salvage #6600) #41952

Merged

teknium1 closed this in #41952 Jun 8, 2026

kshitijk4poor mentioned this pull request Jun 8, 2026

fix(gateway): transcribe voice messages during active agent runs (salvage #6600, voice half) #41984

Merged

kshitijk4poor added a commit that referenced this pull request Jun 8, 2026

Merge pull request #41984 from kshitijk4poor/salvage/6600-stale-strea…

c3055d6

…ming-worker fix(gateway): transcribe voice messages during active agent runs (salvage #6600, voice half)

liuhao1024 mentioned this pull request Jun 8, 2026

fix(config): map provider shortcut to model.provider for hermes config set #41970

Closed

13 tasks

liuhao1024 mentioned this pull request Jun 10, 2026

fix(agent): escalate consecutive stale-stream kills to trigger provider fallback #43222

Open

13 tasks

alt-glitch pushed a commit that referenced this pull request Jun 14, 2026

Merge pull request #41984 from kshitijk4poor/salvage/6600-stale-strea…

38fadb0

…ming-worker fix(gateway): transcribe voice messages during active agent runs (salvage #6600, voice half)

kristianvast wants to merge 2 commits into NousResearch:main from kristianvast:fix/voice-interrupt-and-cascading-worker
