
fix: improve interrupt responsiveness during concurrent tool execution #10935

Merged: teknium1 merged 1 commit into main from hermes/hermes-8a120751 on Apr 16, 2026

Conversation

@teknium1

Contributor

Summary

Fixes the 'agent stuck on long terminal command' issue reported by @_SushantSays on Twitter. When a user sends a message while the agent is running a long-running terminal command, the interrupt is acknowledged but the follow-up can feel unresponsive.

Changes

1. Concurrent tool wait loop now checks interrupts (run_agent.py)

The sequential tool execution path already checked _interrupt_requested before each tool call (line 7538), but the concurrent path's wait loop was a blind 30-second poll. Now:

  • Polls every 5s instead of 30s
  • Checks _interrupt_requested each poll
  • Cancels pending futures on interrupt
  • Gives already-running tools 3s to notice the per-thread interrupt signal
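The wait loop described above can be sketched roughly as follows. This is a minimal illustration, not the code in run_agent.py: the function name `wait_for_tools`, its return shape, and the use of a `threading.Event` in place of `_interrupt_requested` are all assumptions made for the example. Only the 5s poll interval, the cancellation of pending futures, and the 3s grace period come from the description.

```python
import threading
from concurrent.futures import wait

POLL_SECONDS = 5   # poll interval named in the PR description
GRACE_SECONDS = 3  # grace period for tools that are already running


def wait_for_tools(futures, interrupt_requested: threading.Event,
                   poll=POLL_SECONDS, grace=GRACE_SECONDS):
    """Hypothetical helper: wait for tool futures, checking for an interrupt.

    Returns (done, cancelled, still_running) as three sets of futures.
    """
    pending = set(futures)
    done = set()
    while pending:
        # Block for at most one poll interval, then re-check the flag.
        finished, pending = wait(pending, timeout=poll)
        done |= finished
        if pending and interrupt_requested.is_set():
            # cancel() only succeeds for futures that have not started yet.
            cancelled = {f for f in pending if f.cancel()}
            running = pending - cancelled
            # Running tools cannot be cancelled from outside; give them a
            # short window to notice the interrupt and return on their own.
            finished, running = wait(running, timeout=grace)
            done |= finished
            return done, cancelled, running
    return done, set(), set()
```

Cancelled futures are collected from the return value of `cancel()` rather than from `wait()`, because a future that was cancelled while still queued is not reported as done by `wait()` until a worker thread picks it up.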

2. Cancelled concurrent tools get proper interrupt messages (run_agent.py)

When a concurrent tool is cancelled or returns no result because of an interrupt, its tool result message now says 'skipped due to user interrupt' instead of a generic error.
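As a rough sketch of that distinction, the helper below picks the message content for one concurrent tool call. The function name, the message dictionary shape, and the wording around the quoted phrase are assumptions for illustration; only the phrase 'skipped due to user interrupt' is taken from the description.

```python
from concurrent.futures import Future

# Only the quoted phrase comes from the PR; the surrounding wording is assumed.
INTERRUPT_CONTENT = "Tool call skipped due to user interrupt"


def tool_result_message(call_id: str, future: Future, interrupted: bool) -> dict:
    """Hypothetical helper: build the tool result message for one tool call."""
    if future.cancelled() or (interrupted and not future.done()):
        # Cancelled before it ran, or still running when the interrupt hit.
        content = INTERRUPT_CONTENT
    else:
        try:
            content = str(future.result(timeout=0))
        except Exception as exc:
            # Genuine tool failures keep a generic error message.
            content = f"Error executing tool: {exc}"
    # The message shape is an assumption, not the project's actual schema.
    return {"role": "tool", "tool_call_id": call_id, "content": content}
```

Checking `cancelled()` before calling `result()` keeps the interrupt case out of the exception path, so a user interrupt is never reported as a tool failure.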

3. Typing indicator fires before follow-up turn (gateway/run.py)

After an interrupt is acknowledged and the pending message is dequeued, the gateway sends a typing indicator before starting the recursive _run_agent call. This closes the perceived 'dead air' gap between the interrupt ack and the response: the user sees the typing bubble immediately instead of silence while the system bootstraps the new turn.
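The ordering can be illustrated with a small async sketch. Everything here is hypothetical: `start_follow_up`, `send_typing`, and `run_agent` are stand-ins, and the real gateway code in gateway/run.py may be structured quite differently. The only point shown is that the indicator is sent before the follow-up turn starts.

```python
import asyncio


async def start_follow_up(pending_message, send_typing, run_agent):
    """Hypothetical helper: show a typing indicator, then run the next turn.

    `send_typing` and `run_agent` are caller-supplied coroutines standing in
    for the platform adapter and the recursive _run_agent call.
    """
    try:
        # Give the user visual feedback before the new turn bootstraps.
        await send_typing()
    except Exception:
        # The indicator is cosmetic; a failure should not block the reply.
        pass
    return await run_agent(pending_message)
```

Swallowing errors from the indicator is a design choice made for this sketch, on the assumption that a failed typing bubble should never prevent the follow-up response.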

Test plan

  • New test: tests/run_agent/test_concurrent_interrupt.py, which verifies both the mid-execution interrupt path and the pre-flight skip path
  • Existing interrupt tests pass: test_busy_session_ack, test_interrupt_key_match, test_telegram_photo_interrupts
  • 798 passed, 2 pre-existing failures (missing OPENROUTER_API_KEY in local env)

Context

The core interrupt mechanism works correctly (per-thread interrupt flags, terminal process group kill, sequential tool skip). The remaining user-perceived 'stuck' state comes from:

  1. The concurrent tool path not checking interrupts during its wait loop
  2. No visual feedback between the interrupt ack and the follow-up turn starting
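For readers unfamiliar with the per-thread interrupt flags mentioned above, the following sketch shows one way such a flag could let an already-running tool stop early. It is an illustration under assumed names (`set_interrupt`, `is_interrupted`, `long_running_tool`), not the project's actual mechanism.

```python
import threading
import time

# Hypothetical registry of thread ids that have been asked to stop.
_interrupted_threads = set()
_lock = threading.Lock()


def set_interrupt(thread_id, active=True):
    """Mark (or clear) the interrupt flag for one worker thread."""
    with _lock:
        if active:
            _interrupted_threads.add(thread_id)
        else:
            _interrupted_threads.discard(thread_id)


def is_interrupted():
    """Return True if the calling thread has been asked to stop."""
    with _lock:
        return threading.get_ident() in _interrupted_threads


def long_running_tool(steps, step_seconds=0.05):
    """A stand-in tool that checks its own thread's flag between steps."""
    for i in range(steps):
        if is_interrupted():
            return f"interrupted after {i} steps"
        time.sleep(step_seconds)
    return "completed"
```

A tool written this way returns within one step of the flag being set, which is why a short grace period is enough for cooperative tools while uncooperative ones simply run on in the background.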

Reported by @_SushantSays via Twitter thread.

…n and follow-up turns

Three targeted fixes for the 'agent stuck on terminal command' report:

1. **Concurrent tool wait loop now checks interrupts** (run_agent.py)
   The sequential path checked _interrupt_requested before each tool call,
   but the concurrent path's wait loop just blocked with 30s timeouts.
   Now polls every 5s and cancels pending futures on interrupt, giving
   already-running tools 3s to notice the per-thread interrupt signal.

2. **Cancelled concurrent tools get proper interrupt messages** (run_agent.py)
   When a concurrent tool is cancelled or didn't return a result due to
   interrupt, the tool result message says 'skipped due to user interrupt'
   instead of a generic error.

3. **Typing indicator fires before follow-up turn** (gateway/run.py)
   After an interrupt is acknowledged and the pending message dequeued,
   the gateway now sends a typing indicator before starting the recursive
   _run_agent call. This gives the user immediate visual feedback that
   the system is processing their new message (closing the perceived
   'dead air' gap between the interrupt ack and the response).

Reported by @_SushantSays.
@teknium1 teknium1 merged commit 333cb82 into main Apr 16, 2026
6 of 7 checks passed
@teknium1 teknium1 deleted the hermes/hermes-8a120751 branch April 16, 2026 09:44
shuv1337 added a commit to shuv1337/hermes-agent that referenced this pull request Apr 16, 2026
Upstream's QQ platform hint (added in the salvaged PR NousResearch#10935/NousResearch#10940 block
that landed during this merge) re-introduced the 'MEDIA:/absolute/path/to/file'
literal that our prior commit 03f8a5e fixed for WeCom. The
ANTHROPIC_OAUTH_BLOCKED_LITERALS guard test flagged it.

Same fix pattern: 'absolute' -> 'full' to break tokenization.
ulasbilgen pushed a commit to ulasbilgen/hermes-adhd-agent that referenced this pull request May 1, 2026
aj-nt pushed a commit to aj-nt/hermes-agent that referenced this pull request May 1, 2026
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026