Problem or Use Case
Currently, Hermes Agent's queued messages (e.g., from Telegram/Discord gateway platforms) follow a strict "queue-and-wait" model:
- Messages enter the queue and wait until the current agent loop execution fully completes
- If the agent is executing a long-running task (e.g., code generation, multi-step debugging), new messages in the queue are blocked
- User follow-up messages, clarifications, or urgent instructions cannot interrupt or participate in the ongoing loop execution
This creates two specific pain points:
Pain Point 1 — Conversation Breakage
When a user sees the agent outputting lengthy content and wants to add "stop, try a different approach", that message must wait until the current loop finishes. By the time the agent finally reads it, the context has already diverged.
Pain Point 2 — Inefficient Multi-turn Collaboration
In complex tasks (e.g., "help me write a scraper" → agent starts writing → user wants to add "include proxy support"), the user's supplementary message cannot influence the ongoing code generation in real time. The user must wait for the first draft to complete before requesting revisions, which wastes tokens and time.
Proposed Solution
Introduce a Queue Message Interruption / Participation mechanism that allows queued messages to participate in an ongoing agent loop execution.
Option A: Soft Interrupt + Context Injection (Recommended)
At each tool-call return point (or after each LLM call) in the agent loop, check if the queue has new messages:
# Pseudocode
while not task_complete:
    # 1. Execute current step
    result = execute_next_step(state)

    # 2. Check queue for new messages (non-blocking, lightweight query)
    queued_messages = check_queue_non_blocking()

    # 3. If new messages exist, inject into next prompt
    if queued_messages:
        state.context += format_queued_messages(queued_messages)

        # Optional: decide whether to interrupt based on message content
        if should_interrupt(queued_messages):
            state = handle_interrupt(state, queued_messages)

    # 4. Continue or terminate
    if state.interrupted:
        break
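The non-blocking check in step 2 could be sketched as follows. This assumes the gateway hands messages to the agent loop through an in-process `queue.Queue`; the `drain_queue` helper and the dictionary message shape are hypothetical and not taken from the Hermes codebase.

```python
import queue


def drain_queue(q: queue.Queue) -> list:
    """Return every message currently waiting, without ever blocking."""
    drained = []
    while True:
        try:
            drained.append(q.get_nowait())
        except queue.Empty:
            # Nothing left to read: hand back whatever was collected.
            return drained


# Hypothetical usage: the gateway side puts messages, the loop drains them.
inbox = queue.Queue()
inbox.put({"user_id": "alice", "content": "include proxy support"})
pending = drain_queue(inbox)
```

Draining all waiting messages at once, rather than one per iteration, keeps the check cheap and lets the next prompt see every follow-up the user sent during the last step.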
Key Design Points:
- Check Timing: Check the queue after each tool call returns and before the next LLM call — never interrupt a running tool
- Injection Format: Format queued messages as [New message from user_id] content and inject them into the prompt
- Interrupt Strategy: User-configurable: always (any new message interrupts), keyword (specific keywords like "stop"/"halt" trigger an interrupt), never (inject only, no interrupt)
- State Preservation: Save current state on interrupt, allowing resume or discard
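A minimal sketch of how the injection format, the three interrupt strategies, and state preservation might fit together. The `QueuedMessage` and `LoopState` classes, the `strategy` parameter, and the default keyword list are assumptions made for illustration; only the function names come from the pseudocode above.

```python
import re
from dataclasses import dataclass, field

INTERRUPT_KEYWORDS = {"stop", "halt"}  # assumed defaults, meant to be user-configurable


@dataclass
class QueuedMessage:  # hypothetical message shape
    user_id: str
    content: str


@dataclass
class LoopState:  # hypothetical loop state
    context: str = ""
    interrupted: bool = False
    snapshots: list = field(default_factory=list)


def format_queued_messages(messages):
    """Render each message as '[New message from user_id] content'."""
    return "".join(f"\n[New message from {m.user_id}] {m.content}" for m in messages)


def should_interrupt(messages, strategy="keyword"):
    """Apply the configured strategy: 'always', 'keyword', or 'never'."""
    if strategy == "always":
        return bool(messages)
    if strategy == "never":
        return False
    if strategy == "keyword":
        # Match whole words so that e.g. "unstoppable" does not trigger.
        return any(
            INTERRUPT_KEYWORDS & set(re.findall(r"[a-z]+", m.content.lower()))
            for m in messages
        )
    raise ValueError(f"unknown interrupt strategy: {strategy!r}")


def handle_interrupt(state, messages):
    """Save the current context so the task can be resumed or discarded."""
    state.snapshots.append(state.context)
    state.interrupted = True
    return state
```

With the `keyword` strategy, a message such as "Stop, try a different approach" interrupts the loop, while "include proxy support" is only injected into the next prompt.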
Option B: Parallel Sub-session (Alternative)
Handle each queued message in an independent subagent session. The main session and the sub-session communicate via shared state:
- Main session continues executing the current task
- Sub-session handles user clarifications/supplements
- Sub-session can optionally merge results back into main session or be discarded
Drawback: Increases concurrency complexity, and two sessions may compete for the same resources (e.g., editing the same file).
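To illustrate the contention problem, here is one possible mitigation, sketched under the assumption that both sessions run as threads in a single process. The per-path lock registry and the in-memory `shared_files` dictionary (standing in for the file system) are hypothetical and exist only for this example.

```python
import threading
from collections import defaultdict

_registry_guard = threading.Lock()         # protects the lock registry itself
_path_locks = defaultdict(threading.Lock)  # one lock per file path


def lock_for(path: str) -> threading.Lock:
    """Return the lock that serializes edits to the given path."""
    with _registry_guard:
        return _path_locks[path]


def append_line(files: dict, path: str, line: str) -> None:
    """Append a line to a file while holding that file's lock."""
    with lock_for(path):
        files[path] = files.get(path, "") + line + "\n"


shared_files = {}
main = threading.Thread(
    target=append_line, args=(shared_files, "scraper.py", "# main session edit")
)
sub = threading.Thread(
    target=append_line, args=(shared_files, "scraper.py", "# sub-session edit")
)
main.start()
sub.start()
main.join()
sub.join()
```

Both edits survive, but their order is not deterministic, which is part of the complexity Option B would need to manage.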
Alternatives Considered
| Option | Pros | Cons |
|---|---|---|
| A. Soft interrupt + injection | Simple to implement, coherent user experience, token-efficient | Requires core loop modification |
| B. Parallel sub-session | Does not block main task | Concurrency complexity, resource contention, context splitting |
| C. Status quo | No implementation cost | Poor UX, conversation breakage |
Feature Type
Core agent loop enhancement
Scope
Medium — touches run_agent.py, gateway message dispatch, and potentially prompt builder
Contribution
Additional Context
Implementation references:
- Similar to Claude Code's interrupt handling (user presses Ctrl+C and enters a new instruction)
- Similar to ChatGPT web's "edit and resend" (allows user to modify a sent message and regenerate)
- Complements Hermes' existing delegate_task: delegate is "I assign someone else"; this is "someone interrupts me"