[Feature]: Allow queued messages to participate in ongoing agent loop execution #17298

## Problem or Use Case

Currently, Hermes Agent's queued messages (e.g., from Telegram/Discord gateway platforms) follow a strict "queue-and-wait" model:
- Messages enter the queue and wait until the current agent loop execution fully completes
- If the agent is executing a long-running task (e.g., code generation, multi-step debugging), new messages in the queue are blocked
- User follow-up messages, clarifications, or urgent instructions cannot interrupt or participate in the ongoing loop execution

This creates two specific pain points:

**Pain Point 1 — Conversation Breakage**
When a user sees the agent producing lengthy output and sends "stop, try a different approach", that message must wait until the current loop finishes. By the time the agent finally reads it, the conversation context has already diverged from what the user wanted.

**Pain Point 2 — Inefficient Multi-turn Collaboration**
In complex tasks (e.g., "help me write a scraper" → agent starts writing → user wants to add "include proxy support"), the user's supplementary message cannot influence the ongoing code generation in real time. The user must wait for the first draft to complete before requesting revisions, which wastes tokens and time.

## Proposed Solution

Introduce a **Queue Message Interruption / Participation** mechanism that allows queued messages to participate in an ongoing agent loop execution.

### Option A: Soft Interrupt + Context Injection (Recommended)

At each tool-call return point (or after each LLM call) in the agent loop, check if the queue has new messages:

```python
# Pseudocode: helper names are illustrative
while not state.task_complete:
    # 1. Execute the current step (one LLM call or one tool call)
    result = execute_next_step(state)

    # 2. Check the queue for new messages (non-blocking, lightweight query)
    queued_messages = check_queue_non_blocking()

    # 3. If new messages exist, inject them into the next prompt
    if queued_messages:
        state.context += format_queued_messages(queued_messages)
        # Optional: decide whether to interrupt based on message content
        if should_interrupt(queued_messages):
            state = handle_interrupt(state, queued_messages)

    # 4. Stop early if an interrupt was triggered; otherwise keep looping
    if state.interrupted:
        break
```
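
The non-blocking check in step 2 could be sketched with the standard library's `queue.Queue`, assuming the gateway pushes incoming messages onto an in-process queue. The name `drain_queue` and the `(user_id, content)` tuple shape are hypothetical and not taken from the Hermes codebase.

```python
import queue


def drain_queue(q: queue.Queue) -> list:
    """Return every message currently in the queue without blocking."""
    messages = []
    while True:
        try:
            # get_nowait() raises queue.Empty instead of waiting
            messages.append(q.get_nowait())
        except queue.Empty:
            return messages


# Hypothetical gateway inbox holding (user_id, content) tuples
inbox = queue.Queue()
inbox.put(("alice", "include proxy support"))
inbox.put(("alice", "stop, try a different approach"))

pending = drain_queue(inbox)   # both messages, in arrival order
leftover = drain_queue(inbox)  # nothing left, returns immediately
```

Draining the whole queue at each check point means several follow-ups sent in quick succession reach the model together rather than one per loop iteration.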

**Key Design Points:**
- **Check Timing**: Check the queue after each tool call returns and before the next LLM call — never interrupt a running tool
- **Injection Format**: Format queued messages as `[New message from user_id] content` and inject into the prompt
- **Interrupt Strategy**: User-configurable — `always` (any new message interrupts), `keyword` (specific keywords like "stop"/"halt" trigger interrupt), `never` (inject only, no interrupt)
- **State Preservation**: Save current state on interrupt, allowing resume or discard
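
A minimal sketch of how the design points above might fit together, under stated assumptions: `LoopState`, its fields, the `(user_id, content)` message shape, and the whole-word keyword match are all hypothetical choices for illustration, not existing Hermes APIs.

```python
import re
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple

Message = Tuple[str, str]  # (user_id, content); assumed shape

STOP_KEYWORDS = {"stop", "halt"}  # example keywords from the proposal


def format_queued_messages(messages: List[Message]) -> str:
    """Render messages as `[New message from user_id] content` lines."""
    return "".join(f"\n[New message from {uid}] {text}" for uid, text in messages)


def should_interrupt(messages: List[Message], strategy: str = "keyword") -> bool:
    """Apply the `always` / `keyword` / `never` interrupt strategy."""
    if strategy == "always":
        return bool(messages)
    if strategy == "never":
        return False
    if strategy == "keyword":
        # Whole-word match so that "unstoppable" does not trigger an interrupt
        return any(
            STOP_KEYWORDS & set(re.findall(r"\w+", text.lower()))
            for _, text in messages
        )
    raise ValueError(f"unknown interrupt strategy: {strategy!r}")


@dataclass(frozen=True)
class LoopState:
    context: str = ""
    interrupted: bool = False
    snapshot: Optional[str] = None  # context saved at the moment of interrupt


def handle_interrupt(state: LoopState, messages: List[Message]) -> LoopState:
    """Flag the interrupt and keep a snapshot so the task can be resumed or discarded."""
    return replace(state, interrupted=True, snapshot=state.context)


msgs = [("alice", "stop, try a different approach")]
state = LoopState(context="task: write a scraper")
state = replace(state, context=state.context + format_queued_messages(msgs))
if should_interrupt(msgs, "keyword"):
    state = handle_interrupt(state, msgs)
```

With `never`, the same messages would still be appended to the context, but the loop would keep running and the model would see them on its next call.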

### Option B: Parallel Sub-session (Alternative)

Handle each queued message in an independent subagent session (sub-session). The main session and the sub-session communicate via shared state:
- Main session continues executing the current task
- Sub-session handles user clarifications/supplements
- Sub-session can optionally merge results back into main session or be discarded

**Drawback**: Increases concurrency complexity, and two sessions may compete for the same resources (e.g., editing the same file).
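
To make the contention concern concrete, here is a rough `asyncio` sketch in which a main session and a sub-session write to the same shared state behind a lock. The function names, the `shared` dictionary layout, and the use of a single `asyncio.Lock` are assumptions for illustration only.

```python
import asyncio


async def main_session(shared: dict, lock: asyncio.Lock) -> None:
    """Keep executing the current task, one step at a time."""
    for step in range(3):
        async with lock:
            # The await inside the critical section stands in for real I/O,
            # which is where an unguarded shared resource could be corrupted.
            await asyncio.sleep(0)
            shared["log"].append(f"main step {step}")


async def sub_session(shared: dict, lock: asyncio.Lock, message: str) -> None:
    """Handle one queued user message alongside the main task."""
    async with lock:
        await asyncio.sleep(0)
        shared["notes"].append(message)  # result the main session may merge later
        shared["log"].append("sub-session handled message")


async def run() -> dict:
    shared = {"log": [], "notes": []}
    lock = asyncio.Lock()
    await asyncio.gather(
        main_session(shared, lock),
        sub_session(shared, lock, "include proxy support"),
    )
    return shared


shared = asyncio.run(run())
```

Even in this small form, every shared resource needs its own guard and a merge policy, which is the extra complexity Option A avoids.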

## Alternatives Considered

| Option | Pros | Cons |
|--------|------|------|
| A. Soft interrupt + injection | Simple to implement, coherent user experience, token-efficient | Requires core loop modification |
| B. Parallel sub-session | Does not block main task | Concurrency complexity, resource contention, context splitting |
| C. Status quo | No implementation cost | Poor UX, conversation breakage |

## Feature Type

Core agent loop enhancement

## Scope

Medium — touches `run_agent.py`, gateway message dispatch, and potentially prompt builder

## Contribution

- [ ] I'd like to implement this myself and submit a PR

## Additional Context

Implementation references:
- Similar to Claude Code's `interrupt` handling (user presses Ctrl+C and enters a new instruction)
- Similar to ChatGPT web's "edit and resend" (allows user to modify a sent message and regenerate)
- Complements Hermes' existing `delegate_task`: delegation is "I assign work to someone else", whereas this feature is "someone interrupts me"

