Feature: Message Coalescing for Gateway Platforms

## Overview

When users send rapid-fire messages on platforms like Telegram, Discord, or Slack, each message currently triggers a separate LLM turn. This causes fragmented responses, wasted API calls, and a poor overall experience: the agent starts responding to message 1 while messages 2, 3, and 4, which often carry important context, are still arriving.

[Spacedrive's Spacebot](https://github.com/spacedriveapp/spacebot) implements **message coalescing** — a debounce mechanism that accumulates rapid messages into a single batched turn before processing. This is a proven pattern from their multi-user community bot that handles Discord, Slack, Telegram, and Twitch.

This feature would add the same capability to Hermes Agent's gateway platforms.

**Research source:** [Spacebot source code](https://github.com/spacedriveapp/spacebot/blob/main/src/agent/channel.rs) (lines 160-476)

---

## Research Findings

### How Spacebot's Coalescing Works

The implementation lives in `src/agent/channel.rs` with configuration in `CoalesceConfig`:

```
Defaults:
  enabled: true
  debounce_ms: 1500      # Wait this long after each message for more
  max_wait_ms: 5000      # Maximum total wait from first message
  min_messages: 2        # Minimum messages to trigger batching
  multi_user_only: true  # Only batch in multi-user channels (not DMs)
```
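
For the Hermes side, these defaults could be mirrored in a small Python dataclass. This is only a sketch: the field names and values are taken from the defaults listed above, while the validation rules in `__post_init__` are an assumption and are not derived from Spacebot's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoalesceConfig:
    """Hypothetical Python mirror of the defaults listed above."""

    enabled: bool = True
    debounce_ms: int = 1500        # wait this long after each message for more
    max_wait_ms: int = 5000        # maximum total wait from the first message
    min_messages: int = 2          # minimum messages to trigger batching
    multi_user_only: bool = True   # only batch in multi-user channels (not DMs)

    def __post_init__(self) -> None:
        # Assumed sanity checks; not taken from the Spacebot implementation.
        if self.debounce_ms <= 0 or self.max_wait_ms <= 0:
            raise ValueError("debounce_ms and max_wait_ms must be positive")
        if self.debounce_ms > self.max_wait_ms:
            raise ValueError("debounce_ms must not exceed max_wait_ms")
        if self.min_messages < 1:
            raise ValueError("min_messages must be at least 1")
```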

**Algorithm:**

1. **should_coalesce()** checks three conditions: coalescing is enabled, the message is not a system retrigger, and the channel is not a DM (when `multi_user_only` is set).

2. If yes, message is pushed to a `coalesce_buffer` and `update_coalesce_deadline()` is called.

3. **Deadline logic:**
   - If buffer has >= min_messages: set deadline to `debounce_ms` from now, capped at `max_wait_ms` from the first message's timestamp
   - If buffer has < min_messages: set a short `debounce_ms` deadline

4. The channel event loop (`tokio::select!`) checks `coalesce_deadline` on each iteration.

5. **On deadline expiry:** `flush_coalesce_buffer()` is called:
   - **Single message:** Processed normally
   - **Multiple messages:** Batched via `handle_message_batch()` which formats all messages with attribution and timestamps, then presents them as a single user turn. A coalesce hint is injected:

   > "N messages arrived in Xs. This is a fast-moving conversation with multiple participants..."
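
Steps 1 and 3 can be sketched in Python as two pure functions. This is a paraphrase of the algorithm as described above, not a port of the Rust code: `next_deadline` and all parameter names are hypothetical, and times are in seconds rather than milliseconds.

```python
def should_coalesce(enabled: bool, is_system_retrigger: bool,
                    is_dm: bool, multi_user_only: bool) -> bool:
    """Step 1: the three gating checks."""
    if not enabled or is_system_retrigger:
        return False
    if multi_user_only and is_dm:
        return False
    return True


def next_deadline(now: float, first_ts: float, buffered: int,
                  debounce_s: float = 1.5, max_wait_s: float = 5.0,
                  min_messages: int = 2) -> float:
    """Step 3: the deadline to set after a message is pushed at time `now`.

    `buffered` is the buffer length including the message just pushed.
    """
    debounce_deadline = now + debounce_s
    if buffered >= min_messages:
        # Debounce from now, but never later than max_wait after the first message.
        return min(debounce_deadline, first_ts + max_wait_s)
    # Below min_messages: plain debounce deadline.
    return debounce_deadline
```

With the defaults, a fourth message arriving 1.2s after the first gets a deadline of 2.7s, while a message arriving at 4.5s is capped at 5.0s.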

### Why This Matters

Without coalescing:
```
User sends: "hey"           → Agent starts LLM call #1
User sends: "can you check" → Agent starts LLM call #2
User sends: "the logs"      → Agent starts LLM call #3
User sends: "from yesterday"→ Agent starts LLM call #4

Result: 4 separate LLM calls, 4 fragmented responses, $$$
```

With coalescing (1.5s debounce):
```
User sends: "hey"                     → Buffer: 1 msg, deadline in 1.5s
User sends: "can you check" (0.3s)    → Buffer: 2 msgs, deadline in 1.5s
User sends: "the logs" (0.8s)         → Buffer: 3 msgs, deadline in 1.5s
User sends: "from yesterday" (1.2s)   → Buffer: 4 msgs, deadline in 1.5s
                            (2.7s)    → Deadline fires → single batched turn

Result: 1 LLM call, coherent response, ¢
```
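
The timeline above can be checked with a small simulation. The function below is a hypothetical helper that assumes the deadline rule described earlier (debounce after each message, capped at `max_wait` from the first); it returns when the first batch would flush and how many messages it would contain.

```python
from typing import List, Tuple


def simulate_flush(arrivals: List[float], debounce_s: float = 1.5,
                   max_wait_s: float = 5.0) -> Tuple[float, int]:
    """Return (flush_time, batch_size) for the first batch.

    `arrivals` are sorted arrival times in seconds; must be non-empty.
    """
    first = arrivals[0]
    deadline = first + debounce_s
    count = 1
    for t in arrivals[1:]:
        if t >= deadline:
            break  # the buffer has already flushed; this message starts a new batch
        deadline = min(t + debounce_s, first + max_wait_s)
        count += 1
    return deadline, count


# The four messages from the example above, then a steady one-per-second stream.
print(simulate_flush([0.0, 0.3, 0.8, 1.2]))
print(simulate_flush([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]))
```

The first call flushes all four messages at about 2.7s, matching the timeline. The second shows the `max_wait` cap: a user who keeps sending one message per second is cut off at 5.0s with five messages in the batch, rather than being debounced indefinitely.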

---

## Current State in Hermes Agent

### What We Have
Each gateway platform adapter processes messages immediately:
- `gateway/platforms/telegram.py` — on_message triggers immediate `handle_message()`
- `gateway/platforms/discord_adapter.py` — same pattern
- `gateway/platforms/slack.py` — same pattern
- No debounce, no accumulation window, no batching logic

### What's Missing
- No message buffer or coalesce timer per session/channel
- No configurable debounce window
- No multi-message formatting with attribution
- No hint injection for batched messages

### Integration Points
- `gateway/run.py` — Message routing (GatewayRunner.handle_message)
- `gateway/session.py` — Session management (SessionStore)
- `gateway/platforms/*.py` — Platform adapters
- `~/.hermes/config.yaml` — Config (where coalescing settings would live)

---

## Implementation Plan

### Skill vs. Tool Classification

This should be a **core codebase feature** (in the gateway module) because:
- It modifies the message flow in `gateway/run.py` or session management
- It needs timer/async scheduling integrated with the event loop
- It must work across all platform adapters consistently
- It's a platform-level behavior, not an agent-level capability

### What We'd Need

1. **CoalesceBuffer class** — Per-session message accumulation with deadline tracking
2. **Config** — `message_coalescing` section in config.yaml (enabled, debounce_ms, max_wait_ms, etc.)
3. **Batch formatter** — Combine multiple messages into a single attributed turn
4. **Timer integration** — asyncio deadline management in the gateway event loop
5. **Platform-specific tuning** — DM vs. group behavior, platform-specific quirks
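
Items 1 and 4 could fit together roughly as follows. This is a minimal asyncio sketch, not a proposed final design: the `push` method, the `on_flush` callback, and the choice of one timer task per buffer are all assumptions, and `push` must be called from inside a running event loop.

```python
import asyncio
import time
from typing import Awaitable, Callable, List, Optional


class CoalesceBuffer:
    """Hypothetical per-session buffer that flushes through an async callback."""

    def __init__(self, on_flush: Callable[[List[str]], Awaitable[None]],
                 debounce_s: float = 1.5, max_wait_s: float = 5.0) -> None:
        self._on_flush = on_flush
        self._debounce_s = debounce_s
        self._max_wait_s = max_wait_s
        self._messages: List[str] = []
        self._first_ts: Optional[float] = None
        self._timer: Optional[asyncio.Task] = None

    def push(self, text: str) -> None:
        """Buffer a message and restart the debounce timer."""
        now = time.monotonic()
        if self._first_ts is None:
            self._first_ts = now
        self._messages.append(text)
        # Debounce from now, capped at max_wait from the first buffered message.
        deadline = min(now + self._debounce_s, self._first_ts + self._max_wait_s)
        if self._timer is not None:
            self._timer.cancel()
        self._timer = asyncio.create_task(self._flush_at(deadline))

    async def _flush_at(self, deadline: float) -> None:
        await asyncio.sleep(max(0.0, deadline - time.monotonic()))
        # Swap the buffer out before awaiting the callback, so messages that
        # arrive while the callback runs start a fresh batch and a new timer.
        batch, self._messages = self._messages, []
        self._first_ts = None
        self._timer = None
        await self._on_flush(batch)
```

The gateway would keep one such buffer per session or channel and pass a callback that forwards the batch to the existing message handler. Clearing `_timer` before the callback runs means a later `push` never cancels a flush that is already in progress.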

### Phased Rollout

**Phase 1: Basic debounce for all platforms**
- CoalesceBuffer with configurable debounce_ms (default: 1500) and max_wait_ms (5000)
- Implemented in gateway/run.py (platform-agnostic)
- Simple concatenation of message texts with timestamps
- Config: `message_coalescing.enabled: true` in config.yaml
- Deliverable: Rapid messages batched into single turn
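
The "simple concatenation" step in Phase 1 might look like the sketch below. The function name, the `(timestamp, text)` input shape, and the `[HH:MM:SS]` UTC prefix are illustrative assumptions; a single message is passed through unchanged so the non-batched path behaves as it does today.

```python
from datetime import datetime, timezone
from typing import List, Tuple


def concat_messages(messages: List[Tuple[float, str]]) -> str:
    """Join (unix_timestamp, text) pairs into one user turn, one line each."""
    if len(messages) == 1:
        return messages[0][1]
    lines = []
    for ts, text in messages:
        stamp = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%H:%M:%S")
        lines.append(f"[{stamp}] {text}")
    return "\n".join(lines)
```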

**Phase 2: Attributed batching and hint injection**
- Format batched messages with user attribution (important for multi-user channels)
- Inject coalesce hint into the batched message ("N messages arrived...")
- DM vs. group channel detection for multi_user_only mode
- Platform-specific message metadata (Discord thread IDs, Telegram reply chains)
- Deliverable: Clean multi-user message batching
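
A possible shape for the Phase 2 formatter is sketched below. `BufferedMessage`, `format_batch`, the relative-offset prefix, and the hint wording are all hypothetical; in particular the hint line here is a placeholder and not Spacebot's template text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BufferedMessage:
    author: str
    text: str
    ts: float  # arrival time in seconds


def format_batch(messages: List[BufferedMessage]) -> str:
    """Render a batch as one attributed user turn, led by a coalesce hint."""
    if len(messages) == 1:
        return messages[0].text
    start = messages[0].ts
    elapsed = messages[-1].ts - start
    participants = len({m.author for m in messages})
    # Placeholder hint wording (an assumption, not the Spacebot template).
    hint = (f"{len(messages)} messages arrived in {elapsed:.1f}s "
            f"from {participants} participant(s).")
    lines = [f"[+{m.ts - start:.1f}s] {m.author}: {m.text}" for m in messages]
    return "\n".join([hint, *lines])
```

Relative offsets keep the turn compact while still showing message order and pacing; absolute timestamps would work equally well if the agent needs wall-clock context.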

**Phase 3: Smart coalescing**
- Detect "complete thought" patterns to flush early (question marks, commands)
- Adjust debounce based on user typing speed patterns
- File attachment handling (images/documents should trigger flush)
- Integrate with session context (if agent is already processing, queue next batch)
- Deliverable: Intelligent coalescing that feels natural
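
As a starting point for the early-flush idea, a heuristic could be as simple as the sketch below. The function name and the specific rules (attachments, a leading `/` for commands, a trailing question mark) are assumptions meant to illustrate the bullets above, and would need tuning against real traffic.

```python
def should_flush_early(text: str, has_attachment: bool = False) -> bool:
    """Guess whether a message completes a thought and should flush the buffer."""
    stripped = text.strip()
    if has_attachment:
        return True   # images/documents trigger an immediate flush
    if stripped.startswith("/"):
        return True   # assumed command prefix
    return stripped.endswith("?")  # a question usually ends the thought
```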

---

## Pros & Cons

### Pros
- **Cost reduction** — Fewer LLM calls for rapid multi-message input (common pattern on chat platforms)
- **Better responses** — Agent sees the full context of what the user intended, not fragments
- **Proven pattern** — Battle-tested in Spacebot's community bot across multiple platforms
- **Low complexity** — The core is a simple debounce timer + buffer, ~100-200 lines
- **User-configurable** — Can be disabled for users who prefer immediate response

### Cons / Risks
- **Perceived latency** — Users may feel the 1.5s wait is "slow" in DMs (mitigated by multi_user_only mode)
- **Typing indicators** — Need to show "thinking" during the coalesce window so users know the bot is "listening"
- **Platform differences** — Telegram, Discord, Slack have different message timing patterns
- **Edge cases** — File attachments, inline images, voice messages need special handling

---

## Open Questions

- Should coalescing be enabled by default, or opt-in? (Spacebot defaults to enabled, multi_user_only)
- What's the right default debounce window? 1.5s feels right for chat, but may need tuning per platform.
- Should DMs be exempt by default? (Spacebot's multi_user_only=true exempts DMs)
- How should the coalesced message be formatted? Simple concatenation, or attributed with timestamps?
- Should the agent show a "typing" indicator during the coalesce window?

---

## References

- [Spacebot source: channel.rs](https://github.com/spacedriveapp/spacebot/blob/main/src/agent/channel.rs) (lines 160-476)
- [Spacebot source: CoalesceConfig](https://github.com/spacedriveapp/spacebot/blob/main/src/config/types.rs) (lines 644-673)
- [Spacebot coalesce hint template](https://github.com/spacedriveapp/spacebot/blob/main/prompts/en/fragments/coalesce_hint.md.j2)
- [Spacebot README](https://github.com/spacedriveapp/spacebot#readme)
