Problem
In Discord channels, Telegram groups, and other non-DM contexts, all users share a single session. The session key is constructed as agent:main:{platform}:{chat_type}:{chat_id} (in gateway/session.py build_session_key()), meaning every message from every user in that channel feeds into one conversation history.
This causes context to balloon much faster than single-user DMs, making it significantly more likely to hit the context limit — especially in active Discord servers where multiple people may be chatting with the bot concurrently.
This was a contributing factor to #813 — the session hit 233k tokens in a Discord conversation.
Current Behavior
- All users in a Discord channel/group share one conversation context
- User A asks a complex question triggering 10+ tool calls → context grows by ~50k tokens
- User B asks an unrelated question → gets all of User A's context plus their own
- Multiple users active in the same channel can push context past the limit within a few exchanges
- The interrupt mechanism (
_running_agents / _pending_messages) means if two users message at the same time, one interrupts the other's agent run
Impact
- Context overflow: Shared sessions hit the context limit much faster than single-user sessions
- Confused responses: The agent sees all users' conversations as one thread — it may reference User A's request when responding to User B
- Interrupt conflicts: If User A's agent is running and User B sends a message, User B's message interrupts User A's agent and gets queued as a pending message
- Cost: Every user's message rehydrates the entire shared context, paying for tokens from other users' conversations
Possible Solutions
Option A: Per-user sessions in group contexts
Key sessions by platform:chat_type:chat_id:user_id instead of just platform:chat_type:chat_id for group/channel contexts. Each user gets their own conversation thread.
Pros: Clean isolation, no cross-contamination
Cons: Agent loses shared context (can't reference what another user said), uses more storage
Option B: More aggressive compression thresholds for group sessions
Keep shared sessions but lower the compression threshold (e.g., 60% instead of 85%) for group contexts, and compress more aggressively.
Pros: Simple, preserves shared context
Cons: Doesn't solve the confused-responses problem
Option C: Thread-based sessions for Discord
In Discord specifically, encourage/require thread usage — each thread gets its own session (already works since thread_id is part of the source). The main channel could auto-create threads per interaction.
Pros: Natural UX for Discord, clean isolation
Cons: Platform-specific, doesn't help Telegram/WhatsApp groups
Option D: Hybrid — shared context with user-tagged messages
Keep one session but tag each message with the user who sent it. The system prompt could instruct the agent to track who said what and respond accordingly.
Pros: Natural group conversation, agent can reference cross-user context
Cons: Still shares context window, still has the interrupt problem
Environment
gateway/session.py build_session_key() — line 293
gateway/run.py _handle_message() — interrupt handling at line 757
- Observed on Discord with multiple users in a DM group
Problem
In Discord channels, Telegram groups, and other non-DM contexts, all users share a single session. The session key is constructed as
agent:main:{platform}:{chat_type}:{chat_id}(ingateway/session.pybuild_session_key()), meaning every message from every user in that channel feeds into one conversation history.This causes context to balloon much faster than single-user DMs, making it significantly more likely to hit the context limit — especially in active Discord servers where multiple people may be chatting with the bot concurrently.
This was a contributing factor to #813 — the session hit 233k tokens in a Discord conversation.
Current Behavior
_running_agents/_pending_messages) means if two users message at the same time, one interrupts the other's agent runImpact
Possible Solutions
Option A: Per-user sessions in group contexts
Key sessions by
platform:chat_type:chat_id:user_idinstead of justplatform:chat_type:chat_idfor group/channel contexts. Each user gets their own conversation thread.Pros: Clean isolation, no cross-contamination
Cons: Agent loses shared context (can't reference what another user said), uses more storage
Option B: More aggressive compression thresholds for group sessions
Keep shared sessions but lower the compression threshold (e.g., 60% instead of 85%) for group contexts, and compress more aggressively.
Pros: Simple, preserves shared context
Cons: Doesn't solve the confused-responses problem
Option C: Thread-based sessions for Discord
In Discord specifically, encourage/require thread usage — each thread gets its own session (already works since thread_id is part of the source). The main channel could auto-create threads per interaction.
Pros: Natural UX for Discord, clean isolation
Cons: Platform-specific, doesn't help Telegram/WhatsApp groups
Option D: Hybrid — shared context with user-tagged messages
Keep one session but tag each message with the user who sent it. The system prompt could instruct the agent to track who said what and respond accordingly.
Pros: Natural group conversation, agent can reference cross-user context
Cons: Still shares context window, still has the interrupt problem
Environment
gateway/session.pybuild_session_key()— line 293gateway/run.py_handle_message()— interrupt handling at line 757