feat(gateway): busy-text-mode queue + pre-output message assimilation (B1.2/B1.3) #29943

Closed
pnascimento9596 wants to merge 7 commits into NousResearch:main from pnascimento9596:feature/busy-text-mode-queue

@pnascimento9596 pnascimento9596 commented May 21, 2026

Problem

When users send 2–3 related messages in quick succession before seeing a reply, Hermes treats them as separate turns — answering only the first message, then handling the rest as follow-up turns.

Solution

Two complementary mechanisms covering different timing windows:

B1.2 — Text debounce + silent queuing (busy_text_mode=queue)

New busy_text_mode adapter setting (default: queue) that silently queues text follow-ups instead of interrupting the active turn. Includes configurable debounce (0.35s busy / 0.45s idle with hard caps) so rapid messages arriving within the debounce window are coalesced into a single event before a turn is created.
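
As a rough illustration of the queue-versus-interrupt routing described above, here is a minimal sketch. `SessionState`, `handle_text`, and the returned labels are hypothetical names chosen for this example and are not taken from the PR's code. Whether legacy interrupt mode also stores the message is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SessionState:
    """Hypothetical per-session state; field names are illustrative only."""
    busy: bool = False
    pending: List[str] = field(default_factory=list)
    interrupt_requested: bool = False


def handle_text(state: SessionState, text: str, busy_text_mode: str = "queue") -> str:
    """Route an incoming text message depending on whether a turn is active."""
    if not state.busy:
        # No active turn: this message starts one.
        state.busy = True
        return "start_turn"
    # A turn is active: keep the follow-up for later draining.
    state.pending.append(text)
    if busy_text_mode == "queue":
        # Queue silently: no interrupt flag, no acknowledgement.
        return "queued"
    # Assumed legacy behaviour: also ask the running turn to stop.
    state.interrupt_requested = True
    return "interrupted"
```

With the default mode, a second message during an active turn lands in `pending` and leaves `interrupt_requested` untouched.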

B1.3 — Pre-output assimilation

Messages arriving after a turn starts internally but before any visible output or side effects are folded into the current turn via cancel + restart. Covers the gap between debounce expiry and first model output.

Key design points:

  • Precommit state machine (running_precommit → committed)
  • Bounded restarts (max 2) + 3.0s assimilation deadline
  • Accepted assimilation state drives the restart, not result.interrupted — closes a race where fast model responses complete before the interrupt lands
  • Session-key construction aligned between init (gateway) and assimilate (adapter) paths
  • Feature flag HERMES_PRE_OUTPUT_ASSIMILATION_ENABLED (runtime kill switch, default off)
  • Unconditional entry logging + gate-level REJECT/ACCEPT logging
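
The gates listed above can be sketched as a small state machine. This is an illustration under assumptions, not the PR's implementation: `PrecommitTurn`, `TurnPhase`, and the method names are hypothetical, and measuring the deadline from turn start is a guess. Only the two phases, the restart cap of 2, and the 3.0s deadline come from the description.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class TurnPhase(Enum):
    RUNNING_PRECOMMIT = "running_precommit"
    COMMITTED = "committed"


MAX_RESTARTS = 2             # from the PR description
ASSIMILATION_DEADLINE = 3.0  # seconds, from the PR description


@dataclass
class PrecommitTurn:
    """Hypothetical model of one turn; not the PR's actual class."""
    started_at: float
    texts: List[str]
    phase: TurnPhase = TurnPhase.RUNNING_PRECOMMIT
    restarts: int = 0

    def commit(self) -> None:
        """Call on first visible output or side effect; closes the window."""
        self.phase = TurnPhase.COMMITTED

    def try_assimilate(self, text: str, now: float) -> bool:
        """Fold `text` into this turn only if every gate passes."""
        if self.phase is not TurnPhase.RUNNING_PRECOMMIT:
            return False  # REJECT: output already visible
        if self.restarts >= MAX_RESTARTS:
            return False  # REJECT: restart budget exhausted
        if now - self.started_at > ASSIMILATION_DEADLINE:
            return False  # REJECT: past the deadline (assumed relative to start)
        self.texts.append(text)
        self.restarts += 1  # ACCEPT: caller cancels and restarts the turn
        return True
```

Passing `now` explicitly keeps the sketch deterministic; a real caller would presumably use a monotonic clock.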

Also included

  • B1.1 config bridge: _busy_text_mode propagation from GatewayRunner to adapters with RuntimeError guard
  • Cron clean delivery: raw output delivery when cron-clean-delivery skill is present (removes header/footer wrapping)
  • Config alias: shutdown_notification as backward-compatible alias for gateway_restart_notification

Message timing zones

| Gap between messages | Handler |
| --- | --- |
| < 0.45s | B1.2 debounce coalesces pre-turn |
| 0.45s – 3.0s (before output) | B1.3 assimilates into running turn |
| > 3.0s or after tool call / output | Separate turn |
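
Read as a decision rule, the timing zones above amount to something like the following sketch. `classify_followup` and its return labels are hypothetical, and the sketch ignores the separate 0.35s busy debounce and the B1.3 feature flag for simplicity.

```python
def classify_followup(gap_seconds: float, output_started: bool) -> str:
    """Pick the handler for a follow-up message (simplified illustration)."""
    if gap_seconds < 0.45:
        return "debounce"        # B1.2: coalesced before a turn is created
    if gap_seconds <= 3.0 and not output_started:
        return "assimilate"      # B1.3: folded into the running turn
    return "separate_turn"       # too late, or output/tool call already happened
```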

Testing

  • 970 lines of dedicated tests (test_active_session_text_merge.py, 38 tests)
  • Additional coverage in test_busy_session_ack.py, test_config.py, test_config_env_bridge_authority.py, test_interrupt_key_match.py, test_restart_drain.py
  • 5,707 gateway + 373 cron tests passing (1 pre-existing unrelated failure in test_shutdown_forensics)
  • Live-tested on Discord DMs: 2-message and 3-message rapid bursts confirmed aggregating into single consolidated responses

Deploy notes

  • B1.2 queue-mode debounce is active by default (busy_text_mode=queue)
  • B1.3 is off by default — enable via HERMES_PRE_OUTPUT_ASSIMILATION_ENABLED=true in ~/.hermes/.env
  • Monitor B1.3 via grep 'B1.3' gateway.log for init / assimilate ACCEPT|REJECT / restart events
  • Runtime kill switch: set env var to false and restart to disable without rollback
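
A minimal sketch of how an off-by-default environment flag like this might be read. The helper name and the set of accepted truthy strings are assumptions; the description only states that `true` enables the feature and `false` disables it.

```python
import os
from typing import Mapping, Optional

FLAG = "HERMES_PRE_OUTPUT_ASSIMILATION_ENABLED"


def assimilation_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when the flag is explicitly set to a truthy value."""
    env = os.environ if env is None else env
    raw = env.get(FLAG, "")  # unset means off by default
    # Accepted spellings are an assumption made for this sketch.
    return raw.strip().lower() in {"1", "true", "yes", "on"}
```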

@alt-glitch alt-glitch added type/feature New feature or request comp/gateway Gateway runner, session dispatch, delivery P3 Low — cosmetic, nice to have labels May 21, 2026
@pnascimento9596 pnascimento9596 changed the title gateway: add busy_text_mode (default=queue) — silent text follow-up queuing without interrupting current turn gateway: add busy_text_mode (default=queue) + text debounce window for rapid follow-ups May 21, 2026
@pnascimento9596

Update: Added Option B1 — a short debounce window (default 0.6s) for rapid consecutive text messages while busy.

  • Each new text message resets the timer.
  • After the window elapses with no new messages, the accumulated text is flushed into _pending_messages as a single merged event.
  • This prevents fragmented multi-reply behavior while preserving the original "no interrupt" guarantee from busy_text_mode=queue.
  • 5 new regression tests + all existing tests pass.
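
The reset-on-each-message behaviour described in this update can be sketched with asyncio as follows. `TextDebouncer` and `on_flush` are hypothetical names, and joining messages with a newline is an assumption about how the merged event is built.

```python
import asyncio
from typing import Callable, List, Optional


class TextDebouncer:
    """Hypothetical helper: merges messages that arrive within `window` seconds."""

    def __init__(self, window: float, on_flush: Callable[[str], None]):
        self.window = window
        self.on_flush = on_flush  # called once with the merged text
        self._buffer: List[str] = []
        self._task: Optional[asyncio.Task] = None

    def push(self, text: str) -> None:
        """Buffer a message and restart the quiet-period timer."""
        self._buffer.append(text)
        if self._task is not None:
            self._task.cancel()  # a new message resets the timer
        self._task = asyncio.get_running_loop().create_task(self._flush_later())

    async def _flush_later(self) -> None:
        await asyncio.sleep(self.window)
        merged = "\n".join(self._buffer)  # assumed merge format
        self._buffer.clear()
        self._task = None
        self.on_flush(merged)


async def demo() -> List[str]:
    flushed: List[str] = []
    deb = TextDebouncer(window=0.05, on_flush=flushed.append)
    deb.push("first")
    await asyncio.sleep(0.01)
    deb.push("second")        # arrives inside the window, resets the timer
    await asyncio.sleep(0.2)  # let the window elapse
    return flushed
```

Running `asyncio.run(demo())` yields a single merged event rather than two separate ones.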

@pnascimento9596 pnascimento9596 force-pushed the feature/busy-text-mode-queue branch from 9c6bed7 to 441fe09 Compare May 23, 2026 01:10
@pnascimento9596 pnascimento9596 changed the title gateway: add busy_text_mode (default=queue) + text debounce window for rapid follow-ups feat(gateway): busy-text-mode queue + pre-output message assimilation (B1.2/B1.3) May 23, 2026
@pnascimento9596 pnascimento9596 force-pushed the feature/busy-text-mode-queue branch 3 times, most recently from 0355108 to 53704db Compare May 23, 2026 18:47
…ueuing without interrupting current turn

- Text messages during active sessions now merge into _pending_messages without setting the interrupt event or calling running_agent.interrupt()
- Eliminates unwanted 'Interrupting current task...' acks for normal consecutive messages
- Existing drain logic provides clean cascade after current turn completes
- Configurable via display.busy_text_mode or HERMES_GATEWAY_BUSY_TEXT_MODE (queue|interrupt)
- Adds regression coverage for the new path

Fixes user friction with rapid multi-message workflows while preserving /stop and legacy interrupt behavior.

…r rapid follow-ups

- Text messages during active sessions no longer trigger "Interrupting current task..." acks
  (busy_text_mode=queue by default).
- Added short debounce window (default 0.6s, configurable via HERMES_GATEWAY_BUSY_TEXT_DEBOUNCE_SECONDS)
  so rapid consecutive messages are merged into one coherent event instead of multiple replies.
- Generalized Telegram's proven _queue_media_group_event pattern into BasePlatformAdapter.
- Respects existing busy_text_mode and steer mode.
- Added proper cleanup, task management, and regression tests.

Fixes rapid multi-message workflow friction while preserving /stop and legacy interrupt behavior.

Co-authored-by: Grok + Paulo

Add pre-output assimilation: messages arriving after a turn starts
internally but before any visible output are folded into the current
turn instead of spawning a new one.

Components:
- Precommit state machine (running_precommit → committed)
- init_precommit_state / _try_assimilate / commit_precommit_turn
- Revision-based stale output suppression
- Bounded restarts (max 2) + 1.5s assimilation deadline
- Feature flag: HERMES_PRE_OUTPUT_ASSIMILATION_ENABLED (runtime kill switch)
- Hidden coalesced-turn signaling (gated separately)
- Unconditional entry logging for observability
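
One common way to implement revision-based stale output suppression is a monotonically increasing counter per session. The sketch below is a guess at the general idea, not the commit's actual code, and `RevisionGate` is a hypothetical name.

```python
class RevisionGate:
    """Hypothetical gate: only output from the latest turn revision is delivered."""

    def __init__(self) -> None:
        self.current = 0

    def bump(self) -> int:
        """Call on each restart; invalidates output from earlier revisions."""
        self.current += 1
        return self.current

    def should_deliver(self, revision: int) -> bool:
        """Output tagged with an older revision is treated as stale."""
        return revision == self.current
```

A turn would capture the revision when it starts, and its output would be dropped if an assimilation restart bumped the counter in the meantime.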

Fixes session-key construction divergence between init (gateway config)
and assimilate (adapter config) paths — both now use adapter-local
PlatformConfig.extra for group_sessions_per_user / thread_sessions_per_user.

- Entry logs elevated to INFO for live observability
- Gate-level REJECT/ACCEPT logging in _try_assimilate
- Assimilation deadline extended 1.5s → 3.0s
- Launchd plist env var injection for HERMES_PRE_OUTPUT_ASSIMILATION_ENABLED

…atus

Move _get_assimilation_pending check outside the result.interrupted gate.
Accepted assimilation state is now the authoritative restart signal;
agent.interrupt() is a latency optimization, not the correctness gate.
Fixes silent text drop when agent completes before interrupt lands.
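
The race this commit describes can be illustrated with a small sketch. `TurnResult` and both functions are hypothetical stand-ins; only the idea that pending assimilation state, not `result.interrupted`, decides the restart comes from the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TurnResult:
    """Hypothetical result of one agent run."""
    text: str
    interrupted: bool


def should_restart_old(result: TurnResult, pending: Optional[str]) -> bool:
    """Earlier logic: pending text is only checked when the interrupt landed."""
    return result.interrupted and pending is not None


def should_restart(result: TurnResult, pending: Optional[str]) -> bool:
    """Revised logic: accepted assimilation state alone drives the restart."""
    return pending is not None


# The agent finished before the interrupt arrived, but text was already accepted.
fast_finish = TurnResult(text="done", interrupted=False)
```

With `fast_finish` and pending text, the earlier check returns False, so the accepted message would be dropped; the revised check returns True.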
_deliver_result no longer reads cron.wrap_response; the 6 load_config
patches were inert. test_delivery_skips_wrapping_when_config_disabled
was a duplicate of test_delivery_sends_raw_content_without_header_or_footer.

Wider window catches real human typing gaps (4-5s between related
messages) without requiring sub-3s bursts.
@pnascimento9596 pnascimento9596 force-pushed the feature/busy-text-mode-queue branch from 53704db to c8c9a18 Compare May 23, 2026 19:02
@teknium1

Thanks for the substantial work here, @pnascimento9596 — the design write-up is detailed and the test coverage is real. After review I'm going to pass on this one, though.

A few reasons:

  1. Scope. The PR bundles three independent changes (B1.2 queue/debounce, B1.3 pre-output assimilation, removal of the cron.wrap_response default + a shutdown_notification alias). The cron change in particular silently strips the Cronjob Response: <name> header and "agent cannot see this message" footer from every existing user's cron deliveries, which is a documented architectural invariant on our end (cron outputs aren't mirrored into the session, so the wrapper exists to tell the user the agent can't see it).

  2. B1.3 correctness. Your own code comment notes that tools can dispatch during the 250 ms assimilation window and the result is then "discarded" — but the side effects (terminal commands, file writes, network calls) aren't undone. Shipping that behind an off-by-default flag (which the launchd plist also explicitly disables) suggests it isn't ready yet, and a real fix would need a tool-dispatch gate in the agent loop, not a timing window.

  3. Footprint. +907 lines and 11 new instance dicts into gateway/platforms/base.py is a lot of new state in a critical file, and the gateway adapter test patterns (many use object.__new__ to skip __init__) would need a sweep before we'd be comfortable with that surface area.

The underlying problem — rapid multi-message follow-ups creating separate turns — is real, but I'd want to approach it with a much smaller change set, and not in combination with the cron delivery semantics. I'll keep the issue space open in my head; if you want to do a focused B1.2-only PR (queue + debounce, no B1.3, no cron changes) I'd take another look. Appreciate the effort.

@teknium1 teknium1 closed this May 23, 2026
@pnascimento9596

Hi, completely agreed on scope; that bundling was my error. I'll come back with a B1.2-only PR (queue + debounce, no B1.3, no cron) and split the rest into focused follow-ups. Thanks for the detailed review.

@pnascimento9596

Opened #31235 as the focused B1.2-only follow-up — queue + debounce for active-session text follow-ups. No B1.3, no cron changes, no config aliases, reduced adapter state to one dataclass dict.
