feat(gateway): busy-text-mode queue + pre-output message assimilation (B1.2/B1.3)#29943
pnascimento9596 wants to merge 7 commits into
Update: Added Option B1 — a short debounce window (default 0.6s) for rapid consecutive text messages while busy.
9c6bed7 to 441fe09
0355108 to 53704db
…ueuing without interrupting current turn

- Text messages during active sessions now merge into _pending_messages without setting the interrupt event or calling running_agent.interrupt()
- Eliminates unwanted 'Interrupting current task...' acks for normal consecutive messages
- Existing drain logic provides clean cascade after current turn completes
- Configurable via display.busy_text_mode or HERMES_GATEWAY_BUSY_TEXT_MODE (queue|interrupt)
- Adds regression coverage for the new path

Fixes user friction with rapid multi-message workflows while preserving /stop and legacy interrupt behavior.
…r rapid follow-ups

- Text messages during active sessions no longer trigger "Interrupting current task..." acks (busy_text_mode=queue by default).
- Added short debounce window (default 0.6s, configurable via HERMES_GATEWAY_BUSY_TEXT_DEBOUNCE_SECONDS) so rapid consecutive messages are merged into one coherent event instead of multiple replies.
- Generalized Telegram's proven _queue_media_group_event pattern into BasePlatformAdapter.
- Respects existing busy_text_mode and steer mode.
- Added proper cleanup, task management, and regression tests.

Fixes rapid multi-message workflow friction while preserving /stop and legacy interrupt behavior.

Co-authored-by: Grok + Paulo
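The debounce behavior described in this commit can be sketched roughly as follows. This is a minimal illustration of trailing-edge debouncing with asyncio, not the code from the PR: `TextDebouncer`, `PendingText`, `push`, and the `on_flush` callback are hypothetical names, and the hard caps mentioned in the PR description are left out for brevity.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, Dict, List, Optional


@dataclass
class PendingText:
    """Hypothetical per-session buffer (not the PR's actual dataclass)."""
    parts: List[str] = field(default_factory=list)
    timer: Optional[asyncio.Task] = None


class TextDebouncer:
    """Merge texts that arrive within `window_seconds` of each other."""

    def __init__(
        self,
        window_seconds: float,
        on_flush: Callable[[str, str], Awaitable[None]],
    ) -> None:
        self.window_seconds = window_seconds
        self.on_flush = on_flush
        self._pending: Dict[str, PendingText] = {}

    def push(self, session_key: str, text: str) -> None:
        """Buffer a message and restart the session's timer.

        Must be called from inside a running event loop.
        """
        entry = self._pending.setdefault(session_key, PendingText())
        entry.parts.append(text)
        if entry.timer is not None:
            entry.timer.cancel()  # a newer message extends the window
        entry.timer = asyncio.create_task(self._flush_later(session_key))

    async def _flush_later(self, session_key: str) -> None:
        await asyncio.sleep(self.window_seconds)
        # Remove the entry before flushing so later pushes start a new batch.
        entry = self._pending.pop(session_key, None)
        if entry is not None:
            await self.on_flush(session_key, "\n".join(entry.parts))
```

With a 0.6s window, two messages sent 0.2s apart would reach `on_flush` once as a single merged string, while messages more than 0.6s apart would flush separately.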
Add pre-output assimilation: messages arriving after a turn starts internally but before any visible output are folded into the current turn instead of spawning a new one.

Components:
- Precommit state machine (running_precommit → committed)
- init_precommit_state / _try_assimilate / commit_precommit_turn
- Revision-based stale output suppression
- Bounded restarts (max 2) + 1.5s assimilation deadline
- Feature flag: HERMES_PRE_OUTPUT_ASSIMILATION_ENABLED (runtime kill switch)
- Hidden coalesced-turn signaling (gated separately)
- Unconditional entry logging for observability

Fixes session-key construction divergence between init (gateway config) and assimilate (adapter config) paths; both now use adapter-local PlatformConfig.extra for group_sessions_per_user / thread_sessions_per_user.
- Entry logs elevated to INFO for live observability
- Gate-level REJECT/ACCEPT logging in _try_assimilate
- Assimilation deadline extended 1.5s → 3.0s
- Launchd plist env var injection for HERMES_PRE_OUTPUT_ASSIMILATION_ENABLED
…atus

Move _get_assimilation_pending check outside the result.interrupted gate. Accepted assimilation state is now the authoritative restart signal; agent.interrupt() is a latency optimization, not the correctness gate.

Fixes silent text drop when agent completes before interrupt lands.
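The race fixed by this commit can be illustrated with a small before/after sketch. `TurnResult`, `should_restart_gated`, and `should_restart` are hypothetical names chosen for the illustration; only the idea of no longer gating on `result.interrupted` comes from the commit message.

```python
from dataclasses import dataclass


@dataclass
class TurnResult:
    """Hypothetical stand-in for the agent's turn result."""
    interrupted: bool
    text: str


def should_restart_gated(result: TurnResult, assimilation_pending: bool) -> bool:
    # Old shape: pending text is honored only if the interrupt actually landed.
    return result.interrupted and assimilation_pending


def should_restart(result: TurnResult, assimilation_pending: bool) -> bool:
    # New shape: accepted assimilation state alone decides the restart.
    # result.interrupted no longer participates in the decision.
    return assimilation_pending
```

When the agent finishes before the interrupt lands, `result.interrupted` is false while assimilated text is still pending. The gated version returns false and the text is silently dropped; the ungated version restarts the turn.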
_deliver_result no longer reads cron.wrap_response; the 6 load_config patches were inert. test_delivery_skips_wrapping_when_config_disabled was a duplicate of test_delivery_sends_raw_content_without_header_or_footer.
Wider window catches real human typing gaps (4-5s between related messages) without requiring sub-3s bursts.
53704db to c8c9a18
Thanks for the substantial work here, @pnascimento9596 — the design write-up is detailed and the test coverage is real. After review I'm going to pass on this one, though. A few reasons:
The underlying problem, rapid multi-message follow-ups creating separate turns, is real, but I'd want to approach it with a much smaller change set, and not in combination with the cron delivery semantics. I'll keep the issue space open in my head; if you want to do a focused B1.2-only PR (queue + debounce, no B1.3, no cron changes) I'd take another look. Appreciate the effort.
Hi - completely agreed on scope, that bundling was my error. I'll come back with a B1.2-only PR (queue + debounce, no B1.3, no cron) and split the rest into focused follow-ups. Thanks for the detailed review.
Opened #31235 as the focused B1.2-only follow-up: queue + debounce for active-session text follow-ups. No B1.3, no cron changes, no config aliases, reduced adapter state to one dataclass dict.
Problem
When users send 2–3 related messages in quick succession before seeing a reply, Hermes treats them as separate turns — answering only the first message, then handling the rest as follow-up turns.
Solution
Two complementary mechanisms covering different timing windows:
B1.2: Text debounce + silent queuing (busy_text_mode=queue)
New busy_text_mode adapter setting (default: queue) that silently queues text follow-ups instead of interrupting the active turn. Includes configurable debounce (0.35s busy / 0.45s idle with hard caps) so rapid messages arriving within the debounce window are coalesced into a single event before a turn is created.
B1.3: Pre-output assimilation
Messages arriving after a turn starts internally but before any visible output or side effects are folded into the current turn via cancel + restart. Covers the gap between debounce expiry and first model output.
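A rough sketch of how such a pre-output gate could behave, using the bounds named in the commit messages (max 2 restarts, a 3.0s deadline after the later extension). `PrecommitTurn` and its methods are hypothetical and only loosely mirror the `_try_assimilate` / `commit_precommit_turn` names from the commits; the real implementation may differ.

```python
from dataclasses import dataclass, field
from typing import List

MAX_RESTARTS = 2         # bound quoted in the commit message
DEADLINE_SECONDS = 3.0   # extended from 1.5s in a later commit


@dataclass
class PrecommitTurn:
    """Hypothetical model of one turn before its first visible output."""
    started_at: float
    state: str = "running_precommit"
    revision: int = 0
    restarts: int = 0
    texts: List[str] = field(default_factory=list)

    def try_assimilate(self, text: str, now: float) -> bool:
        """Fold a late message into this turn, or reject it."""
        if self.state != "running_precommit":
            return False  # output already visible: needs a new turn
        if now - self.started_at > DEADLINE_SECONDS:
            return False  # assimilation window has closed
        if self.restarts >= MAX_RESTARTS:
            return False  # avoid restarting forever
        self.texts.append(text)
        self.restarts += 1
        self.revision += 1  # output from older revisions is now stale
        return True

    def commit(self, revision: int) -> bool:
        """Call just before emitting output; stale revisions are suppressed."""
        if revision != self.revision:
            return False
        self.state = "committed"
        return True
```

A run that started at revision 0 and finishes after a message was assimilated would fail `commit(0)`, so its output is discarded and only the restarted run (revision 1) is allowed to reach the user.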
Key design points:
- Precommit state machine (running_precommit → committed)
- Assimilation-pending check moved outside the result.interrupted gate; closes a race where fast model responses complete before the interrupt lands
- Feature flag: HERMES_PRE_OUTPUT_ASSIMILATION_ENABLED (runtime kill switch, default off)
Also included
- _busy_text_mode propagation from GatewayRunner to adapters with RuntimeError guard
- …cron-clean-delivery skill is present (removes header/footer wrapping)
- shutdown_notification as backward-compatible alias for gateway_restart_notification
Message timing zones
Testing
- …test_active_session_text_merge.py, 38 tests)
- test_busy_session_ack.py, test_config.py, test_config_env_bridge_authority.py, test_interrupt_key_match.py, test_restart_drain.py
- …test_shutdown_forensics)
Deploy notes
- …busy_text_mode=queue)
- …HERMES_PRE_OUTPUT_ASSIMILATION_ENABLED=true in ~/.hermes/.env
- grep 'B1.3' gateway.log for init / assimilate ACCEPT|REJECT / restart events
- …false and restart to disable without rollback
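For reference, the two environment switches named in this PR could be read along these lines. This is a sketch under assumptions: the helper names are hypothetical, the accepted truthy spellings are a guess, and the precedence between display.busy_text_mode and the environment variable is not shown because the PR text does not state it.

```python
import os
from typing import Mapping

VALID_BUSY_TEXT_MODES = {"queue", "interrupt"}


def assimilation_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Kill switch for B1.3; off unless explicitly enabled."""
    raw = env.get("HERMES_PRE_OUTPUT_ASSIMILATION_ENABLED", "")
    # Accepted spellings are an assumption, not taken from the PR.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def busy_text_mode(env: Mapping[str, str] = os.environ) -> str:
    """Resolve the busy-text mode, defaulting to 'queue'."""
    mode = env.get("HERMES_GATEWAY_BUSY_TEXT_MODE", "queue").strip().lower()
    if mode not in VALID_BUSY_TEXT_MODES:
        raise ValueError(f"unsupported busy text mode: {mode!r}")
    return mode
```

Because the flag is read from the environment, flipping it to false and restarting the gateway disables assimilation without a code rollback, which matches the deploy notes above.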