perf(gateway): tune Telegram cadence + adaptive fast-path for short replies (salvage of #10388) by teknium1 · Pull Request #23587 · NousResearch/hermes-agent

teknium1 · 2026-05-11T05:18:55Z

Summary

Telegram replies feel sluggish on short messages — the user types something brief, the bot responds, and there's a noticeable ~0.6-2s wait before the first token shows. This PR tunes the cadence end-to-end:

- Short replies stream in ~180ms (down from ~600ms) via an adaptive text-batch fast-path.
- Mid-stream edits run at a 0.8s cadence with a 24-char trigger (down from 1.0s/40 chars), still inside Telegram's ~1 edit/s flood envelope.
- Telegram's tool_progress default flips from all → new so tool-batch bubbles don't compound edit pressure on busy chats.

Composes cleanly with the native draft transport (#23512): the cadence gates both send_draft and edit_message paths, and the text-batch fast-path runs before the consumer starts, so first-token latency drops on every transport.

How the fast-path works

incoming text batch → _flush_text_batch
  total ≤ 320 cp?  → min(cap, 0.18s)   ← fast tier (typed messages)
  total ≤ 1024 cp? → min(cap, 0.24s)   ← short tier (one paragraph)
  else             → cap                ← long
  last_chunk ≥ 4000? → split delay      ← continuation almost certain

The tiers compose with the configured cap via min(), so an operator who tightens HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS below 0.18 still gets the lower number on every tier.
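
As an illustration of the tier table above, here is a minimal sketch of the delay selection. The function name `pick_flush_delay`, its parameters, and the constant names are hypothetical; the actual logic lives in `_flush_text_batch`, whose signature is not shown in this PR description. The thresholds and the 0.3s / 1.0s defaults are the values quoted in this PR.

```python
# Hypothetical names; only the numeric values come from the PR description.
FAST_TIER_MAX_CP = 320      # typed messages
SHORT_TIER_MAX_CP = 1024    # roughly one paragraph
FAST_TIER_DELAY = 0.18      # seconds
SHORT_TIER_DELAY = 0.24     # seconds
SPLIT_CHUNK_MIN_CP = 4000   # a chunk this long suggests a continuation follows


def pick_flush_delay(chunks, cap=0.3, split_delay=1.0):
    """Return how long to wait before flushing a batch of text chunks.

    len() on a Python str counts code points, matching the "cp" unit above.
    """
    total = sum(len(chunk) for chunk in chunks)
    last = len(chunks[-1]) if chunks else 0

    # A near-limit final chunk implies total > 1024, so checking it first
    # is equivalent to applying it as an override on the long tier.
    if last >= SPLIT_CHUNK_MIN_CP:
        return split_delay
    if total <= FAST_TIER_MAX_CP:
        return min(cap, FAST_TIER_DELAY)
    if total <= SHORT_TIER_MAX_CP:
        return min(cap, SHORT_TIER_DELAY)
    return cap
```

Because each tier goes through `min()`, a configured cap of, say, 0.1s applies to the fast and short tiers as well as the long one.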

Changes

| File | What |
| --- | --- |
| `gateway/platforms/telegram.py` | Adaptive text-batch fast-path; lower defaults (0.6→0.3s ingress, 2.0→1.0s split); new `_env_float_clamped` helper rejects NaN/Inf and applies min/max bounds. |
| `gateway/config.py` | `DEFAULT_STREAMING_EDIT_INTERVAL=0.8`, `DEFAULT_STREAMING_BUFFER_THRESHOLD=24`, `DEFAULT_STREAMING_CURSOR=" ▉"` as a single source of truth. `StreamingConfig` defaults reference them. |
| `gateway/stream_consumer.py` | `StreamConsumerConfig` imports the same constants. Fixes the prior dual-default drift between gateway/config.py and stream_consumer.py. |
| `gateway/display_config.py` | Telegram `_PLATFORM_DEFAULTS` keeps `_TIER_HIGH` membership but overrides `tool_progress` to `"new"`. Slack's `tool_progress="off"` (#14663) preserved. |
| `tests/gateway/test_telegram_text_batch_perf.py` | New file. 7 tests for `_env_float_clamped` + 4 tests for adaptive-tier composition rules. |
| `tests/gateway/test_display_config.py` | Updated assertions for Telegram's new `tool_progress="new"` default. |
| `scripts/release.py` | AUTHOR_MAP entry for wilsen0. |
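
To make the `_env_float_clamped` behaviour concrete, the following is a rough sketch of a helper with the properties described here (fallback on unset or garbage values, NaN/Inf rejection, optional min/max bounds). The name `env_float_clamped` and its signature are assumptions for illustration; the real helper in `gateway/platforms/telegram.py` may differ.

```python
import math
import os


def env_float_clamped(name, default, min_value=None, max_value=None):
    """Read a float from the environment, always returning a finite number.

    Hypothetical signature: falls back to `default` when the variable is
    unset, unparsable, NaN, or infinite, then applies the optional bounds.
    """
    raw = os.getenv(name)
    if raw is None:
        return default
    try:
        value = float(raw)
    except ValueError:
        return default
    # float("nan") and float("inf") parse successfully, so reject them here.
    if not math.isfinite(value):
        return default
    if min_value is not None:
        value = max(value, min_value)
    if max_value is not None:
        value = min(value, max_value)
    return value
```

The point of the finiteness check is that the result can be handed to `asyncio.sleep()` without further validation, whatever the operator typed into the environment variable.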

Salvage notes

PR #10388 by @wilsen0 was branched 3800+ commits behind current main and could not be cherry-picked. Re-authored against current main with @wilsen0's authorship preserved on the substantive commit. Dropped from the original branch:

- Unrelated d2f043f9c "fix(anthropic): preserve third-party thinking continuity" commit
- boot_md.py builtin gateway hook (unrelated)
- Reverted Slack tool_progress="off" override restoration (#14663, "[Bug]: slack gateway is reporting intermediate tool calls")
- Reverted Platform plugin discovery + MSGRAPH_WEBHOOK + YUANBAO member deletions
- 2300+ lines of run.py base-skew noise

Validation

tests/gateway/test_telegram_text_batch_perf.py + test_display_config.py + test_stream_consumer*.py + test_telegram_format.py: 248/248 pass.

Closes #10388.
Authored by @wilsen0; design re-authored against current main.

@wilsen0

…eplies

Re-authored against current main from PR #10388 by @wilsen0. The original branch is 3800+ commits stale and could not be cherry-picked without reverting unrelated work; this change carries only the perf intent forward.

Tuning summary
==============

Text-batch ingress (gateway/platforms/telegram.py):

- HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS default 0.6 -> 0.3
- HERMES_TELEGRAM_TEXT_BATCH_SPLIT_DELAY_SECONDS default 2.0 -> 1.0
- Adaptive fast-path tiers in _flush_text_batch:
    total <= 320 cp  -> min(cap, 0.18)
    total <= 1024 cp -> min(cap, 0.24)
    else             -> cap
  A single short reply now reaches the agent in ~180ms instead of 600ms. Tier constants compose with the configured cap via min() so an operator who tightens HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS below 0.18 still wins on every tier.
- _env_float_clamped helper replaces bare float(os.getenv()). Rejects NaN / Inf, applies optional min/max bounds. Used for text-batch + media-batch knobs. Prevents asyncio.sleep(NaN) crashes when an operator typos an env var.

Stream cadence (gateway/config.py + stream_consumer.py):

- StreamingConfig.edit_interval default 1.0s -> 0.8s
- StreamingConfig.buffer_threshold default 40 -> 24 chars
- DEFAULT_STREAMING_EDIT_INTERVAL / BUFFER_THRESHOLD / CURSOR are now a single source of truth. StreamConsumerConfig imports them instead of duplicating the literals; the prior dual-source drift is fixed.

Tool progress (gateway/display_config.py):

- Telegram default tool_progress 'all' -> 'new'. Inside Telegram's ~1 edit/s flood envelope the 'all' default would accumulate edit pressure on busy chats; 'new' shows only the leading bubble per tool batch and feels less spammy.
- Slack tier_low override (tool_progress='off') is preserved.

Composition with native draft streaming (#23512)
================================================

The mid-stream cadence (edit_interval, buffer_threshold) gates BOTH the draft path (send_draft) and the edit path (edit_message), so the tighter cadence helps native draft as much as edit-based. The text-batch fast-path applies before the consumer starts, so it speeds up the first-token latency on every transport. No conflict.

Stale-base avoidance
====================

Re-authored from scratch rather than cherry-picked. Dropped from the original branch:

- Unrelated d2f043f 'fix(anthropic): preserve third-party thinking continuity' commit
- boot_md.py builtin gateway hook (unrelated)
- Reverted Slack tool_progress='off' (#14663) restoration
- Reverted Platform plugin discovery, MSGRAPH_WEBHOOK, YUANBAO members deletion
- 2300+ lines of run.py base-skew noise

Tests
=====

New tests/gateway/test_telegram_text_batch_perf.py:

- 7 tests for _env_float_clamped (NaN, Inf, garbage, bounds).
- 4 tests for the adaptive-tier composition rules.

Updated tests/gateway/test_display_config.py:

- test_platform_default_when_no_user_config: 'all' -> 'new' for Telegram, with comment.
- test_high_tier_platforms: split into Telegram-overrides-to-new and Discord-stays-all assertions.

Closes #10388.

Co-authored-by: wilsen0 <132184373+wilsen0@users.noreply.github.com>
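
The single-source-of-truth change for the streaming defaults can be pictured roughly as below. This is only a sketch: the constant names and values are taken from this PR, while the dataclass layout and the `cursor` field name are assumptions, and in the real code the two classes live in separate modules (`gateway/config.py` and `gateway/stream_consumer.py`).

```python
from dataclasses import dataclass

# Defined once (in gateway/config.py according to the PR description).
DEFAULT_STREAMING_EDIT_INTERVAL = 0.8    # seconds between mid-stream edits
DEFAULT_STREAMING_BUFFER_THRESHOLD = 24  # chars buffered before an edit
DEFAULT_STREAMING_CURSOR = " ▉"


@dataclass
class StreamingConfig:
    # Field layout is an assumption; the defaults reference the constants.
    edit_interval: float = DEFAULT_STREAMING_EDIT_INTERVAL
    buffer_threshold: int = DEFAULT_STREAMING_BUFFER_THRESHOLD
    cursor: str = DEFAULT_STREAMING_CURSOR


@dataclass
class StreamConsumerConfig:
    # Would import the same constants rather than repeating the literals,
    # so the two sets of defaults cannot drift apart.
    edit_interval: float = DEFAULT_STREAMING_EDIT_INTERVAL
    buffer_threshold: int = DEFAULT_STREAMING_BUFFER_THRESHOLD
    cursor: str = DEFAULT_STREAMING_CURSOR
```

With this shape, changing a default means editing one constant, and both config classes pick it up.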

- New tests/gateway/test_telegram_text_batch_perf.py:
  - TestEnvFloatClamped: 7 tests covering default-when-unset, valid parse, garbage fallback, NaN rejection, Inf rejection, min-clamp, max-clamp. Asserts asyncio.sleep() always gets a finite number.
  - TestAdaptiveTextBatchTiers: 4 tests covering the tier-constant invariants and the min(cap, tier_delay) composition rule.
- tests/gateway/test_display_config.py: update assertions for Telegram's new tool_progress='new' default.

github-actions · 2026-05-11T05:20:00Z

🔎 Lint report: `salvage/pr-10388` vs `origin/main`

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 8147 on HEAD, 8146 on base (🆕 +1)

🆕 New issues (1):

| Rule | Count |
| --- | --- |
| `unresolved-import` | 1 |

First entries

tests/gateway/test_telegram_text_batch_perf.py:19: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`

✅ Fixed issues: none

Unchanged: 4280 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

wilsen0 · 2026-05-11T05:50:33Z

Got it!

wilsen0 and others added 3 commits May 10, 2026 22:18

chore: AUTHOR_MAP entry for wilsen0

8e23593

teknium1 merged commit 482d49c into main May 11, 2026
13 of 16 checks passed

teknium1 deleted the salvage/pr-10388 branch May 11, 2026 05:22

teknium1 mentioned this pull request May 11, 2026

perf(gateway): reduce Telegram end-to-end response latency #10388

Closed

7 tasks

alt-glitch added the type/perf, comp/gateway, platform/telegram, and P2 labels May 11, 2026

BrewTestBot mentioned this pull request May 16, 2026

hermes-agent 2026.5.16 Homebrew/homebrew-core#283141

Merged

1 task

github-actions Bot mentioned this pull request May 17, 2026

chore: bump NousResearch/hermes-agent version from v2026.5.7 to v2026.5.16 Docker-Hub-sirmark/docker-hermes-agent#6

Merged

Qwinty mentioned this pull request May 27, 2026

[Bug]: Telegram text+photo sends can start a text-only turn before the image is attached #33270

Open

rdnot mentioned this pull request Jun 12, 2026

WhatsApp gateway: 5s default debounce introduced in PR#35391 adds 5s latency to every response #44883

Open

1 task

liuhao1024 mentioned this pull request Jun 12, 2026

fix(whatsapp): lower default debounce delays to match Telegram cadence #44896

Open

13 tasks
