perf(gateway): tune Telegram cadence + adaptive fast-path for short replies (salvage of #10388) #23587

Merged
teknium1 merged 3 commits into main from salvage/pr-10388 on May 11, 2026

@teknium1

Contributor

Summary

Telegram replies feel sluggish on short messages — the user types something brief, the bot responds, and there's a noticeable ~0.6-2s wait before the first token shows. This PR tunes the cadence end-to-end:

  • Short replies stream in ~180ms (down from ~600ms) via an adaptive text-batch fast-path.
  • Mid-stream edits cadence at 0.8s with a 24-char trigger (down from 1.0s/40 chars), still inside Telegram's ~1 edit/s flood envelope.
  • Telegram's tool_progress default flips from "all" to "new" so tool-batch bubbles don't compound edit pressure on busy chats.

Composes cleanly with the native draft transport (#23512): the cadence gates both send_draft and edit_message paths, and the text-batch fast-path runs before the consumer starts, so first-token latency drops on every transport.

How the fast-path works

incoming text batch → _flush_text_batch
  total ≤ 320 cp?  → min(cap, 0.18s)   ← fast tier (typed messages)
  total ≤ 1024 cp? → min(cap, 0.24s)   ← short tier (one paragraph)
  else             → cap                ← long
  last_chunk ≥ 4000? → split delay      ← continuation almost certain

The tiers compose with the configured cap via min(), so an operator who tightens HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS below 0.18 still gets the lower number on every tier.
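
The tier selection above can be sketched as a small pure function. This is a minimal illustration under stated assumptions, not the code in gateway/platforms/telegram.py: the function name, the constant names, and the choice to check the 4000-code-point split rule first are all hypothetical, and "cp" is read here as code points. The numeric values come from the diagram and the new defaults listed in this PR.

```python
# Hypothetical names; only the numeric values come from the PR description.
FAST_TIER_MAX_CP = 320      # typed messages
SHORT_TIER_MAX_CP = 1024    # roughly one paragraph
FAST_TIER_DELAY = 0.18      # seconds
SHORT_TIER_DELAY = 0.24     # seconds
SPLIT_THRESHOLD_CP = 4000   # a chunk this long suggests a continuation follows


def pick_batch_delay(total_cp: int, last_chunk_cp: int,
                     cap: float = 0.3, split_delay: float = 1.0) -> float:
    """Return how many seconds to wait before flushing a text batch."""
    # Assumption: the split rule takes precedence over the size tiers.
    if last_chunk_cp >= SPLIT_THRESHOLD_CP:
        return split_delay
    # min() lets an operator-configured cap below the tier delay win.
    if total_cp <= FAST_TIER_MAX_CP:
        return min(cap, FAST_TIER_DELAY)
    if total_cp <= SHORT_TIER_MAX_CP:
        return min(cap, SHORT_TIER_DELAY)
    return cap
```

With the default cap of 0.3s, a 12-code-point message waits 0.18s, while an operator who sets the cap to 0.1s gets 0.1s on every tier because min() picks the smaller value.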

Changes

| File | What |
| --- | --- |
| gateway/platforms/telegram.py | Adaptive text-batch fast-path; lower defaults (0.6→0.3s ingress, 2.0→1.0s split); new _env_float_clamped helper rejects NaN/Inf and applies min/max bounds. |
| gateway/config.py | DEFAULT_STREAMING_EDIT_INTERVAL=0.8, DEFAULT_STREAMING_BUFFER_THRESHOLD=24, DEFAULT_STREAMING_CURSOR=" ▉" as a single source of truth. StreamingConfig defaults reference them. |
| gateway/stream_consumer.py | StreamConsumerConfig imports the same constants. Fixes the prior dual-default drift between gateway/config.py and stream_consumer.py. |
| gateway/display_config.py | Telegram _PLATFORM_DEFAULTS keeps _TIER_HIGH membership but overrides tool_progress to "new". Slack's tool_progress="off" (#14663) preserved. |
| tests/gateway/test_telegram_text_batch_perf.py | New file. 7 tests for _env_float_clamped + 4 tests for adaptive-tier composition rules. |
| tests/gateway/test_display_config.py | Updated assertions for Telegram's new tool_progress="new" default. |
| scripts/release.py | AUTHOR_MAP entry for wilsen0. |
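
For readers unfamiliar with the failure mode the helper guards against: float("nan") and float("inf") both parse successfully, so a bare float(os.getenv(...)) can hand a non-finite value to asyncio.sleep(). The sketch below shows one way such a helper could look. The signature, parameter names, and the choice to fall back to the default on rejected input are assumptions for illustration; the actual _env_float_clamped in the PR may differ.

```python
import math
import os
from typing import Optional


def env_float_clamped(name: str, default: float,
                      min_value: Optional[float] = None,
                      max_value: Optional[float] = None) -> float:
    """Read a float from the environment, never returning NaN or Inf."""
    raw = os.getenv(name)
    if raw is None:
        return default            # variable unset
    try:
        value = float(raw)
    except ValueError:
        return default            # garbage such as "0.3s"
    if not math.isfinite(value):
        return default            # "nan", "inf", "-inf" parse but are unusable
    # Assumption: out-of-range values are clamped rather than rejected.
    if min_value is not None:
        value = max(value, min_value)
    if max_value is not None:
        value = min(value, max_value)
    return value
```

A call such as env_float_clamped("HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS", 0.3, min_value=0.0) would then always yield a finite, non-negative delay; the bounds shown here are illustrative, not the ones used in the PR.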

Salvage notes

PR #10388 by @wilsen0 was branched 3800+ commits behind current main and could not be cherry-picked. Re-authored against current main with @wilsen0's authorship preserved on the substantive commit. Dropped from the original branch:

  • Unrelated d2f043f9c "fix(anthropic): preserve third-party thinking continuity" commit
  • boot_md.py builtin gateway hook (unrelated)
  • Reverted Slack tool_progress="off" override (#14663) restoration
  • Reverted Platform plugin discovery + MSGRAPH_WEBHOOK + YUANBAO member deletions
  • 2300+ lines of run.py base-skew noise

Validation

tests/gateway/test_telegram_text_batch_perf.py + test_display_config.py + test_stream_consumer*.py + test_telegram_format.py: 248/248 pass.

Closes #10388.
Authored by @wilsen0; design re-authored against current main.

wilsen0 and others added 3 commits May 10, 2026 22:18
…eplies

Re-authored against current main from PR #10388 by @wilsen0.  The
original branch is 3800+ commits stale and could not be cherry-picked
without reverting unrelated work; this change carries only the perf
intent forward.

Tuning summary
==============

Text-batch ingress (gateway/platforms/telegram.py):
  - HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS default 0.6 -> 0.3
  - HERMES_TELEGRAM_TEXT_BATCH_SPLIT_DELAY_SECONDS default 2.0 -> 1.0
  - Adaptive fast-path tiers in _flush_text_batch:
      total <= 320 cp -> min(cap, 0.18)
      total <= 1024 cp -> min(cap, 0.24)
      else            -> cap
    A single short reply now reaches the agent in ~180ms instead of
    600ms.  Tier constants compose with the configured cap via min()
    so an operator who tightens HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS
    below 0.18 still wins on every tier.
  - _env_float_clamped helper replaces bare float(os.getenv()).
    Rejects NaN / Inf, applies optional min/max bounds.  Used for
    text-batch + media-batch knobs.  Prevents asyncio.sleep(NaN)
    crashes when an operator typos an env var.

Stream cadence (gateway/config.py + stream_consumer.py):
  - StreamingConfig.edit_interval default 1.0s -> 0.8s
  - StreamingConfig.buffer_threshold default 40 -> 24 chars
  - DEFAULT_STREAMING_EDIT_INTERVAL / BUFFER_THRESHOLD / CURSOR are now
    a single source of truth.  StreamConsumerConfig imports them
    instead of duplicating the literals; the prior dual-source drift
    is fixed.
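
The single-source-of-truth pattern described above can be illustrated as follows. This is a simplified sketch in one file, assuming dataclass-style configs; in the PR the constants live in gateway/config.py and are imported by gateway/stream_consumer.py. The constant names and values, and the edit_interval and buffer_threshold fields, come from the PR text; the cursor field name and the StreamConsumerConfig field names are assumptions.

```python
from dataclasses import dataclass

# Defined once; every config class below refers to these names.
DEFAULT_STREAMING_EDIT_INTERVAL = 0.8      # seconds between mid-stream edits
DEFAULT_STREAMING_BUFFER_THRESHOLD = 24    # buffered chars that trigger an edit
DEFAULT_STREAMING_CURSOR = " ▉"


@dataclass
class StreamingConfig:
    edit_interval: float = DEFAULT_STREAMING_EDIT_INTERVAL
    buffer_threshold: int = DEFAULT_STREAMING_BUFFER_THRESHOLD
    cursor: str = DEFAULT_STREAMING_CURSOR  # field name is an assumption


@dataclass
class StreamConsumerConfig:
    # Field names assumed to mirror StreamingConfig. Referencing the shared
    # constants instead of repeating literals keeps the defaults in sync.
    edit_interval: float = DEFAULT_STREAMING_EDIT_INTERVAL
    buffer_threshold: int = DEFAULT_STREAMING_BUFFER_THRESHOLD
    cursor: str = DEFAULT_STREAMING_CURSOR
```

Changing a default then means editing one constant, and the two config classes cannot drift apart the way duplicated literals can.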

Tool progress (gateway/display_config.py):
  - Telegram default tool_progress 'all' -> 'new'.  Inside
    Telegram's ~1 edit/s flood envelope the 'all' default would
    accumulate edit pressure on busy chats; 'new' shows only the
    leading bubble per tool batch and feels less spammy.
  - Slack tier_low override (tool_progress='off') is preserved.

Composition with native draft streaming (#23512)
================================================

The mid-stream cadence (edit_interval, buffer_threshold) gates BOTH
the draft path (send_draft) and the edit path (edit_message), so the
tighter cadence helps native draft as much as edit-based.  The
text-batch fast-path applies before the consumer starts, so it speeds
up the first-token latency on every transport.  No conflict.

Stale-base avoidance
====================

Re-authored from scratch rather than cherry-picked.  Dropped from the
original branch:
  - Unrelated d2f043f 'fix(anthropic): preserve third-party thinking
    continuity' commit
  - boot_md.py builtin gateway hook (unrelated)
  - Reverted Slack tool_progress='off' (#14663) restoration
  - Reverted Platform plugin discovery, MSGRAPH_WEBHOOK, YUANBAO
    members deletion
  - 2300+ lines of run.py base-skew noise

Tests
=====

New tests/gateway/test_telegram_text_batch_perf.py:
  - 7 tests for _env_float_clamped (NaN, Inf, garbage, bounds).
  - 4 tests for the adaptive-tier composition rules.

Updated tests/gateway/test_display_config.py:
  - test_platform_default_when_no_user_config: 'all' -> 'new' for
    Telegram, with comment.
  - test_high_tier_platforms: split into Telegram-overrides-to-new
    and Discord-stays-all assertions.

Closes #10388.

Co-authored-by: wilsen0 <132184373+wilsen0@users.noreply.github.com>
- New tests/gateway/test_telegram_text_batch_perf.py:
  TestEnvFloatClamped — 7 tests covering default-when-unset, valid
  parse, garbage fallback, NaN rejection, Inf rejection, min-clamp,
  max-clamp.  Asserts asyncio.sleep() always gets a finite number.

  TestAdaptiveTextBatchTiers — 4 tests covering the tier-constant
  invariants and the min(cap, tier_delay) composition rule.

- tests/gateway/test_display_config.py: update assertions for
  Telegram's new tool_progress='new' default.
@github-actions

Contributor

🔎 Lint report: salvage/pr-10388 vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 8147 on HEAD, 8146 on base (🆕 +1)

🆕 New issues (1):

Rule Count
unresolved-import 1
First entries
tests/gateway/test_telegram_text_batch_perf.py:19: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`

✅ Fixed issues: none

Unchanged: 4280 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@teknium1 teknium1 merged commit 482d49c into main May 11, 2026
13 of 16 checks passed
@teknium1 teknium1 deleted the salvage/pr-10388 branch May 11, 2026 05:22
@alt-glitch alt-glitch added the labels type/perf (Performance improvement or optimization), comp/gateway (Gateway runner, session dispatch, delivery), platform/telegram (Telegram bot adapter), and P2 (Medium: degraded but workaround exists) May 11, 2026
@wilsen0

wilsen0 commented May 11, 2026

Contributor

Got it!
