perf(gateway): reduce Telegram end-to-end response latency#10388
Closed
wilsen0 wants to merge 3 commits into
Closed
perf(gateway): reduce Telegram end-to-end response latency#10388wilsen0 wants to merge 3 commits into
wilsen0 wants to merge 3 commits into
Conversation
Downgrade third-party thinking blocks to text so reasoning context survives across turns while removing redacted payloads and stale signatures. Add regression tests for third-party thinking conversion and keep z.ai preserved-thinking behavior server-driven by removing explicit clear_thinking injection.
teknium1
pushed a commit
that referenced
this pull request
May 11, 2026
…eplies Re-authored against current main from PR #10388 by @wilsen0. The original branch is 3800+ commits stale and could not be cherry-picked without reverting unrelated work; this change carries only the perf intent forward. Tuning summary ============== Text-batch ingress (gateway/platforms/telegram.py): - HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS default 0.6 -> 0.3 - HERMES_TELEGRAM_TEXT_BATCH_SPLIT_DELAY_SECONDS default 2.0 -> 1.0 - Adaptive fast-path tiers in _flush_text_batch: total <= 320 cp -> min(cap, 0.18) total <= 1024 cp -> min(cap, 0.24) else -> cap A single short reply now reaches the agent in ~180ms instead of 600ms. Tier constants compose with the configured cap via min() so an operator who tightens HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS below 0.18 still wins on every tier. - _env_float_clamped helper replaces bare float(os.getenv()). Rejects NaN / Inf, applies optional min/max bounds. Used for text-batch + media-batch knobs. Prevents asyncio.sleep(NaN) crashes when an operator typos an env var. Stream cadence (gateway/config.py + stream_consumer.py): - StreamingConfig.edit_interval default 1.0s -> 0.8s - StreamingConfig.buffer_threshold default 40 -> 24 chars - DEFAULT_STREAMING_EDIT_INTERVAL / BUFFER_THRESHOLD / CURSOR are now a single source of truth. StreamConsumerConfig imports them instead of duplicating the literals; the prior dual-source drift is fixed. Tool progress (gateway/display_config.py): - Telegram default tool_progress 'all' -> 'new'. Inside Telegram's ~1 edit/s flood envelope the 'all' default would accumulate edit pressure on busy chats; 'new' shows only the leading bubble per tool batch and feels less spammy. - Slack tier_low override (tool_progress='off') is preserved. Composition with native draft streaming (#23512) ================================================ The mid-stream cadence (edit_interval, buffer_threshold) gates BOTH the draft path (send_draft) and the edit path (edit_message), so the tighter cadence helps native draft as much as edit-based. The text-batch fast-path applies before the consumer starts, so it speeds up the first-token latency on every transport. No conflict. Stale-base avoidance ==================== Re-authored from scratch rather than cherry-picked. Dropped from the original branch: - Unrelated d2f043f 'fix(anthropic): preserve third-party thinking continuity' commit - boot_md.py builtin gateway hook (unrelated) - Reverted Slack tool_progress='off' (#14663) restoration - Reverted Platform plugin discovery, MSGRAPH_WEBHOOK, YUANBAO members deletion - 2300+ lines of run.py base-skew noise Tests ===== New tests/gateway/test_telegram_text_batch_perf.py: - 7 tests for _env_float_clamped (NaN, Inf, garbage, bounds). - 4 tests for the adaptive-tier composition rules. Updated tests/gateway/test_display_config.py: - test_platform_default_when_no_user_config: 'all' -> 'new' for Telegram, with comment. - test_high_tier_platforms: split into Telegram-overrides-to-new and Discord-stays-all assertions. Closes #10388. Co-authored-by: wilsen0 <132184373+wilsen0@users.noreply.github.com>
Contributor
|
Merged via salvage PR #23587 (#23587). Your authorship preserved on the substantive commit. Re-authored against current main rather than cherry-picked — your branch was 3800+ commits behind and would have reverted unrelated work (Slack tool_progress override #14663, Platform plugin discovery, MSGRAPH_WEBHOOK + YUANBAO members). The salvage carries only the perf intent forward:
Salvage dropped your unrelated d2f043f (anthropic thinking) commit and a boot_md.py builtin gateway hook that wasn't in scope. Added 11 regression tests for the env helper + tier composition. Composes cleanly with the native draft transport (#23512). Thanks for the perf work — short Telegram replies feel noticeably snappier now. |
rmulligan
pushed a commit
to rmulligan/hermes-agent
that referenced
this pull request
May 11, 2026
…eplies Re-authored against current main from PR NousResearch#10388 by @wilsen0. The original branch is 3800+ commits stale and could not be cherry-picked without reverting unrelated work; this change carries only the perf intent forward. Tuning summary ============== Text-batch ingress (gateway/platforms/telegram.py): - HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS default 0.6 -> 0.3 - HERMES_TELEGRAM_TEXT_BATCH_SPLIT_DELAY_SECONDS default 2.0 -> 1.0 - Adaptive fast-path tiers in _flush_text_batch: total <= 320 cp -> min(cap, 0.18) total <= 1024 cp -> min(cap, 0.24) else -> cap A single short reply now reaches the agent in ~180ms instead of 600ms. Tier constants compose with the configured cap via min() so an operator who tightens HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS below 0.18 still wins on every tier. - _env_float_clamped helper replaces bare float(os.getenv()). Rejects NaN / Inf, applies optional min/max bounds. Used for text-batch + media-batch knobs. Prevents asyncio.sleep(NaN) crashes when an operator typos an env var. Stream cadence (gateway/config.py + stream_consumer.py): - StreamingConfig.edit_interval default 1.0s -> 0.8s - StreamingConfig.buffer_threshold default 40 -> 24 chars - DEFAULT_STREAMING_EDIT_INTERVAL / BUFFER_THRESHOLD / CURSOR are now a single source of truth. StreamConsumerConfig imports them instead of duplicating the literals; the prior dual-source drift is fixed. Tool progress (gateway/display_config.py): - Telegram default tool_progress 'all' -> 'new'. Inside Telegram's ~1 edit/s flood envelope the 'all' default would accumulate edit pressure on busy chats; 'new' shows only the leading bubble per tool batch and feels less spammy. - Slack tier_low override (tool_progress='off') is preserved. Composition with native draft streaming (NousResearch#23512) ================================================ The mid-stream cadence (edit_interval, buffer_threshold) gates BOTH the draft path (send_draft) and the edit path (edit_message), so the tighter cadence helps native draft as much as edit-based. The text-batch fast-path applies before the consumer starts, so it speeds up the first-token latency on every transport. No conflict. Stale-base avoidance ==================== Re-authored from scratch rather than cherry-picked. Dropped from the original branch: - Unrelated d2f043f 'fix(anthropic): preserve third-party thinking continuity' commit - boot_md.py builtin gateway hook (unrelated) - Reverted Slack tool_progress='off' (NousResearch#14663) restoration - Reverted Platform plugin discovery, MSGRAPH_WEBHOOK, YUANBAO members deletion - 2300+ lines of run.py base-skew noise Tests ===== New tests/gateway/test_telegram_text_batch_perf.py: - 7 tests for _env_float_clamped (NaN, Inf, garbage, bounds). - 4 tests for the adaptive-tier composition rules. Updated tests/gateway/test_display_config.py: - test_platform_default_when_no_user_config: 'all' -> 'new' for Telegram, with comment. - test_high_tier_platforms: split into Telegram-overrides-to-new and Discord-stays-all assertions. Closes NousResearch#10388. Co-authored-by: wilsen0 <132184373+wilsen0@users.noreply.github.com>
JinyuID
pushed a commit
to JinyuID/hermes-agent
that referenced
this pull request
May 11, 2026
…eplies Re-authored against current main from PR NousResearch#10388 by @wilsen0. The original branch is 3800+ commits stale and could not be cherry-picked without reverting unrelated work; this change carries only the perf intent forward. Tuning summary ============== Text-batch ingress (gateway/platforms/telegram.py): - HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS default 0.6 -> 0.3 - HERMES_TELEGRAM_TEXT_BATCH_SPLIT_DELAY_SECONDS default 2.0 -> 1.0 - Adaptive fast-path tiers in _flush_text_batch: total <= 320 cp -> min(cap, 0.18) total <= 1024 cp -> min(cap, 0.24) else -> cap A single short reply now reaches the agent in ~180ms instead of 600ms. Tier constants compose with the configured cap via min() so an operator who tightens HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS below 0.18 still wins on every tier. - _env_float_clamped helper replaces bare float(os.getenv()). Rejects NaN / Inf, applies optional min/max bounds. Used for text-batch + media-batch knobs. Prevents asyncio.sleep(NaN) crashes when an operator typos an env var. Stream cadence (gateway/config.py + stream_consumer.py): - StreamingConfig.edit_interval default 1.0s -> 0.8s - StreamingConfig.buffer_threshold default 40 -> 24 chars - DEFAULT_STREAMING_EDIT_INTERVAL / BUFFER_THRESHOLD / CURSOR are now a single source of truth. StreamConsumerConfig imports them instead of duplicating the literals; the prior dual-source drift is fixed. Tool progress (gateway/display_config.py): - Telegram default tool_progress 'all' -> 'new'. Inside Telegram's ~1 edit/s flood envelope the 'all' default would accumulate edit pressure on busy chats; 'new' shows only the leading bubble per tool batch and feels less spammy. - Slack tier_low override (tool_progress='off') is preserved. Composition with native draft streaming (NousResearch#23512) ================================================ The mid-stream cadence (edit_interval, buffer_threshold) gates BOTH the draft path (send_draft) and the edit path (edit_message), so the tighter cadence helps native draft as much as edit-based. The text-batch fast-path applies before the consumer starts, so it speeds up the first-token latency on every transport. No conflict. Stale-base avoidance ==================== Re-authored from scratch rather than cherry-picked. Dropped from the original branch: - Unrelated d2f043f 'fix(anthropic): preserve third-party thinking continuity' commit - boot_md.py builtin gateway hook (unrelated) - Reverted Slack tool_progress='off' (NousResearch#14663) restoration - Reverted Platform plugin discovery, MSGRAPH_WEBHOOK, YUANBAO members deletion - 2300+ lines of run.py base-skew noise Tests ===== New tests/gateway/test_telegram_text_batch_perf.py: - 7 tests for _env_float_clamped (NaN, Inf, garbage, bounds). - 4 tests for the adaptive-tier composition rules. Updated tests/gateway/test_display_config.py: - test_platform_default_when_no_user_config: 'all' -> 'new' for Telegram, with comment. - test_high_tier_platforms: split into Telegram-overrides-to-new and Discord-stays-all assertions. Closes NousResearch#10388. Co-authored-by: wilsen0 <132184373+wilsen0@users.noreply.github.com>
mxdhavgautam
pushed a commit
to mxdhavgautam/hermes-agent
that referenced
this pull request
May 13, 2026
…eplies Re-authored against current main from PR NousResearch#10388 by @wilsen0. The original branch is 3800+ commits stale and could not be cherry-picked without reverting unrelated work; this change carries only the perf intent forward. Tuning summary ============== Text-batch ingress (gateway/platforms/telegram.py): - HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS default 0.6 -> 0.3 - HERMES_TELEGRAM_TEXT_BATCH_SPLIT_DELAY_SECONDS default 2.0 -> 1.0 - Adaptive fast-path tiers in _flush_text_batch: total <= 320 cp -> min(cap, 0.18) total <= 1024 cp -> min(cap, 0.24) else -> cap A single short reply now reaches the agent in ~180ms instead of 600ms. Tier constants compose with the configured cap via min() so an operator who tightens HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS below 0.18 still wins on every tier. - _env_float_clamped helper replaces bare float(os.getenv()). Rejects NaN / Inf, applies optional min/max bounds. Used for text-batch + media-batch knobs. Prevents asyncio.sleep(NaN) crashes when an operator typos an env var. Stream cadence (gateway/config.py + stream_consumer.py): - StreamingConfig.edit_interval default 1.0s -> 0.8s - StreamingConfig.buffer_threshold default 40 -> 24 chars - DEFAULT_STREAMING_EDIT_INTERVAL / BUFFER_THRESHOLD / CURSOR are now a single source of truth. StreamConsumerConfig imports them instead of duplicating the literals; the prior dual-source drift is fixed. Tool progress (gateway/display_config.py): - Telegram default tool_progress 'all' -> 'new'. Inside Telegram's ~1 edit/s flood envelope the 'all' default would accumulate edit pressure on busy chats; 'new' shows only the leading bubble per tool batch and feels less spammy. - Slack tier_low override (tool_progress='off') is preserved. Composition with native draft streaming (NousResearch#23512) ================================================ The mid-stream cadence (edit_interval, buffer_threshold) gates BOTH the draft path (send_draft) and the edit path (edit_message), so the tighter cadence helps native draft as much as edit-based. The text-batch fast-path applies before the consumer starts, so it speeds up the first-token latency on every transport. No conflict. Stale-base avoidance ==================== Re-authored from scratch rather than cherry-picked. Dropped from the original branch: - Unrelated d2f043f 'fix(anthropic): preserve third-party thinking continuity' commit - boot_md.py builtin gateway hook (unrelated) - Reverted Slack tool_progress='off' (NousResearch#14663) restoration - Reverted Platform plugin discovery, MSGRAPH_WEBHOOK, YUANBAO members deletion - 2300+ lines of run.py base-skew noise Tests ===== New tests/gateway/test_telegram_text_batch_perf.py: - 7 tests for _env_float_clamped (NaN, Inf, garbage, bounds). - 4 tests for the adaptive-tier composition rules. Updated tests/gateway/test_display_config.py: - test_platform_default_when_no_user_config: 'all' -> 'new' for Telegram, with comment. - test_high_tier_platforms: split into Telegram-overrides-to-new and Discord-stays-all assertions. Closes NousResearch#10388. Co-authored-by: wilsen0 <132184373+wilsen0@users.noreply.github.com> (cherry picked from commit ac95b8c)
02356abc
pushed a commit
to 02356abc/hermes-agent
that referenced
this pull request
May 14, 2026
…eplies Re-authored against current main from PR NousResearch#10388 by @wilsen0. The original branch is 3800+ commits stale and could not be cherry-picked without reverting unrelated work; this change carries only the perf intent forward. Tuning summary ============== Text-batch ingress (gateway/platforms/telegram.py): - HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS default 0.6 -> 0.3 - HERMES_TELEGRAM_TEXT_BATCH_SPLIT_DELAY_SECONDS default 2.0 -> 1.0 - Adaptive fast-path tiers in _flush_text_batch: total <= 320 cp -> min(cap, 0.18) total <= 1024 cp -> min(cap, 0.24) else -> cap A single short reply now reaches the agent in ~180ms instead of 600ms. Tier constants compose with the configured cap via min() so an operator who tightens HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS below 0.18 still wins on every tier. - _env_float_clamped helper replaces bare float(os.getenv()). Rejects NaN / Inf, applies optional min/max bounds. Used for text-batch + media-batch knobs. Prevents asyncio.sleep(NaN) crashes when an operator typos an env var. Stream cadence (gateway/config.py + stream_consumer.py): - StreamingConfig.edit_interval default 1.0s -> 0.8s - StreamingConfig.buffer_threshold default 40 -> 24 chars - DEFAULT_STREAMING_EDIT_INTERVAL / BUFFER_THRESHOLD / CURSOR are now a single source of truth. StreamConsumerConfig imports them instead of duplicating the literals; the prior dual-source drift is fixed. Tool progress (gateway/display_config.py): - Telegram default tool_progress 'all' -> 'new'. Inside Telegram's ~1 edit/s flood envelope the 'all' default would accumulate edit pressure on busy chats; 'new' shows only the leading bubble per tool batch and feels less spammy. - Slack tier_low override (tool_progress='off') is preserved. Composition with native draft streaming (NousResearch#23512) ================================================ The mid-stream cadence (edit_interval, buffer_threshold) gates BOTH the draft path (send_draft) and the edit path (edit_message), so the tighter cadence helps native draft as much as edit-based. The text-batch fast-path applies before the consumer starts, so it speeds up the first-token latency on every transport. No conflict. Stale-base avoidance ==================== Re-authored from scratch rather than cherry-picked. Dropped from the original branch: - Unrelated d2f043f 'fix(anthropic): preserve third-party thinking continuity' commit - boot_md.py builtin gateway hook (unrelated) - Reverted Slack tool_progress='off' (NousResearch#14663) restoration - Reverted Platform plugin discovery, MSGRAPH_WEBHOOK, YUANBAO members deletion - 2300+ lines of run.py base-skew noise Tests ===== New tests/gateway/test_telegram_text_batch_perf.py: - 7 tests for _env_float_clamped (NaN, Inf, garbage, bounds). - 4 tests for the adaptive-tier composition rules. Updated tests/gateway/test_display_config.py: - test_platform_default_when_no_user_config: 'all' -> 'new' for Telegram, with comment. - test_high_tier_platforms: split into Telegram-overrides-to-new and Discord-stays-all assertions. Closes NousResearch#10388. Co-authored-by: wilsen0 <132184373+wilsen0@users.noreply.github.com>
jsboige
pushed a commit
to jsboige/hermes-agent
that referenced
this pull request
May 14, 2026
…eplies Re-authored against current main from PR NousResearch#10388 by @wilsen0. The original branch is 3800+ commits stale and could not be cherry-picked without reverting unrelated work; this change carries only the perf intent forward. Tuning summary ============== Text-batch ingress (gateway/platforms/telegram.py): - HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS default 0.6 -> 0.3 - HERMES_TELEGRAM_TEXT_BATCH_SPLIT_DELAY_SECONDS default 2.0 -> 1.0 - Adaptive fast-path tiers in _flush_text_batch: total <= 320 cp -> min(cap, 0.18) total <= 1024 cp -> min(cap, 0.24) else -> cap A single short reply now reaches the agent in ~180ms instead of 600ms. Tier constants compose with the configured cap via min() so an operator who tightens HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS below 0.18 still wins on every tier. - _env_float_clamped helper replaces bare float(os.getenv()). Rejects NaN / Inf, applies optional min/max bounds. Used for text-batch + media-batch knobs. Prevents asyncio.sleep(NaN) crashes when an operator typos an env var. Stream cadence (gateway/config.py + stream_consumer.py): - StreamingConfig.edit_interval default 1.0s -> 0.8s - StreamingConfig.buffer_threshold default 40 -> 24 chars - DEFAULT_STREAMING_EDIT_INTERVAL / BUFFER_THRESHOLD / CURSOR are now a single source of truth. StreamConsumerConfig imports them instead of duplicating the literals; the prior dual-source drift is fixed. Tool progress (gateway/display_config.py): - Telegram default tool_progress 'all' -> 'new'. Inside Telegram's ~1 edit/s flood envelope the 'all' default would accumulate edit pressure on busy chats; 'new' shows only the leading bubble per tool batch and feels less spammy. - Slack tier_low override (tool_progress='off') is preserved. Composition with native draft streaming (NousResearch#23512) ================================================ The mid-stream cadence (edit_interval, buffer_threshold) gates BOTH the draft path (send_draft) and the edit path (edit_message), so the tighter cadence helps native draft as much as edit-based. The text-batch fast-path applies before the consumer starts, so it speeds up the first-token latency on every transport. No conflict. Stale-base avoidance ==================== Re-authored from scratch rather than cherry-picked. Dropped from the original branch: - Unrelated d2f043f 'fix(anthropic): preserve third-party thinking continuity' commit - boot_md.py builtin gateway hook (unrelated) - Reverted Slack tool_progress='off' (NousResearch#14663) restoration - Reverted Platform plugin discovery, MSGRAPH_WEBHOOK, YUANBAO members deletion - 2300+ lines of run.py base-skew noise Tests ===== New tests/gateway/test_telegram_text_batch_perf.py: - 7 tests for _env_float_clamped (NaN, Inf, garbage, bounds). - 4 tests for the adaptive-tier composition rules. Updated tests/gateway/test_display_config.py: - test_platform_default_when_no_user_config: 'all' -> 'new' for Telegram, with comment. - test_high_tier_platforms: split into Telegram-overrides-to-new and Discord-stays-all assertions. Closes NousResearch#10388. Co-authored-by: wilsen0 <132184373+wilsen0@users.noreply.github.com>
AlexFoxD
pushed a commit
to AlexFoxD/hermes-agent
that referenced
this pull request
May 21, 2026
…eplies Re-authored against current main from PR NousResearch#10388 by @wilsen0. The original branch is 3800+ commits stale and could not be cherry-picked without reverting unrelated work; this change carries only the perf intent forward. Tuning summary ============== Text-batch ingress (gateway/platforms/telegram.py): - HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS default 0.6 -> 0.3 - HERMES_TELEGRAM_TEXT_BATCH_SPLIT_DELAY_SECONDS default 2.0 -> 1.0 - Adaptive fast-path tiers in _flush_text_batch: total <= 320 cp -> min(cap, 0.18) total <= 1024 cp -> min(cap, 0.24) else -> cap A single short reply now reaches the agent in ~180ms instead of 600ms. Tier constants compose with the configured cap via min() so an operator who tightens HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS below 0.18 still wins on every tier. - _env_float_clamped helper replaces bare float(os.getenv()). Rejects NaN / Inf, applies optional min/max bounds. Used for text-batch + media-batch knobs. Prevents asyncio.sleep(NaN) crashes when an operator typos an env var. Stream cadence (gateway/config.py + stream_consumer.py): - StreamingConfig.edit_interval default 1.0s -> 0.8s - StreamingConfig.buffer_threshold default 40 -> 24 chars - DEFAULT_STREAMING_EDIT_INTERVAL / BUFFER_THRESHOLD / CURSOR are now a single source of truth. StreamConsumerConfig imports them instead of duplicating the literals; the prior dual-source drift is fixed. Tool progress (gateway/display_config.py): - Telegram default tool_progress 'all' -> 'new'. Inside Telegram's ~1 edit/s flood envelope the 'all' default would accumulate edit pressure on busy chats; 'new' shows only the leading bubble per tool batch and feels less spammy. - Slack tier_low override (tool_progress='off') is preserved. Composition with native draft streaming (NousResearch#23512) ================================================ The mid-stream cadence (edit_interval, buffer_threshold) gates BOTH the draft path (send_draft) and the edit path (edit_message), so the tighter cadence helps native draft as much as edit-based. The text-batch fast-path applies before the consumer starts, so it speeds up the first-token latency on every transport. No conflict. Stale-base avoidance ==================== Re-authored from scratch rather than cherry-picked. Dropped from the original branch: - Unrelated d2f043f 'fix(anthropic): preserve third-party thinking continuity' commit - boot_md.py builtin gateway hook (unrelated) - Reverted Slack tool_progress='off' (NousResearch#14663) restoration - Reverted Platform plugin discovery, MSGRAPH_WEBHOOK, YUANBAO members deletion - 2300+ lines of run.py base-skew noise Tests ===== New tests/gateway/test_telegram_text_batch_perf.py: - 7 tests for _env_float_clamped (NaN, Inf, garbage, bounds). - 4 tests for the adaptive-tier composition rules. Updated tests/gateway/test_display_config.py: - test_platform_default_when_no_user_config: 'all' -> 'new' for Telegram, with comment. - test_high_tier_platforms: split into Telegram-overrides-to-new and Discord-stays-all assertions. Closes NousResearch#10388. Co-authored-by: wilsen0 <132184373+wilsen0@users.noreply.github.com>
gweeteve
pushed a commit
to gweeteve/hermes-agent
that referenced
this pull request
Jun 2, 2026
…eplies Re-authored against current main from PR NousResearch#10388 by @wilsen0. The original branch is 3800+ commits stale and could not be cherry-picked without reverting unrelated work; this change carries only the perf intent forward. Tuning summary ============== Text-batch ingress (gateway/platforms/telegram.py): - HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS default 0.6 -> 0.3 - HERMES_TELEGRAM_TEXT_BATCH_SPLIT_DELAY_SECONDS default 2.0 -> 1.0 - Adaptive fast-path tiers in _flush_text_batch: total <= 320 cp -> min(cap, 0.18) total <= 1024 cp -> min(cap, 0.24) else -> cap A single short reply now reaches the agent in ~180ms instead of 600ms. Tier constants compose with the configured cap via min() so an operator who tightens HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS below 0.18 still wins on every tier. - _env_float_clamped helper replaces bare float(os.getenv()). Rejects NaN / Inf, applies optional min/max bounds. Used for text-batch + media-batch knobs. Prevents asyncio.sleep(NaN) crashes when an operator typos an env var. Stream cadence (gateway/config.py + stream_consumer.py): - StreamingConfig.edit_interval default 1.0s -> 0.8s - StreamingConfig.buffer_threshold default 40 -> 24 chars - DEFAULT_STREAMING_EDIT_INTERVAL / BUFFER_THRESHOLD / CURSOR are now a single source of truth. StreamConsumerConfig imports them instead of duplicating the literals; the prior dual-source drift is fixed. Tool progress (gateway/display_config.py): - Telegram default tool_progress 'all' -> 'new'. Inside Telegram's ~1 edit/s flood envelope the 'all' default would accumulate edit pressure on busy chats; 'new' shows only the leading bubble per tool batch and feels less spammy. - Slack tier_low override (tool_progress='off') is preserved. Composition with native draft streaming (NousResearch#23512) ================================================ The mid-stream cadence (edit_interval, buffer_threshold) gates BOTH the draft path (send_draft) and the edit path (edit_message), so the tighter cadence helps native draft as much as edit-based. The text-batch fast-path applies before the consumer starts, so it speeds up the first-token latency on every transport. No conflict. Stale-base avoidance ==================== Re-authored from scratch rather than cherry-picked. Dropped from the original branch: - Unrelated d2f043f 'fix(anthropic): preserve third-party thinking continuity' commit - boot_md.py builtin gateway hook (unrelated) - Reverted Slack tool_progress='off' (NousResearch#14663) restoration - Reverted Platform plugin discovery, MSGRAPH_WEBHOOK, YUANBAO members deletion - 2300+ lines of run.py base-skew noise Tests ===== New tests/gateway/test_telegram_text_batch_perf.py: - 7 tests for _env_float_clamped (NaN, Inf, garbage, bounds). - 4 tests for the adaptive-tier composition rules. Updated tests/gateway/test_display_config.py: - test_platform_default_when_no_user_config: 'all' -> 'new' for Telegram, with comment. - test_high_tier_platforms: split into Telegram-overrides-to-new and Discord-stays-all assertions. Closes NousResearch#10388. Co-authored-by: wilsen0 <132184373+wilsen0@users.noreply.github.com>
1 task
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
edit_interval1.0s → 0.8s,buffer_threshold40 → 24 chars — balanced against Telegram's ~1 edit/s flood limit.tool_progressfrom"all"to"new"to reduce edit pressure and flood-control risk._env_float_clamped()withmath.isfinite()guard to reject NaN/Inf at the source.DEFAULT_STREAMING_EDIT_INTERVAL/DEFAULT_STREAMING_BUFFER_THRESHOLD/DEFAULT_STREAMING_CURSORingateway/config.py);StreamConsumerConfigimports from there, eliminating dual-definition drift risk._is_gateway_streaming_enabled()helper for consistent platform-streaming resolution across both gateway code paths, with properTrue/False/Noneoverride semantics.already_sentdedup logic to account forresponse_previewedand gate stream-consumeralready_sentbehind effective streaming state, preventing false suppression of final responses.Test plan
tests/gateway/test_display_config.py— platform default assertions updated for Telegramtool_progress: "new"tests/gateway/test_proxy_mode.py— addedTestGatewayStreamingResolverunit tests (5 cases)tests/gateway/test_run_progress_topics.py— verified no regression in already_sent / previewed flowtests/gateway/test_text_batching.py— adaptive delay logic unchanged for non-Telegram platformstests/gateway/test_stream_consumer.py— streaming consumer defaults now flow from unified constantstests/gateway/test_duplicate_reply_suppression.py— dedup logic verified