fix(telegram): split long messages at word boundaries instead of mid-word by hydro13 · Pull Request #56595 · openclaw/openclaw

hydro13 · 2026-03-28T20:00:54Z

Summary

Replace proportional text estimate with binary search for the largest text prefix whose rendered Telegram HTML fits the character limit
Split at the last whitespace boundary within the verified prefix
Single words longer than the limit still hard-split (unavoidable)
Markdown formatting stays balanced across split points
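
A minimal end-to-end sketch of the approach in the summary above. It is not the code from format.ts: `renderHtml` is a hypothetical stand-in for the real renderer (it only escapes `&`, `<`, `>`), all function names are made up for illustration, and whitespace at the cut is dropped rather than preserved to keep the example short.

```typescript
// Hypothetical stand-in for the real renderer: only escapes &, <, >.
function renderHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Largest prefix length whose rendered HTML fits htmlLimit (0 if none fits).
// Correct only if rendered length never shrinks as the prefix grows.
function largestFittingPrefix(text: string, htmlLimit: number): number {
  let low = 1;
  let high = text.length;
  let best = 0;
  while (low <= high) {
    const mid = Math.floor((low + high) / 2);
    if (renderHtml(text.slice(0, mid)).length <= htmlLimit) {
      best = mid;
      low = mid + 1;
    } else {
      high = mid - 1;
    }
  }
  return best;
}

// Index of the last whitespace character before `end`, or -1 if there is none.
function lastWhitespaceBefore(text: string, end: number): number {
  for (let i = end - 1; i >= 0; i--) {
    if (/\s/.test(text.charAt(i))) return i;
  }
  return -1;
}

function splitByHtmlLimit(text: string, htmlLimit: number): string[] {
  const chunks: string[] = [];
  let rest = text;
  while (rest.length > 0) {
    if (renderHtml(rest).length <= htmlLimit) {
      chunks.push(rest);
      break;
    }
    // Always take at least one character so the loop makes progress.
    const fit = Math.max(1, largestFittingPrefix(rest, htmlLimit));
    // Look for whitespace inside the verified prefix or directly after it.
    const boundary = lastWhitespaceBefore(rest, fit + 1);
    // Cut at whitespace when one exists past index 0; otherwise hard-split.
    const cut = boundary > 0 ? boundary : fit;
    chunks.push(rest.slice(0, cut));
    // Simplification: whitespace at the cut is dropped, not preserved.
    rest = rest.slice(cut).replace(/^\s+/, "");
  }
  return chunks;
}
```

With a limit of 8, `splitByHtmlLimit("a<b hello world", 8)` cuts after `a<b` and after `hello`, while a single word such as `abcdefghij` with a limit of 4 is hard-split every 4 characters.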

Root Cause

splitTelegramChunkByHtmlLimit in extensions/telegram/src/format.ts used a proportional estimate from rendered HTML length. When HTML escaping expanded characters (e.g. < → &lt;), the estimate window was too short to reach the next whitespace, and findMarkdownIRPreservedSplitIndex fell back to a hard cut at maxEnd, which landed mid-word.
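
To make the failure mode concrete, here is a simplified model of a proportional estimate. The formula and helper names are assumptions for illustration and are not copied from format.ts. When the escaped characters sit late in the text, the ratio of text length to HTML length pulls the estimate well short of the first whitespace, even though a much longer prefix would fit.

```typescript
// Hypothetical stand-in for the real renderer: only escapes &, <, >.
function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Assumed model of the old heuristic: scale text length by limit / htmlLength.
function proportionalEstimate(text: string, htmlLimit: number): number {
  const htmlLength = escapeHtml(text).length;
  return Math.max(1, Math.floor((text.length * htmlLimit) / htmlLength));
}

// Index of the last whitespace character before `end`, or -1 if there is none.
function lastWhitespaceBefore(text: string, end: number): number {
  for (let i = end - 1; i >= 0; i--) {
    if (/\s/.test(text.charAt(i))) return i;
  }
  return -1;
}

// 21 text characters that render to 51 HTML characters.
const sample = "abcdefgh ij" + "<".repeat(10);
const limit = 12;

// floor(21 * 12 / 51) = 4, so the window is "abcd" and contains no whitespace.
const estimate = proportionalEstimate(sample, limit);
const boundaryInEstimate = lastWhitespaceBefore(sample, estimate);

// The first 11 characters render to 11 HTML characters, which fits,
// and that prefix does contain a whitespace boundary at index 8.
const verifiedPrefixFits = escapeHtml(sample.slice(0, 11)).length <= limit;
const boundaryInVerifiedPrefix = lastWhitespaceBefore(sample, 11);
```

In this model the estimate window has no whitespace to split on, so a splitter would have to cut `abcdefgh` in the middle, while searching for the largest prefix that actually fits exposes the boundary after `abcdefgh`.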

Change Type

Bug fix

Testing

41 tests pass. New regression tests for:

Word-boundary split when HTML escaping shrinks the retry window
Single long word exceeding the limit (hard split)
Formatted text splitting at word boundary with balanced <b>...</b> tags
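
The three cases above boil down to a few invariants on the resulting chunks. The helper below is a hypothetical sketch of those checks, not the test code from format.wrap-md.test.ts. It assumes a whitespace-preserving splitter (so chunks concatenate back to the input) and uses an escape-only stand-in for the renderer.

```typescript
// Hypothetical stand-in for the real renderer: only escapes &, <, >.
function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

interface ChunkReport {
  oversized: number[];         // indexes of chunks whose HTML exceeds the limit
  midWordBoundaries: number[]; // indexes i where chunk i-1 and chunk i meet mid-word
  textPreserved: boolean;      // chunks concatenate back to the original text
}

function inspectChunks(original: string, chunks: string[], htmlLimit: number): ChunkReport {
  const oversized: number[] = [];
  const midWordBoundaries: number[] = [];
  for (let i = 0; i < chunks.length; i++) {
    const chunk = chunks[i] ?? "";
    if (escapeHtml(chunk).length > htmlLimit) oversized.push(i);
    if (i > 0) {
      const previous = chunks[i - 1] ?? "";
      // A boundary is mid-word when non-whitespace touches non-whitespace.
      if (/\S$/.test(previous) && /^\S/.test(chunk)) midWordBoundaries.push(i);
    }
  }
  return { oversized, midWordBoundaries, textPreserved: chunks.join("") === original };
}
```

A word-boundary split should report no mid-word boundaries, whereas the single-long-word case is expected to report one per hard cut.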

Fixes #36644

greptile-apps · 2026-03-28T20:03:54Z

Greptile Summary

This PR replaces the proportional-estimate split heuristic in splitTelegramChunkByHtmlLimit with a binary search for the largest text prefix whose rendered Telegram HTML fits within the character limit, then delegates to the existing whitespace-boundary splitter. This correctly handles the case where HTML escaping (e.g. < → &lt;) caused the proportional window to fall short of the next whitespace, resulting in mid-word cuts.

Key changes:

New findLargestTelegramChunkTextLengthWithinHtmlLimit does a binary search over text-character prefixes, using the proportional estimate as an optimistic starting point to shrink the initial search range.
splitTelegramChunkByHtmlLimit is simplified to one call of splitMarkdownIRPreserveWhitespace using the binary-search result.
Three new regression tests cover: escaped-HTML shrinking the retry window, single words exceeding the limit (hard-split fallback), and formatting preserved across a word-boundary split.
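
One way the estimate-seeded search described above could look. This is a sketch under assumptions, not the body of findLargestTelegramChunkTextLengthWithinHtmlLimit: the seeding rule, the function name, and the escape-only `escapeHtml` renderer are illustrative, and callers are assumed to pass non-empty text that does not fit as a whole.

```typescript
// Hypothetical stand-in for the real renderer: only escapes &, <, >.
function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function findLargestPrefixWithinHtmlLimit(text: string, htmlLimit: number): number {
  if (text.length <= 1) return 1;
  const fits = (len: number): boolean =>
    escapeHtml(text.slice(0, len)).length <= htmlLimit;

  let low = 1;
  // Never return the whole text, so the caller always makes progress.
  let high = text.length - 1;
  let best = 0;

  // Seed the range with the proportional estimate: one render call
  // discards either everything below it or everything above it.
  const fullHtmlLength = escapeHtml(text).length;
  const estimate = Math.min(
    high,
    Math.max(1, Math.floor((text.length * htmlLimit) / fullHtmlLength)),
  );
  if (fits(estimate)) {
    best = estimate;
    low = estimate + 1;
  } else {
    high = estimate - 1;
  }

  while (low <= high) {
    const mid = Math.floor((low + high) / 2);
    if (fits(mid)) {
      best = mid;
      low = mid + 1;
    } else {
      high = mid - 1;
    }
  }
  // If nothing fits, still return 1 so a hard split can proceed.
  return Math.max(1, best);
}
```

For `"abcdefgh ij" + "<".repeat(10)` with a limit of 12, the estimate of 4 fits and becomes the lower bound, and the search then settles on 11.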

The implementation is correct and well-tested. The one implicit assumption is that renderTelegramChunkHtml is monotonically non-decreasing in text length; this holds for any reasonable Markdown-to-HTML renderer and is never violated in practice here.
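
The monotonicity assumption can be checked directly for a given input. The helper below is a hypothetical sketch: `firstNonMonotonePrefix` and both renderers are made up for illustration, and `danglingMarker` is a contrived renderer that exists only to show what a violation looks like.

```typescript
type Renderer = (text: string) => string;

// Returns the first prefix length whose rendered output is shorter than the
// previous prefix's output, or -1 if rendered length never decreases.
function firstNonMonotonePrefix(text: string, render: Renderer): number {
  let previous = render("").length;
  for (let len = 1; len <= text.length; len++) {
    const current = render(text.slice(0, len)).length;
    if (current < previous) return len;
    previous = current;
  }
  return -1;
}

// Escape-only renderer: each extra character adds at least one HTML character.
const escapeOnly: Renderer = (t) =>
  t.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");

// Contrived renderer: a trailing "*" is padded, so completing the token
// makes the output shorter again.
const danglingMarker: Renderer = (t) => (t.endsWith("*") ? t + "[unclosed]" : t);
```

`firstNonMonotonePrefix("a<b & c", escapeOnly)` returns -1, while the contrived renderer reports a shrink at prefix length 3 for `"a*b"`.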

Confidence Score: 5/5

Safe to merge — the binary search is correct, the proportional-estimate optimisation is sound, edge cases are handled, and 41 tests pass.

No P0 or P1 issues found. The logic is monotone-safe (more text → more HTML bytes), the best = 0 fallback terminates correctly, and the termination of splitMarkdownIRPreserveWhitespace with limit = 1 is guaranteed. All remaining observations are P2 or lower.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| extensions/telegram/src/format.ts | Adds `findLargestTelegramChunkTextLengthWithinHtmlLimit` (binary search over text-prefix lengths) and simplifies `splitTelegramChunkByHtmlLimit` to use it. Logic is correct: the proportional estimate is used as an optimistic lower bound, the search is bounded by `currentTextLength - 1` to guarantee progress, and the `Math.max(1, best)` fallback is safe. |
| extensions/telegram/src/format.wrap-md.test.ts | Adds three targeted regression tests covering the escaped-HTML window bug, single-word hard-split fallback, and bold-formatting preservation across a word-boundary split. Tests are well-structured and provide concrete assertions on both text content and HTML length invariants. |

Reviews (1): Last reviewed commit: "fix(telegram): split long messages at wo..."

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: edc3827ca8

chatgpt-codex-connector · 2026-03-28T20:04:39Z

+    }
+    high = mid - 1;


Replace binary search with non-monotonic-safe split search

The new binary search assumes renderTelegramChunkHtml(slice(..., 0, mid)).length <= htmlLimit is monotonic as mid increases, but this formatter has non-monotonic prefixes: truncated auto-linked file refs render as long <a ...> tags, while the completed ref is de-linkified/wrapped shorter (buildTelegramLink + wrapFileReferencesInHtml). In cases like README.md z with a limit between those sizes, high = mid - 1 can prune valid larger prefixes and return 1; the retry loop then emits 1-character chunks and accepts them via the chunk.text.length <= 1 escape hatch, so resulting chunk HTML can still exceed Telegram’s limit.
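
The pruning described in this comment can be reproduced with a toy renderer. Everything below is hypothetical: `renderToy` does not model buildTelegramLink or wrapFileReferencesInHtml, it only mimics the shape of the problem, where a truncated token (`README.m`) renders longer than the completed one (`README.md`). The downward linear scan is shown purely as a contrast that does not rely on monotonicity. It is not the fix proposed in this thread, and it costs one render per candidate length.

```typescript
// Toy rules: a complete "*.md" token is wrapped in <code>, while a truncated
// "*.m" token is rendered as a much longer link.
function renderWord(word: string): string {
  if (/^\w+\.md$/.test(word)) return `<code>${word}</code>`;
  if (/^\w+\.m$/.test(word)) return `<a href="http://${word}">${word}</a>`;
  return word;
}

function renderToy(text: string): string {
  return text.split(" ").map(renderWord).join(" ");
}

// Plain binary search: assumes "fits" is monotone in the prefix length.
function binarySearchPrefix(text: string, htmlLimit: number): number {
  let low = 1;
  let high = text.length;
  let best = 0;
  while (low <= high) {
    const mid = Math.floor((low + high) / 2);
    if (renderToy(text.slice(0, mid)).length <= htmlLimit) {
      best = mid;
      low = mid + 1;
    } else {
      high = mid - 1; // prunes every longer prefix, including valid ones
    }
  }
  return best;
}

// Linear scan from the longest prefix down: no monotonicity assumption.
function scanDownPrefix(text: string, htmlLimit: number): number {
  for (let len = text.length; len >= 1; len--) {
    if (renderToy(text.slice(0, len)).length <= htmlLimit) return len;
  }
  return 0;
}

const refText = "README.md abcde";
const refLimit = 25;
const viaBinarySearch = binarySearchPrefix(refText, refLimit); // stops at "README."
const viaScan = scanDownPrefix(refText, refLimit);             // reaches "README.md ab"
```

Here the first probe lands on the 8-character prefix `README.m`, which renders to 38 characters, so the search discards lengths 8 to 15 and settles on 7, even though the 12-character prefix renders to exactly 25 characters and fits.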

…word Replace proportional text estimate with binary search for the largest text prefix whose rendered Telegram HTML fits the character limit, then split at the last whitespace boundary within that verified prefix. Single words longer than the limit still hard-split (unavoidable). Markdown formatting stays balanced across split points. Fixes openclaw#36644

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 00be04e7ed

Add missing changelog entries for PRs openclaw#56500, openclaw#56540, openclaw#56555, openclaw#56567, openclaw#56573, openclaw#56587, openclaw#56595, openclaw#56612, openclaw#56620.

…word (openclaw#56595) Replace proportional text estimate with binary search for the largest text prefix whose rendered Telegram HTML fits the character limit, then split at the last whitespace boundary within that verified prefix. Single words longer than the limit still hard-split (unavoidable). Markdown formatting stays balanced across split points. Fixes openclaw#36644

openclaw-barnacle Bot added the labels channel: telegram (Channel integration: telegram), size: S, and maintainer (Maintainer-authored PR) on Mar 28, 2026

chatgpt-codex-connector Bot reviewed Mar 28, 2026

hydro13 force-pushed the fix/telegram-message-split-word-boundary branch from edc3827 to 00be04e on March 28, 2026 20:16

chatgpt-codex-connector Bot reviewed Mar 28, 2026

Comment thread extensions/telegram/src/format.ts

hydro13 merged commit ab2ef7b into openclaw:main Mar 28, 2026
34 of 35 checks passed

rmarr mentioned this pull request Mar 28, 2026

fix(imessage): prevent self-chat dedupe false positives (#47830) #55359

Merged

25 tasks

hydro13 mentioned this pull request Mar 28, 2026

chore: backfill changelog entries for recent fixes #56625

Merged

hydro13 added a commit that referenced this pull request Mar 28, 2026

chore: backfill changelog entries for recent fixes (#56625)

2022dfd

Add missing changelog entries for PRs #56500, #56540, #56555, #56567, #56573, #56587, #56595, #56612, #56620.

github-actions Bot mentioned this pull request Mar 28, 2026

📡 Upstream Digest — 2026-03-28 22:21 UTC curtismercier/openclaw-mods#395

Open

hydro13 merged 1 commit into openclaw:main from hydro13:fix/telegram-message-split-word-boundary
