fix(telegram): split long messages at word boundaries instead of mid-word #56595
Greptile Summary
This PR replaces the proportional-estimate split heuristic in
Key changes:
The implementation is correct and well-tested. The one implicit assumption is that
Confidence Score: 5/5
Safe to merge: the binary search is correct, the proportional-estimate optimisation is sound, edge cases are handled, and 41 tests pass. No P0 or P1 issues found. The logic is monotone-safe (more text → more HTML bytes), the
No files require special attention.
| Filename | Overview |
|---|---|
| extensions/telegram/src/format.ts | Adds findLargestTelegramChunkTextLengthWithinHtmlLimit (binary search over text-prefix lengths) and simplifies splitTelegramChunkByHtmlLimit to use it. Logic is correct: the proportional estimate is used as an optimistic lower bound, the search is bounded by currentTextLength - 1 to guarantee progress, and the Math.max(1, best) fallback is safe. |
| extensions/telegram/src/format.wrap-md.test.ts | Adds three targeted regression tests covering the escaped-HTML window bug, single-word hard-split fallback, and bold-formatting preservation across a word-boundary split. Tests are well-structured and provide concrete assertions on both text content and HTML length invariants. |
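
The search summarized in the table can be sketched roughly as follows. This is a minimal illustration, not the code from `format.ts`: `renderHtml`, `findLargestPrefixWithinHtmlLimit`, and `splitAtWordBoundary` are hypothetical names, the renderer only escapes HTML-special characters (so fitting really is monotone in prefix length), and the proportional-estimate starting point is left out for brevity.

```typescript
// Hypothetical stand-in for the real renderer: escapes HTML-special characters only.
function renderHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Largest prefix length in [1, text.length - 1] whose rendered HTML fits htmlLimit.
// Assumes "fits" is monotone in prefix length, which holds for pure escaping.
function findLargestPrefixWithinHtmlLimit(text: string, htmlLimit: number): number {
  let low = 1;
  let high = text.length - 1; // strictly shorter than the input, so the caller makes progress
  let best = 0;
  while (low <= high) {
    const mid = Math.floor((low + high) / 2);
    if (renderHtml(text.slice(0, mid)).length <= htmlLimit) {
      best = mid; // this prefix fits; try a longer one
      low = mid + 1;
    } else {
      high = mid - 1; // too long once rendered; try a shorter one
    }
  }
  return Math.max(1, best); // always emit at least one character
}

// Split at the last whitespace inside the verified prefix; hard-split if there is none.
function splitAtWordBoundary(text: string, htmlLimit: number): [string, string] {
  if (renderHtml(text).length <= htmlLimit) return [text, ""];
  const maxEnd = findLargestPrefixWithinHtmlLimit(text, htmlLimit);
  let cut = maxEnd;
  if (!/\s/.test(text.charAt(maxEnd))) {
    // The prefix ends inside a word: walk back to the previous whitespace, if any.
    let i = maxEnd - 1;
    while (i > 0 && !/\s/.test(text.charAt(i))) i--;
    if (i > 0) cut = i;
  }
  return [text.slice(0, cut), text.slice(cut).trimStart()];
}
```

With this toy renderer, `splitAtWordBoundary("a < b and c < d", 11)` cuts after `b` rather than inside `and`, and a single word longer than the limit is still hard-split at the verified prefix length.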
Reviews (1): Last reviewed commit: "fix(telegram): split long messages at wo..."
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: edc3827ca8
    }
    high = mid - 1;
Replace binary search with non-monotonic-safe split search
The new binary search assumes `renderTelegramChunkHtml(slice(..., 0, mid)).length <= htmlLimit` is monotonic as `mid` increases, but this formatter has non-monotonic prefixes: truncated auto-linked file refs render as long `<a ...>` tags, while the completed ref is de-linkified/wrapped shorter (`buildTelegramLink` + `wrapFileReferencesInHtml`). In cases like `README.md z` with a limit between those sizes, `high = mid - 1` can prune valid larger prefixes and return 1; the retry loop then emits 1-character chunks and accepts them via the `chunk.text.length <= 1` escape hatch, so resulting chunk HTML can still exceed Telegram’s limit.
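
The concern can be reproduced in isolation with a toy predicate. The sketch below is only an illustration under stated assumptions: the rendered lengths are made-up numbers chosen so that some middle prefixes render longer than the full text, and `binarySearchLargestFit` and `scanLargestFit` are hypothetical helpers, not functions from this PR.

```typescript
type FitsFn = (prefixLength: number) => boolean;

// Binary search for the largest fitting prefix length; only correct if `fits` is monotone.
function binarySearchLargestFit(maxLength: number, fits: FitsFn): number {
  let low = 1;
  let high = maxLength;
  let best = 0;
  while (low <= high) {
    const mid = Math.floor((low + high) / 2);
    if (fits(mid)) {
      best = mid;
      low = mid + 1;
    } else {
      high = mid - 1; // on a non-monotone predicate this can discard valid longer prefixes
    }
  }
  return Math.max(1, best);
}

// Exhaustive scan from the longest prefix down; correct without any monotonicity assumption.
function scanLargestFit(maxLength: number, fits: FitsFn): number {
  for (let len = maxLength; len >= 1; len--) {
    if (fits(len)) return len;
  }
  return 1;
}

// Made-up rendered HTML lengths for prefixes of length 1..8 (index 0 is unused).
// Prefixes 4..7 stand for a truncated reference that renders as a long link;
// prefix 8 stands for the completed reference that renders shorter.
const renderedLengths = [0, 1, 2, 3, 40, 40, 40, 40, 20];
const htmlLimit = 25;
const fits: FitsFn = (len) => renderedLengths[len] <= htmlLimit;

const viaBinarySearch = binarySearchLargestFit(8, fits); // 3
const viaScan = scanLargestFit(8, fits); // 8
```

The scan costs one render per candidate length, so it trades speed for correctness; whether that matters depends on how expensive the real renderer is.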
…word Replace proportional text estimate with binary search for the largest text prefix whose rendered Telegram HTML fits the character limit, then split at the last whitespace boundary within that verified prefix. Single words longer than the limit still hard-split (unavoidable). Markdown formatting stays balanced across split points. Fixes openclaw#36644
edc3827 to 00be04e
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 00be04e7ed
Add missing changelog entries for PRs openclaw#56500, openclaw#56540, openclaw#56555, openclaw#56567, openclaw#56573, openclaw#56587, openclaw#56595, openclaw#56612, openclaw#56620.
…word (openclaw#56595) Replace proportional text estimate with binary search for the largest text prefix whose rendered Telegram HTML fits the character limit, then split at the last whitespace boundary within that verified prefix. Single words longer than the limit still hard-split (unavoidable). Markdown formatting stays balanced across split points. Fixes openclaw#36644
Summary
Root Cause
`splitTelegramChunkByHtmlLimit` in `extensions/telegram/src/format.ts` used a proportional estimate from rendered HTML length. When HTML escaping expanded characters (e.g. `<` → `&lt;`), the estimate window was too short to reach the next whitespace, and `findMarkdownIRPreservedSplitIndex` fell back to a hard cut at `maxEnd`, mid-word.
Change Type
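
To illustrate the root cause described above, here is a small sketch comparing a proportional estimate with the prefix that actually fits. It assumes the old heuristic scaled the text length by `htmlLimit / htmlLength`; the exact formula in the original code is not shown in this PR, and `escapeHtml`, `proportionalEstimate`, and `largestFittingPrefix` are hypothetical names.

```typescript
// Hypothetical escaper standing in for the real Telegram HTML renderer.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Assumed shape of the old heuristic: scale text length by the HTML expansion ratio.
function proportionalEstimate(text: string, htmlLimit: number): number {
  const htmlLength = escapeHtml(text).length;
  return Math.max(1, Math.floor((text.length * htmlLimit) / htmlLength));
}

// Ground truth by simple scan: longest prefix whose escaped form fits the limit.
function largestFittingPrefix(text: string, htmlLimit: number): number {
  let best = 0;
  for (let len = 1; len <= text.length; len++) {
    if (escapeHtml(text.slice(0, len)).length > htmlLimit) break;
    best = len;
  }
  return best;
}

// One ordinary word followed by many characters that expand 4x when escaped.
const sample = "wonderful " + "<".repeat(30);
const limit = 20;

const estimated = proportionalEstimate(sample, limit); // 6, window is "wonder"
const verified = largestFittingPrefix(sample, limit); // 12, window is "wonderful <<"
```

The heavy expansion late in the text drags the estimate down to a window that ends inside the first word, while the verified prefix still contains the space after `wonderful` and so allows a word-boundary split.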
Testing
41 tests pass. New regression tests for:
`<b>...</b>` tags
Fixes #36644