Description
When streamMode is set to "block" on the Telegram channel, long model replies arrive on Telegram as a single final message with garbled/corrupted text (missing spaces, words merged, broken markdown). The draft stream preview visible in the local Dashboard GUI shows progressively longer replacements of the message, but Telegram itself does not show progressive updates — it only receives the final (already garbled) message. The garbled final state in the dashboard matches what Telegram receives.
Steps to Reproduce
- Configure the Telegram channel with `streamMode: "block"` (private chat)
- Send a message that triggers a long response (e.g., a code analysis question)
- Observe the local dashboard showing progressively longer message updates
- The final message in both the dashboard and Telegram is garbled
Observed Behavior
The initial short reply appears clean. As the model continues generating, the dashboard draft preview grows. The final delivered message contains:
- Code snippets merged into prose without spaces
- Internal analysis/tool output leaking into reply text
- Markdown syntax broken (backticks without spaces, headers merged with text)
Root Cause Analysis
Architecture Overview
When streamMode: "block" is active in a private Telegram chat:
- A draft stream is created (`createTelegramDraftStream`) with a 4096-char cap
- A draft chunker (`EmbeddedBlockChunker`) with `minChars: 200`, `maxChars: 800`, `breakPreference: "paragraph"` controls when the draft preview updates
- Block streaming is disabled (`disableBlockStreaming = true`, lines 156-158 in `bot-message-dispatch.ts`) because the draft stream handles the preview instead
- The final reply comes from `deliverReplies()` using the accumulated `assistantTexts` payloads
The Draft Stream Path (Preview Only)
In bot-message-dispatch.ts lines 99-136, updateDraftFromPartial:
- Each `onPartialReply` callback receives the cumulative cleaned text from the agent
- A delta is extracted: `delta = text.slice(lastPartialText.length)` (line 113)
- The delta is fed to `draftChunker.append(delta)` (line 128)
- The chunker drains to `draftText += chunk` and calls `draftStream.update(draftText)` (lines 131-133)
- The draft stream is throttled (300ms) and capped at 4096 chars (lines 45-49 in `draft-stream.ts`)
Key: When draftText exceeds 4096 chars, sendDraft sets stopped = true and logs a warning. After this, the draft stream silently stops updating. The last preview in the dashboard is whatever was sent before the cap was hit.
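A minimal model of this preview loop and cap behavior can make the freeze concrete. This is an illustrative sketch, not the actual implementation: `DraftPreview`, `sent`, and `TELEGRAM_DRAFT_CAP` are names invented here, and the real code also throttles and chunks updates.

```typescript
// Sketch of the cumulative-partial preview loop with a hard cap.
// Assumes each update() receives the full cumulative text so far.
const TELEGRAM_DRAFT_CAP = 4096;

class DraftPreview {
  private draftText = "";
  private lastPartialText = "";
  private stopped = false;
  public sent: string[] = []; // previews actually shown in the dashboard

  update(text: string): void {
    if (this.stopped) return; // frozen once the cap was hit
    // Monotonic case: the new cumulative text extends the previous one.
    const delta = text.slice(this.lastPartialText.length);
    this.lastPartialText = text;
    this.draftText += delta;
    if (this.draftText.length > TELEGRAM_DRAFT_CAP) {
      this.stopped = true; // mirrors sendDraft setting stopped = true
      return;
    }
    this.sent.push(this.draftText);
  }
}
```

Once the cap is exceeded, `sent` stops growing: the dashboard keeps displaying the last preview sent before the cap, however stale it becomes.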
The Final Reply Path
Since disableBlockStreaming = true, the block reply pipeline is not active. Instead:
- The embedded agent accumulates text in `assistantTexts[]` via pi-embedded-subscribe.handlers.messages.ts
- On `message_end`, the full text is extracted from the assistant message
- `buildReplyPayloads()` processes the final payloads
- Since `blockStreamingEnabled` is false (due to the draft stream), `shouldDropFinalPayloads` is false, so the final payloads are used
- `deliverReplies()` sends the final text through `markdownToTelegramChunks` → `sendTelegramText`
Where Garbling Occurs — Three Suspect Paths
Path A: Draft Stream Truncation → Stale Dashboard Preview
When the response exceeds 4096 chars in the draft stream:
- `stopped = true` freezes the preview at a partial state
- But the dashboard UI may show the frozen preview as the "current" message
- The actual final delivery via `deliverReplies()` sends the correct full text
- If the dashboard is showing the draft preview (not the delivered message), the garbling is a display issue
Path B: Non-Monotonic Stream Handling (the likely culprit)
In updateDraftFromPartial (lines 112-118):

```ts
if (text.startsWith(lastPartialText)) {
  delta = text.slice(lastPartialText.length);
} else {
  // Streaming buffer reset (or non-monotonic stream). Start fresh.
  draftChunker?.reset();
  draftText = "";
}
lastPartialText = text;
```

When the provider's stream is non-monotonic (the new cumulative text does not start with the previous one), the chunker resets and draftText is cleared, but lastPartialText is still set to the new, different text. On the next partial:
- `text.startsWith(lastPartialText)` may succeed
- But `draftText` was reset to `""`, so the next chunk starts from scratch
- The gap between the old `draftText` and the new accumulation produces the garbling
This happens when:
- Auto-compaction triggers mid-stream (the provider rewrites the text buffer)
- Tool calls complete and the agent produces a new text block
- The AI SDK resets/replaces the text content block
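The reset bug can be reproduced with a few lines. This is a reduced sketch of the logic quoted above (the chunker is omitted, and `makeDraftTracker` is an invented name, not the real code):

```typescript
// Reduced reproduction of the non-monotonic reset path:
// on reset, draftText is cleared and the already-shown prefix is lost.
function makeDraftTracker() {
  let draftText = "";
  let lastPartialText = "";
  return {
    onPartial(text: string): void {
      if (text.startsWith(lastPartialText)) {
        draftText += text.slice(lastPartialText.length);
      } else {
        // Reset path: draftText is cleared, but nothing re-adds the
        // new cumulative text, so a gap opens in the preview.
        draftText = "";
      }
      lastPartialText = text;
    },
    preview(): string {
      return draftText;
    },
  };
}
```

After one non-monotonic partial, the preview contains only the text that streamed in after the reset; everything before it is missing, which is exactly a "words merged / content missing" symptom.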
Path C: sanitizeUserFacingText stripping
The normalizeStreamingText function in agent-runner-execution.ts (line 130) runs sanitizeUserFacingText() on the partial text. This function:
- Strips `<final>` tags
- Collapses consecutive duplicate paragraphs
- Sanitizes HTTP error codes
If sanitizeUserFacingText modifies the text in a way that makes it non-monotonic relative to the previous partial, the updateDraftFromPartial function triggers the reset path (Path B), causing the garbling cascade.
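A toy example shows how tag stripping alone can break the prefix invariant. This is not the real sanitizeUserFacingText; `stripFinalTags` is a stand-in that only removes complete `<final>`/`</final>` tags, which is enough to demonstrate the effect:

```typescript
// While "<final>" is only partially streamed it survives sanitization;
// once the tag completes it is removed, so the cleaned text no longer
// starts with the previously cleaned text, even though the raw stream
// is strictly append-only.
function stripFinalTags(text: string): string {
  return text.replace(/<\/?final>/g, "");
}

const before = stripFinalTags("Hello <final");      // incomplete tag: kept
const after = stripFinalTags("Hello <final>world"); // complete tag: removed
const monotonic = after.startsWith(before);         // false: reset path fires
```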
Summary of the Bug
The combination of:
- The draft stream chunker accumulating `draftText` by appending chunks
- Non-monotonic partial replies (from tool calls, compaction, or text sanitization) triggering the reset path
- No reconciliation between the reset `draftText` and what was previously accumulated
...means the final draftText can have gaps or overwrites, producing garbled output in the draft preview. If the final delivery is ALSO garbled (not just the draft), then the issue is in the assistantTexts accumulation in pi-embedded-subscribe.handlers.messages.ts, potentially in the text_end handling where:
```ts
if (content.startsWith(ctx.state.deltaBuffer)) {
  chunk = content.slice(ctx.state.deltaBuffer.length);
} else if (ctx.state.deltaBuffer.startsWith(content)) {
  chunk = "";
} else if (!ctx.state.deltaBuffer.includes(content)) {
  chunk = content;
}
```

If none of the branches match (i.e., the buffer contains the content somewhere other than its start, a partial overlap), the text_end content is silently dropped, causing missing text in the final output.
Relevant Files
| File | Role |
|---|---|
| src/telegram/bot-message-dispatch.ts | Draft stream setup + updateDraftFromPartial |
| src/telegram/draft-stream.ts | Draft stream with 4096-char cap |
| src/telegram/draft-chunking.ts | Chunking config (200-800 chars, paragraph break) |
| src/agents/pi-embedded-block-chunker.ts | EmbeddedBlockChunker implementation |
| src/agents/pi-embedded-subscribe.handlers.messages.ts | text_delta/text_end accumulation |
| src/auto-reply/reply/agent-runner-execution.ts | normalizeStreamingText + sanitizeUserFacingText |
| src/auto-reply/reply/agent-runner-payloads.ts | buildReplyPayloads final assembly |
| src/telegram/bot/delivery.ts | deliverReplies final Telegram send |
Suggested Fixes
- Path B fix: When the non-monotonic reset triggers, reconstruct `draftText` from the full `text` instead of clearing it to empty. The `text` parameter already contains the full cleaned response; use it directly as `draftText`.
- Draft stream cap: When `draftText` exceeds 4096 chars, instead of setting `stopped = true` (which freezes the preview), truncate the preview text with "..." and continue tracking internally.
- Monotonicity guard in `text_end`: The partial-overlap case in `handleMessageUpdate` silently drops content. Add a fallback that uses the full `content` when no clean delta can be extracted.
- Diagnostic logging: Log when non-monotonic resets occur, with before/after text lengths, to aid debugging.
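The Path B fix amounts to a one-line change in the reset branch. A sketch under the same assumptions as the reproduction above (`makeFixedDraftTracker` is an invented name; the real patch would also need to reset/refeed the chunker):

```typescript
// Sketch of the suggested Path B fix: on a non-monotonic partial,
// rebuild the preview from the full cumulative text instead of
// clearing it to "".
function makeFixedDraftTracker() {
  let draftText = "";
  let lastPartialText = "";
  return {
    onPartial(text: string): void {
      if (text.startsWith(lastPartialText)) {
        draftText += text.slice(lastPartialText.length);
      } else {
        // text already holds the full cleaned response: use it directly,
        // so the preview never develops a gap.
        draftText = text;
      }
      lastPartialText = text;
    },
    preview(): string {
      return draftText;
    },
  };
}
```

With this change, the same non-monotonic sequence that produced a gap before now yields the complete text in the preview.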