
Draft stream preview garbles long replies on Telegram (streamMode: block) #8537

@ksylvan

Description


When streamMode is set to "block" on the Telegram channel, long model replies arrive on Telegram as a single final message with garbled/corrupted text (missing spaces, words merged, broken markdown). The draft stream preview visible in the local Dashboard GUI shows progressively longer replacements of the message, but Telegram itself does not show progressive updates — it only receives the final (already garbled) message. The garbled final state in the dashboard matches what Telegram receives.

Steps to Reproduce

  1. Configure Telegram channel with streamMode: "block" (private chat)
  2. Send a message that triggers a long response (e.g., a code analysis question)
  3. Observe the local dashboard showing progressively longer message updates
  4. The final message in both dashboard and Telegram is garbled

Observed Behavior

The initial short reply appears clean. As the model continues generating, the dashboard draft preview grows. The final delivered message contains:

  • Code snippets merged into prose without spaces
  • Internal analysis/tool output leaking into reply text
  • Markdown syntax broken (backticks without spaces, headers merged with text)

Root Cause Analysis

Architecture Overview

When streamMode: "block" is active in a private Telegram chat:

  1. Draft stream is created (createTelegramDraftStream) with a 4096-char cap
  2. Draft chunker (EmbeddedBlockChunker) with minChars: 200, maxChars: 800, breakPreference: "paragraph" controls when the draft preview updates
  3. Block streaming is disabled (disableBlockStreaming = true, lines 156-158 in bot-message-dispatch.ts) because the draft stream handles the preview instead
  4. The final reply comes from deliverReplies() using the accumulated assistantTexts payloads

The Draft Stream Path (Preview Only)

In bot-message-dispatch.ts (lines 99-136), updateDraftFromPartial works as follows:

  1. Each onPartialReply callback receives the cumulative cleaned text from the agent
  2. A delta is extracted: delta = text.slice(lastPartialText.length) (line 113)
  3. The delta is fed to draftChunker.append(delta) (line 128)
  4. The chunker drains to draftText += chunk and calls draftStream.update(draftText) (lines 131-133)
  5. The draft stream is throttled (300ms) and capped at 4096 chars (lines 45-49 in draft-stream.ts)

Key: When draftText exceeds 4096 chars, sendDraft sets stopped = true and logs a warning. After this, the draft stream silently stops updating. The last preview in the dashboard is whatever was sent before the cap was hit.
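The freeze can be illustrated with a minimal sketch (the class shape, field names, and DRAFT_CAP constant here are assumptions for illustration, not the actual draft-stream.ts code):

```typescript
// Hypothetical sketch of the cap behavior described above.
const DRAFT_CAP = 4096;

class DraftStreamSketch {
  private stopped = false;
  lastSent = "";

  update(text: string): void {
    if (this.stopped) return; // silently ignore all further updates
    if (text.length > DRAFT_CAP) {
      this.stopped = true; // freeze: the preview keeps the previous state
      return;
    }
    this.lastSent = text;
  }
}

const stream = new DraftStreamSketch();
stream.update("a".repeat(4000)); // under the cap: preview updates
stream.update("a".repeat(5000)); // over the cap: stopped = true, no update
stream.update("a".repeat(100));  // even short updates are now ignored
console.log(stream.lastSent.length); // 4000 — frozen at the pre-cap state
```

Once stopped flips, every later update is a no-op, which is why the dashboard preview can lag arbitrarily far behind the real response.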

The Final Reply Path

Since disableBlockStreaming = true, the block reply pipeline is not active. Instead:

  1. The embedded agent accumulates text in assistantTexts[] via pi-embedded-subscribe.handlers.messages.ts
  2. On message_end, the full text is extracted from the assistant message
  3. buildReplyPayloads() processes the final payloads
  4. Since blockStreamingEnabled is false (due to draft stream), shouldDropFinalPayloads is false — the final payloads are used
  5. deliverReplies() sends the final text through markdownToTelegramChunks → sendTelegramText

Where Garbling Occurs — Three Suspect Paths

Path A: Draft Stream Truncation → Stale Dashboard Preview

When the response exceeds 4096 chars in the draft stream:

  • stopped = true freezes the preview at a partial state
  • But the dashboard UI may show the frozen preview as the "current" message
  • The actual final delivery via deliverReplies() sends the correct full text
  • If the dashboard is showing the draft preview (not the delivered message), the garbling is a display issue

Path B: Non-Monotonic Stream Handling (the likely culprit)

In updateDraftFromPartial (lines 112-118):

```ts
if (text.startsWith(lastPartialText)) {
  delta = text.slice(lastPartialText.length);
} else {
  // Streaming buffer reset (or non-monotonic stream). Start fresh.
  draftChunker?.reset();
  draftText = "";
}
lastPartialText = text;
```

When the provider's streaming is non-monotonic (the new cumulative text doesn't start with the previous), the chunker resets and draftText is cleared. But lastPartialText is set to the new (different) text. On the next delta:

  • text.startsWith(lastPartialText) may succeed
  • But draftText was reset to "", so the next chunk starts from scratch
  • The gap between the old draftText and the new accumulation creates the garbling

This happens when:

  • Auto-compaction triggers mid-stream (the provider rewrites the text buffer)
  • Tool calls complete and the agent produces a new text block
  • The AI SDK resets/replaces the text content block
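A stripped-down simulation of the append/reset logic reproduces the gap (the chunker, throttle, and cap are omitted; only the branch from the excerpt above is kept):

```typescript
// Minimal reproduction of the reset path in updateDraftFromPartial.
let lastPartialText = "";
let draftText = "";

function updateDraftFromPartial(text: string): void {
  if (text.startsWith(lastPartialText)) {
    draftText += text.slice(lastPartialText.length); // append the delta
  } else {
    draftText = ""; // non-monotonic: chunker reset, accumulated text lost
  }
  lastPartialText = text;
}

updateDraftFromPartial("Hello world.");           // monotonic: ok
updateDraftFromPartial("Rewritten buffer");       // non-monotonic: reset
updateDraftFromPartial("Rewritten buffer, more"); // delta resumes mid-stream
console.log(draftText); // ", more" — everything before the reset is gone
```

Note that after the reset, the very next monotonic partial resumes delta extraction against the new lastPartialText, so draftText picks up mid-stream with everything before the reset missing.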

Path C: sanitizeUserFacingText stripping

The normalizeStreamingText function in agent-runner-execution.ts (line 130) runs sanitizeUserFacingText() on the partial text. This function:

  • Strips <final> tags
  • Collapses consecutive duplicate paragraphs
  • Sanitizes HTTP error codes

If sanitizeUserFacingText modifies the text in a way that makes it non-monotonic relative to the previous partial, the updateDraftFromPartial function triggers the reset path (Path B), causing the garbling cascade.

Summary of the Bug

The combination of:

  1. Draft stream chunker accumulating draftText by appending chunks
  2. Non-monotonic partial replies (from tool calls, compaction, or text sanitization) triggering the reset path
  3. No reconciliation between the reset draftText and what was previously accumulated

...means the final draftText can have gaps or overwrites, producing garbled output in the draft preview. If the final delivery is ALSO garbled (not just the draft), then the issue is in the assistantTexts accumulation in pi-embedded-subscribe.handlers.messages.ts, potentially in the text_end handling where:

```ts
if (content.startsWith(ctx.state.deltaBuffer)) {
  chunk = content.slice(ctx.state.deltaBuffer.length);
} else if (ctx.state.deltaBuffer.startsWith(content)) {
  chunk = "";
} else if (!ctx.state.deltaBuffer.includes(content)) {
  chunk = content;
}
```

If none of the conditions match (partial overlap), the text_end content is silently dropped, causing missing text in the final output.
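Wrapping the excerpt in a standalone function (the `extractChunk` name is chosen here for illustration) makes the uncovered branch explicit:

```typescript
// Demonstrates the uncovered case in the text_end delta extraction
// (logic copied from the excerpt above; surrounding handler omitted).
function extractChunk(deltaBuffer: string, content: string): string | undefined {
  let chunk: string | undefined;
  if (content.startsWith(deltaBuffer)) {
    chunk = content.slice(deltaBuffer.length);
  } else if (deltaBuffer.startsWith(content)) {
    chunk = "";
  } else if (!deltaBuffer.includes(content)) {
    chunk = content;
  }
  return chunk; // undefined when content overlaps the buffer mid-string
}

console.log(extractChunk("abc", "abcdef")); // "def" — clean delta
console.log(extractChunk("abcdef", "abc")); // ""    — already emitted
console.log(extractChunk("abc", "xyz"));    // "xyz" — disjoint, append whole
console.log(extractChunk("abcdef", "cde")); // undefined — silently dropped
```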

Relevant Files

| File | Role |
| --- | --- |
| src/telegram/bot-message-dispatch.ts | Draft stream setup + updateDraftFromPartial |
| src/telegram/draft-stream.ts | Draft stream with 4096-char cap |
| src/telegram/draft-chunking.ts | Chunking config (200-800 chars, paragraph break) |
| src/agents/pi-embedded-block-chunker.ts | EmbeddedBlockChunker implementation |
| src/agents/pi-embedded-subscribe.handlers.messages.ts | text_delta/text_end accumulation |
| src/auto-reply/reply/agent-runner-execution.ts | normalizeStreamingText + sanitizeUserFacingText |
| src/auto-reply/reply/agent-runner-payloads.ts | buildReplyPayloads final assembly |
| src/telegram/bot/delivery.ts | deliverReplies final Telegram send |

Suggested Fixes

  1. Path B fix: When the non-monotonic reset triggers, reconstruct draftText from the full text instead of clearing to empty. The text parameter already contains the full cleaned response — use it directly as draftText.

  2. Draft stream cap: When draftText exceeds 4096 chars, instead of stopped = true (which freezes the preview), truncate the preview text with "..." and continue tracking internally.

  3. Add monotonicity guard in text_end: The partial overlap case in handleMessageUpdate silently drops content. Add a fallback that uses the full content when no clean delta can be extracted.

  4. Diagnostic logging: Log when non-monotonic resets occur, with before/after text lengths, to aid debugging.
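Fix 1 could be sketched like this (an assumed shape, not a patch against the repo): since text already carries the full cleaned response, adopt it wholesale on a non-monotonic reset instead of clearing:

```typescript
// Sketch of suggested fix 1: rebuild the preview from the full partial
// text rather than emptying it when the stream is non-monotonic.
let lastPartialText = "";
let draftText = "";

function updateDraftFromPartialFixed(text: string): void {
  if (text.startsWith(lastPartialText)) {
    draftText += text.slice(lastPartialText.length);
  } else {
    // text already holds the full cleaned response: adopt it wholesale
    draftText = text;
  }
  lastPartialText = text;
}

updateDraftFromPartialFixed("Hello world.");
updateDraftFromPartialFixed("Rewritten buffer");       // reset case
updateDraftFromPartialFixed("Rewritten buffer, more");
console.log(draftText); // "Rewritten buffer, more" — no gap
```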
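Fix 2 might look like the following sketch (constant and function names assumed here): keep tracking the full draft internally and only truncate what is sent to Telegram, instead of freezing the stream:

```typescript
// Sketch of suggested fix 2: truncate the preview at the cap with an
// ellipsis marker rather than setting stopped = true.
const DRAFT_CAP = 4096;

function previewText(fullDraft: string): string {
  if (fullDraft.length <= DRAFT_CAP) return fullDraft;
  // keep room for the ellipsis marker inside the Telegram limit
  return fullDraft.slice(0, DRAFT_CAP - 3) + "...";
}

console.log(previewText("short"));                 // unchanged
console.log(previewText("x".repeat(5000)).length); // 4096 — capped, not frozen
```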
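Fix 3 could collapse the extraction branches so the partial-overlap case emits the full content instead of nothing (a sketch only; the real handler in handleMessageUpdate carries more state):

```typescript
// Sketch of suggested fix 3: a monotonicity guard in text_end handling.
function extractChunkWithFallback(deltaBuffer: string, content: string): string {
  if (content.startsWith(deltaBuffer)) return content.slice(deltaBuffer.length);
  if (deltaBuffer.startsWith(content)) return "";
  // Fallback (the suggested guard): when no clean delta exists — including
  // the partial-overlap case that was previously dropped — emit the full
  // content; duplicated text is preferable to silently missing text.
  return content;
}

console.log(extractChunkWithFallback("abc", "abcdef")); // "def" — clean delta
console.log(extractChunkWithFallback("abcdef", "abc")); // ""    — already emitted
console.log(extractChunkWithFallback("abcdef", "cde")); // "cde" — no longer dropped
```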

Metadata

Assignees

No one assigned

Labels

bug (Something isn't working), stale (Marked as stale due to inactivity)
