Fix streaming text sanitization to prevent non-monotonic output #9412
ksylvan wants to merge 4 commits into openclaw:main
Good catch @greptile-apps! You're absolutely right — the When Fixed in 79c4a99: simplified to just |
|
You're welcome! The fix looks good — explicitly setting The updated logic now properly covers all three cases:
The comment in the new commit accurately explains why this is safe: the final delivery doesn't rely on |
- Add `streaming` option to `sanitizeUserFacingText` to skip duplicate-block collapsing
- Handle partial overlap in delta buffer to avoid dropping streamed text
- Pass streaming flag in agent runner execution for draft updates
- Recover gracefully from non-monotonic streams in Telegram dispatch
- Add comprehensive tests for streaming mode behavior
79c4a99 to ab69c5f
@steipete This fixes the common case I was seeing recently where long tool-use conversations were generating slightly garbled messages on both Telegram and the local gateway UI chat interface.
- Add buffer reset on content backtrack to prevent garbled output
- Clarify branch conditions with explicit comments for each case
- Reset `deltaBuffer`, `blockBuffer`, and `blockChunker` on content mismatch
- Reorder conditional branches from prefix-match to substring to fallback
- Remove stale comment about `extractAssistantText` fallback safety

- Clear `lastStreamedAssistant` and `lastStreamedAssistantCleaned` on backtrack rewrite
- Reset all streaming state, not just delta and block buffers
- Prevent stale cached assistant content from corrupting rewritten output
- Update comment to reflect full state reset scope

…tput
- Add `lastContentBlockIndex` to track current content block
- Flush pending block reply data when content block index changes
- Reset `deltaBuffer`, `blockBuffer`, and streamed state on new block
- Initialize `lastContentBlockIndex` to `-1` in subscribe state
- Reset `lastContentBlockIndex` on turn boundaries
- Prevent delta contamination across multi-block assistant responses
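The branch ordering named in these commits (prefix match, then substring, then fallback reset) can be pictured with a minimal sketch. This is not the PR's code: `StreamState`, `DeltaResult`, and `reconcileDelta` are hypothetical names, and only `deltaBuffer` is taken from the commit messages. It assumes each update carries the full assistant text seen so far.

```typescript
// Hypothetical state shape; only the `deltaBuffer` name comes from the commits.
interface StreamState {
  deltaBuffer: string;
}

type DeltaResult =
  | { kind: "append"; chunk: string } // new text extends what was streamed
  | { kind: "noop" }                  // nothing new to emit
  | { kind: "reset"; text: string };  // content was rewritten; restart from full text

function reconcileDelta(state: StreamState, next: string): DeltaResult {
  const prev = state.deltaBuffer;

  // Case 1 (prefix match): the buffer is a prefix of the new text.
  if (next.slice(0, prev.length) === prev) {
    const chunk = next.slice(prev.length);
    state.deltaBuffer = next;
    return chunk.length > 0 ? { kind: "append", chunk } : { kind: "noop" };
  }

  // Case 2 (substring): the new text is already contained in the buffer,
  // so the chunk is empty and the buffer is left untouched.
  if (prev.indexOf(next) !== -1) {
    return { kind: "noop" };
  }

  // Case 3 (fallback): the content backtracked, so drop the stale buffer
  // and restart from the new text instead of appending to it.
  state.deltaBuffer = next;
  return { kind: "reset", text: next };
}

const demoState: StreamState = { deltaBuffer: "" };
const demoFirst = reconcileDelta(demoState, "Hello");       // append "Hello"
const demoSecond = reconcileDelta(demoState, "Hello wor");  // append " wor"
const demoStale = reconcileDelta(demoState, "llo w");       // noop
const demoRewrite = reconcileDelta(demoState, "Goodbye");   // reset
```

In this sketch a reset replaces the buffer wholesale; the commits above describe clearing more state than that (`blockBuffer`, `blockChunker`, and the cached `lastStreamedAssistant` values), which is omitted here for brevity.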
This fix does not completely work. Closing and refreshing.
Fix streaming text sanitization to prevent non-monotonic output
Fixes #8537
Summary
This PR fixes a streaming text bug where the `sanitizeUserFacingText` function's duplicate-block collapsing behavior caused non-monotonic output during streaming, resulting in lost text in the draft stream. The fix introduces a `streaming` mode that bypasses the duplicate-block collapsing logic while preserving other sanitization behaviors.
Files Changed
- `src/agents/pi-embedded-helpers/errors.ts`: adds the `streaming` parameter to `sanitizeUserFacingText`
- `src/agents/pi-embedded-helpers.sanitizeuserfacingtext.test.ts`
- `src/agents/pi-embedded-subscribe.handlers.messages.ts`
- `src/auto-reply/reply/agent-runner-execution.ts`
- `src/telegram/bot-message-dispatch.ts`
Code Changes
`errors.ts` - New Streaming Mode
`pi-embedded-subscribe.handlers.messages.ts` - Partial Overlap Handling
`bot-message-dispatch.ts` - Recovery Fix
Reason for Changes
The `collapseConsecutiveDuplicateBlocks` function is stateless and operates on whatever text it receives at any given moment. During streaming, chunk boundaries shift as more content arrives, which can cause the function to produce different output shapes for the same logical content at different times. This non-monotonic behavior caused downstream consumers to detect a "reset" condition and discard accumulated text.
Impact of Changes
`extractAssistantText`, so duplicate collapsing still applies to completed messages
Test Plan
New unit tests verify:
`<final>` tags
Additional Notes
- `handleMessageUpdate` handles an edge case where neither the new content nor the buffer is a prefix of the other, but they share a substring relationship
- `chunk` as empty string is safe because final delivery doesn't rely on `deltaBuffer`
Greptile Overview
Greptile Summary
This PR introduces a `streaming` option to `sanitizeUserFacingText` so that streaming paths can bypass stateless duplicate-paragraph collapsing (which can change output shape across chunk boundaries), while preserving other sanitization behaviors like stripping `<final>` tags and rewriting common API/role-ordering errors. It also adjusts streaming/draft consumers (agent runner, Telegram draft streaming) to handle non-monotonic streams more safely by recovering with a full-text baseline instead of discarding accumulated text, and adds unit tests for the new streaming mode.

These changes fit into the codebase's partial-reply pipelines by making sanitization predictable for incremental updates, while keeping the existing "final" message behavior (including duplicate collapsing) unchanged for completed deliveries.
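To make the non-monotonic behavior concrete, here is a minimal sketch of a stateless collapse step and a `streaming` switch that bypasses it. The function names follow the PR text, but the bodies are assumptions: the real collapsing rules, the `SanitizeOptions` shape, the tag-stripping regex, and the `isPrefix` helper are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical sketch: function names follow the PR text, bodies are assumptions.
interface SanitizeOptions {
  // Assumed option shape; when true, duplicate-block collapsing is skipped.
  streaming?: boolean;
}

// Drop a paragraph when it repeats the paragraph directly before it.
function collapseConsecutiveDuplicateBlocks(text: string): string {
  const kept: string[] = [];
  for (const block of text.split(/\n{2,}/)) {
    const prev = kept.length > 0 ? kept[kept.length - 1] : undefined;
    if (prev !== undefined && prev.trim() === block.trim()) continue;
    kept.push(block);
  }
  return kept.join("\n\n");
}

function sanitizeUserFacingText(text: string, opts: SanitizeOptions = {}): string {
  // Tag stripping runs in both modes; only the collapse step is optional.
  const withoutTags = text.replace(/<\/?final>/g, "");
  return opts.streaming ? withoutTags : collapseConsecutiveDuplicateBlocks(withoutTags);
}

function isPrefix(prefix: string, text: string): boolean {
  return text.slice(0, prefix.length) === prefix;
}

// Two snapshots of the same growing stream; the second completes a paragraph
// that happens to repeat the first one.
const earlier = "<final>Step done.\n\nStep";
const later = "<final>Step done.\n\nStep done.";

const collapsedEarlier = sanitizeUserFacingText(earlier);
const collapsedLater = sanitizeUserFacingText(later);
const streamedEarlier = sanitizeUserFacingText(earlier, { streaming: true });
const streamedLater = sanitizeUserFacingText(later, { streaming: true });

// With collapsing, the later output is shorter than the earlier one.
const collapsedIsMonotonic = isPrefix(collapsedEarlier, collapsedLater);
// In streaming mode, each output extends the previous one.
const streamedIsMonotonic = isPrefix(streamedEarlier, streamedLater);
```

Under these assumptions the collapsed output shrinks once the repeated paragraph completes, which a downstream consumer comparing against its previous text could read as a reset, while the streaming-mode output only ever grows.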
Confidence Score: 5/5