perf(desktop): coalesce streaming text/reasoning deltas per animation frame #3114

Closed
HUQIANTAO wants to merge 2 commits into esengine:main-v2 from HUQIANTAO:perf/streaming-raf-batch

@HUQIANTAO

Contributor

The agent stream pushes one text/reasoning delta per Webview event. At 200 tok/s that is one state update every ~5ms, while requestAnimationFrame (rAF) fires only about every 16ms on a 60 Hz display. As a result, the reducer was running 3-4 times per visible frame, and the React tree was re-rendering the entire Message list each time.
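For reference, the "3-4 times per visible frame" figure follows from simple arithmetic. The snippet below is only a back-of-the-envelope check; the 60 Hz refresh rate is an assumption (it is what the 16ms figure implies), and the variable names are illustrative.

```typescript
// Back-of-the-envelope check of the numbers above.
// Assumption: a 60 Hz display, so one animation frame lasts 1000 / 60 ms.
const tokensPerSecond = 200; // streaming rate quoted in the description
const refreshHz = 60;        // assumed display refresh rate

const msPerDelta = 1000 / tokensPerSecond;       // time between state updates
const msPerFrame = 1000 / refreshHz;             // time between painted frames
const updatesPerFrame = msPerFrame / msPerDelta; // reducer runs per painted frame

console.log(msPerDelta, msPerFrame.toFixed(1), updatesPerFrame.toFixed(1));
// prints: 5 16.7 3.3
```

On a higher-refresh display the ratio shrinks, but any rate above one delta per frame still means wasted reducer runs and re-renders.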

Add lib/rafBatch.ts: a tiny rAF-coalescing queue with a synchronous drain(). Non-text events (tool_dispatch, usage, notice, turn_started/done, message) call drain() first, so any pending deltas are flushed before them and causal ordering stays intact. drain() is also called on unmount, so a turn-end that races the teardown doesn't strand the last few tokens in the buffer.
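To illustrate the idea, here is a minimal sketch of an rAF-coalescing queue with a synchronous drain(), plus an event handler that drains before every non-text event. It is not the actual contents of lib/rafBatch.ts: the names `createRafBatch`, `Scheduler`, `makeHandler`, and the `AgentEvent` shape (including the `turn_done` spelling) are hypothetical, and the injectable scheduler with a `setTimeout` fallback is an assumption added so the sketch also runs outside a browser.

```typescript
// Hypothetical sketch; not the PR's lib/rafBatch.ts.
interface Scheduler {
  request(cb: () => void): unknown; // schedule cb for the next frame
  cancel(handle: unknown): void;    // cancel a scheduled callback
}

const g = globalThis as unknown as {
  requestAnimationFrame?: (cb: (t: number) => void) => number;
  cancelAnimationFrame?: (h: number) => void;
};

// Uses requestAnimationFrame where it exists, otherwise a ~16ms timer.
const defaultScheduler: Scheduler = {
  request: (cb) =>
    g.requestAnimationFrame ? g.requestAnimationFrame(() => cb()) : setTimeout(cb, 16),
  cancel: (h) =>
    g.cancelAnimationFrame
      ? g.cancelAnimationFrame(h as number)
      : clearTimeout(h as ReturnType<typeof setTimeout>),
};

function createRafBatch<T>(flush: (items: T[]) => void, scheduler: Scheduler = defaultScheduler) {
  let pending: T[] = [];
  let handle: unknown = null;
  let scheduled = false;

  // Hands everything queued so far to flush() in a single call.
  const run = () => {
    scheduled = false;
    handle = null;
    if (pending.length === 0) return;
    const items = pending;
    pending = [];
    flush(items);
  };

  return {
    // Queue an item; at most one frame callback is outstanding at a time.
    push(item: T) {
      pending.push(item);
      if (!scheduled) {
        scheduled = true;
        handle = scheduler.request(run);
      }
    },
    // Flush synchronously, cancelling the outstanding frame callback if any.
    drain() {
      if (scheduled) scheduler.cancel(handle);
      run();
    },
  };
}

// Hypothetical event shape: only the non-text kind names come from the description.
type AgentEvent =
  | { kind: "text" | "reasoning"; delta: string }
  | { kind: "tool_dispatch" | "usage" | "notice" | "turn_started" | "turn_done" | "message" };

function makeHandler(apply: (events: AgentEvent[]) => void, scheduler?: Scheduler) {
  const batch = createRafBatch<AgentEvent>(apply, scheduler);
  return {
    onEvent(ev: AgentEvent) {
      if (ev.kind === "text" || ev.kind === "reasoning") {
        batch.push(ev);  // coalesced until the next frame
      } else {
        batch.drain();   // pending deltas land first
        apply([ev]);
      }
    },
    dispose: () => batch.drain(), // e.g. on unmount
  };
}

// Usage: two deltas followed by a tool event, all within one frame.
const log: string[] = [];
const handler = makeHandler((evs) => {
  for (const e of evs) log.push("delta" in e ? e.delta : e.kind);
});
handler.onEvent({ kind: "text", delta: "Hel" });
handler.onEvent({ kind: "text", delta: "lo" });
handler.onEvent({ kind: "tool_dispatch" });
console.log(log.join(", ")); // prints: Hel, lo, tool_dispatch
```

In this sketch the reducer-facing `apply` callback sees one array per frame instead of one call per delta, and calling `dispose()` from an unmount cleanup covers the teardown race mentioned above.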

tsc --noEmit passes; go build ./desktop/... is clean. A frontend test spec is included as a contract test, to be enabled once vitest is added.

@github-actions github-actions Bot added the v2 Go rewrite (1.x) — main-v2 branch, active development label Jun 4, 2026
@HUQIANTAO HUQIANTAO requested a review from SivanCola as a code owner June 5, 2026 01:21
@HUQIANTAO HUQIANTAO force-pushed the perf/streaming-raf-batch branch from f6ec9ab to f017ca8 Compare June 5, 2026 01:23
HUQIANTAO added 2 commits June 5, 2026 09:38
… frame

The agent stream pushes a text/reasoning delta per Webview event; at 200
tok/s that is one state update every ~5ms, while rAF is 16ms — so the
reducer was running 3-4 times per visible frame and the React tree was
re-rendering the entire Message list each time.

Add lib/rafBatch.ts: a tiny rAF-coalescing queue with a synchronous
drain() so non-text events (tool_dispatch, usage, notice, turn_started/
done, message) flush any pending deltas first to keep causal ordering
intact. Drain is also called on unmount so a turn-end that races the
teardown doesn't strand the last few tokens in the buffer.

tsc --noEmit passes; go build ./desktop/... is clean.
@esengine

esengine commented Jun 6, 2026

Owner

Thanks! This conflicts with the latest main-v2 in useController.ts (other desktop PRs have since landed there). Could you rebase onto the latest main-v2? Note your open desktop PRs also touch App.tsx (#3116, #3126) and useController.ts (#2948) — rebasing them as one ordered chain will avoid repeated conflicts. Happy to merge once green. Thank you!

esengine added a commit that referenced this pull request Jun 8, 2026
… frame (#3114) (#3501)

* perf(desktop): coalesce streaming text/reasoning deltas per animation frame

The agent stream pushes a text/reasoning delta per Webview event; at 200
tok/s that is one state update every ~5ms, while rAF is 16ms — so the
reducer was running 3-4 times per visible frame and the React tree was
re-rendering the entire Message list each time.

Add lib/rafBatch.ts: a tiny rAF-coalescing queue with a synchronous
drain() so non-text events (tool_dispatch, usage, notice, turn_started/
done, message) flush any pending deltas first to keep causal ordering
intact. Drain is also called on unmount so a turn-end that races the
teardown doesn't strand the last few tokens in the buffer.

tsc --noEmit passes; go build ./desktop/... is clean.

---------

Co-authored-by: HUQIANTAO <HUQIANTAO@users.noreply.github.com>
Co-authored-by: reasonix <reasonix@deepseek.com>