fix(tui): avoid retokenizing streaming replies per chunk#598

Merged
esengine merged 1 commit into
esengine:mainfrom
paradoxSCH:fix/streaming-card-live-token-rate
May 10, 2026

@paradoxSCH

Contributor

What

Reduces live StreamingCard token-rate work by calibrating exact token counts only at length buckets during streaming, while keeping the settled done-card token count exact. Also removes obsolete Biome suppression comments that now fail the current lint gate.
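A minimal sketch of the bucketed-calibration idea described above, not the code from this PR. The names `createLiveTokenEstimator`, `estimate`, `settle`, the 512-character bucket width, and the 0.25 tokens-per-character prior are all hypothetical, and the `countTokens` signature is an assumption. The exact tokenizer runs only when the text length enters a new bucket; between calibrations the count is extrapolated from the last measured tokens-per-character ratio.

```typescript
// Assumed signature for the exact tokenizer; the project's real countTokens may differ.
type CountTokens = (text: string) => number;

// Hypothetical interface, introduced for illustration only.
interface LiveTokenEstimator {
  estimate(text: string): number; // cheap, called on every chunk/render
  settle(text: string): number;   // exact, called once when the reply is done
}

function createLiveTokenEstimator(
  countTokens: CountTokens,
  bucketSize: number = 512, // assumed bucket width in characters
): LiveTokenEstimator {
  let calibratedBucket = -1; // highest bucket index already calibrated
  let tokensPerChar = 0.25;  // rough prior, replaced at the first calibration

  return {
    estimate(text: string): number {
      if (text.length === 0) return 0;
      const bucket = Math.floor(text.length / bucketSize);
      if (bucket > calibratedBucket) {
        // Entered a new length bucket: pay for one exact count and recalibrate.
        const exact = countTokens(text);
        tokensPerChar = exact / text.length;
        calibratedBucket = bucket;
        return exact;
      }
      // Same bucket as the last calibration: extrapolate without tokenizing.
      return Math.round(text.length * tokensPerChar);
    },
    settle(text: string): number {
      // The settled card always shows the exact count.
      return countTokens(text);
    },
  };
}
```

With this shape, a reply of n characters triggers roughly n / bucketSize exact counts during streaming instead of one per chunk, and the final number shown on the done card still comes from a full exact count.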

Why

Fixes #562. The live reply card previously called countTokens on the full accumulated assistant text on each render/chunk, which can add avoidable BPE work for long streaming responses.

How to verify

  • npm run verify

Checklist

  • npm run verify passes locally (lint + typecheck + tests + comment-policy gate)
  • No Co-Authored-By: Claude trailer in commits
  • Comments follow CONTRIBUTING.md (no module-essay headers, no incident history)
  • No edits to CHANGELOG.md — release notes are maintainer-written at release time

@esengine esengine merged commit 2fc06c7 into esengine:main May 10, 2026
3 checks passed
ChasLui pushed a commit to ChasLui/DeepSeek-Reasonix that referenced this pull request May 23, 2026

Development

Successfully merging this pull request may close these issues.

StreamingCard re-tokenizes full reply text on every chunk — estimate during streaming, exact only at done
