[GTC] fix(tokenizer): bound pathological token counting by GTC2080 · Pull Request #1036 · esengine/DeepSeek-Reasonix

GTC2080 · 2026-05-16T16:09:28Z

What

Adds bounded token counting for oversized text and routes the hot token-estimation paths through it: request estimation, context shrink ordering, /context breakdowns, desktop context telemetry, /cost, streaming token-rate estimates, and subagent distillation metrics. Normal-sized inputs are still counted exactly; only pathologically large strings are sampled.
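As a rough illustration of the idea (not the code in this PR), a bounded counter can count small inputs exactly and, for oversized inputs, count a few fixed-size windows and scale the result by length. Everything in this sketch is an assumption made for illustration: the name `countTokensBoundedSketch`, the thresholds, the head/middle/tail sampling strategy, and the stand-in `roughCounter` that replaces a real BPE tokenizer.

```typescript
// Hypothetical sketch only; the PR's real implementation may differ.
type TokenCounter = (text: string) => number;

// Assumed thresholds, chosen purely for illustration.
const MAX_EXACT_CHARS = 32_768;
const SAMPLE_CHARS = 4_096;

function countTokensBoundedSketch(
  text: string,
  countExact: TokenCounter,
  maxExactChars: number = MAX_EXACT_CHARS,
  sampleChars: number = SAMPLE_CHARS,
): number {
  // Normal-sized input: keep the exact count.
  if (text.length <= maxExactChars) return countExact(text);

  // Oversized input: count three fixed-size windows (head, middle, tail) exactly.
  const mid = Math.floor((text.length - sampleChars) / 2);
  const samples = [
    text.slice(0, sampleChars),
    text.slice(mid, mid + sampleChars),
    text.slice(text.length - sampleChars),
  ];
  let sampledTokens = 0;
  let sampledChars = 0;
  for (const sample of samples) {
    sampledTokens += countExact(sample);
    sampledChars += sample.length;
  }

  // Scale the observed tokens-per-character ratio to the full length.
  return Math.ceil((sampledTokens / sampledChars) * text.length);
}

// Stand-in counter for the demo: roughly one token per four characters.
const roughCounter: TokenCounter = (t) => Math.ceil(t.length / 4);

const small = countTokensBoundedSketch("hello world", roughCounter);
const large = countTokensBoundedSketch("A".repeat(100_000), roughCounter);
console.log(small, large);
```

With this shape, the work done by the exact counter is capped at three windows of `sampleChars` characters regardless of input size, while the estimate stays proportional to the full length.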

Why

Fixes #558. To reproduce, feed a very large repetitive payload such as "A".repeat(100_000) into any token-counting path; exact BPE counting can spend multiple seconds there before shrink/healing gets a chance to reduce the content. The bounded counter avoids that stall while preserving a proportional estimate for context budgeting and UI telemetry.

How to verify

npm run verify

Regression coverage added for countTokensBounded("A".repeat(100_000)) and for /context breakdown over a 100KB tool result, both asserting the pathological path stays under 1 second.
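A wall-clock regression check of this kind could look roughly like the sketch below. The helper `expectUnderMs` and the stand-in `standInCounter` are hypothetical names used only for illustration; the project's actual tests use its own test runner and call countTokensBounded directly.

```typescript
// Hypothetical helper: run fn once and fail if it takes budgetMs or longer.
function expectUnderMs<T>(label: string, budgetMs: number, fn: () => T): T {
  const start = Date.now();
  const result = fn();
  const elapsedMs = Date.now() - start;
  if (elapsedMs >= budgetMs) {
    throw new Error(`${label} took ${elapsedMs} ms (budget ${budgetMs} ms)`);
  }
  return result;
}

// Stand-in for the function under test; a real suite would call countTokensBounded.
const standInCounter = (text: string): number => Math.ceil(text.length / 4);

// Same pathological payload as in the PR description.
const payload = "A".repeat(100_000);
const tokens = expectUnderMs("bounded count of 100k 'A's", 1_000, () =>
  standInCounter(payload),
);
console.log(tokens);
```

Wall-clock assertions can be noisy on loaded CI machines, so a budget well above the expected runtime (here 1 second) helps keep such a test from flaking.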

Checklist

npm run verify passes locally (lint + typecheck + tests + comment-policy gate)
No Co-Authored-By: Claude trailer in commits
Comments follow CONTRIBUTING.md (no module-essay headers, no incident history)
No edits to CHANGELOG.md — release notes are maintainer-written at release time

esengine · 2026-05-17T00:56:24Z

Thanks @GTC2080 — clean fix and great test coverage on the perf bound. Merged.

fix(tokenizer): bound pathological token counting

6217ac5

GTC2080 marked this pull request as ready for review May 16, 2026 16:11

esengine merged commit 8fd3989 into esengine:main May 17, 2026
5 checks passed

ChasLui pushed a commit to ChasLui/DeepSeek-Reasonix that referenced this pull request May 23, 2026

fix(tokenizer): bound pathological token counting (esengine#1036)

ad8264f

GTC2080 deleted the GTC/bound-pathological-token-counting branch May 31, 2026 10:43

esengine merged 1 commit into esengine:main from GTC2080:GTC/bound-pathological-token-counting
