Audit `countTokens` callers — add upper-char bound to defend non-MCP paths from pathological BPE input

## Summary

A user report flagged the pure-TS BPE tokenizer as O(n²) on
pathological repetitive input (`AAAA…`) and claimed multiple call
paths could hang for 30s+. Most of the report misread our code —
specifically, the `Never tokenizes full input — pathological
repetitive text (AAAA…) costs 30s+...` comment at
`src/mcp/registry.ts:152` is **defense documentation**, not a bug
admission. `truncateForModelByTokens` is deliberately designed to
never feed full input into BPE; `sizePrefixToTokens` only tokenizes
budget-bounded slices, capped at 6 iterations. That whole MCP
truncation path is already safe.

But the report did stumble into a narrower, real exposure that is
worth fixing.

## The actual exposure

Three call sites tokenize **full message content** with only a
lower-bound short-circuit (`length <= maxTokens` skips), and no upper
char cap:

1. **`src/loop/shrink.ts:52`** — `shrinkOversizedToolResultsByTokens`:
   ```ts
   if (content.length <= maxTokens) return msg;
   const beforeTokens = countTokens(content);  // full content
   ```
2. **`src/loop/shrink.ts:83`** — same pattern for `tool_calls.arguments`.
3. **`src/tokenizer.ts:258` `estimateConversationTokens`** — preflight
   sums `countTokens(m.content)` over every message in the
   conversation; no per-message char cap.

Most of the time these are safe because upstream caps (MCP default
`maxResultChars: 8000`, subagent default 8000) keep individual
strings small. But:

- Non-MCP tools (`read_file`, shell stdout) don't share the MCP cap —
  a `read_file` of a 200KB log file lands in the message log at full
  size, and the next preflight tokenizes the whole thing.
- A repetitive payload (CSV columns, log timestamps, base64 chunks)
  hits the BPE inner loop's worst case where merges keep finding
  matches across the full string.

Worst case isn't 30s on realistic inputs — that number was
speculation in the report — but tokenizing 200KB of moderately
repetitive content on the pure-TS port does take seconds, and it
runs synchronously on the main thread. That's enough to:

- Stall the loop's preflight
- Make `decidePreflight` (`src/context-manager.ts:107`) feel like a
  freeze on long sessions
- Compound when shrink + preflight + estimate all touch the same
  oversized message in one turn
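
To replace the speculative numbers with measured ones, a small harness along these lines could time a tokenizer on repetitive input of growing size. This is a sketch: `timeTokenizer` and the `TokenCounter` type are hypothetical names, not in-tree code, and the counter is passed in so the block stays self-contained. `Date.now()` is coarse, which is acceptable when the times of interest are in the hundreds of milliseconds or more.

```ts
// Hypothetical stand-in type for `countTokens` (assumption: string in, count out).
type TokenCounter = (text: string) => number;

// Times `count` on inputs built by repeating `unit` up to each requested size.
function timeTokenizer(
  count: TokenCounter,
  sizes: number[],
  unit = "A",
): { chars: number; ms: number }[] {
  return sizes.map((chars) => {
    // Build exactly `chars` characters of repetitive input.
    const input = unit.repeat(Math.ceil(chars / unit.length)).slice(0, chars);
    const start = Date.now();
    count(input);
    return { chars, ms: Date.now() - start };
  });
}
```

Calling it as `timeTokenizer(countTokens, [8_000, 32_000, 200_000])` would show whether time grows faster than linearly with input size, and where a reasonable char cap sits.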

## Proposed fix

Add a bounded-tokenize helper alongside `countTokens`:

```ts
// Returns an exact count when feasible, an estimate for oversized
// inputs. Never tokenizes more than `maxChars` of the input.
export function countTokensBounded(text: string, maxChars: number): number;
```

Implementation: when `text.length <= maxChars`, just delegate to
`countTokens`. Otherwise tokenize a head + tail sample (same shape as
`registry.ts:170-182` already uses for size estimation) and scale by
char/token ratio. The math is already in-tree at `registry.ts:180`:

```ts
const ratio = sampleChars > 0 ? sampleTokens / sampleChars : 0.3;
const estTotalTokens = Math.ceil(s.length * ratio);
```

Then swap the three sites above to call `countTokensBounded` with a
sane cap (e.g. 32K chars: well above any realistic single message that
matters for a token-budget decision, well below the slow zone).

For `shrinkOversizedToolResultsByTokens` an estimate is acceptable
because the function's job is just deciding "is this over budget";
a count within ±10% is enough to gate the truncation pass.
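
A minimal sketch of the helper described above, under stated assumptions: the third `count` parameter is hypothetical and only there to keep the block self-contained (the real helper would call `countTokens` directly), and the even head/tail split is a guess at the sampling shape rather than a copy of what `registry.ts` does.

```ts
// Hypothetical stand-in type for `countTokens` (assumption: string in, count out).
type TokenCounter = (text: string) => number;

export function countTokensBounded(
  text: string,
  maxChars: number,
  count: TokenCounter,
): number {
  // Small inputs: exact count, same behavior as today.
  if (text.length <= maxChars) return count(text);

  // Oversized inputs: tokenize at most `maxChars` characters in total,
  // split between the head and the tail of the string.
  const half = Math.floor(maxChars / 2);
  const head = text.slice(0, half);
  const tail = text.slice(text.length - half);
  const sampleChars = head.length + tail.length;
  const sampleTokens = count(head) + count(tail);

  // Same ratio math as the registry.ts snippet; 0.3 tokens/char is the
  // fallback when the sample is empty.
  const ratio = sampleChars > 0 ? sampleTokens / sampleChars : 0.3;
  return Math.ceil(text.length * ratio);
}
```

One caveat with this shape: `slice` works on UTF-16 code units, so a sample boundary can split a surrogate pair. For a budgeting estimate that is probably harmless, but it is worth checking how the tokenizer handles a lone surrogate.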

## Out of scope

- Replacing the pure-TS BPE port with a native binding. The TS port
  is "good enough for budgeting decisions" by design; native is a
  separate, much larger change.
- Caching `countTokens` results. Useful but orthogonal.
- The MCP truncation path. Already defended; do not touch.

## Credit

Original report flagged the right family of risk even though the
specific claims were wrong. The 30s number isn't real on observed
inputs and the cited comment was misread, but the underlying "we
sometimes tokenize unbounded user input on the main thread" exposure
is genuine and worth closing.

