Token usage badge + Fast/Smart/Auto model picker #6

Merged
shacharPash merged 5 commits into main from team-feedback-improvements on Jun 4, 2026

@shacharPash

Owner

Two improvements coming out of the team demo. Each is one commit.

1. Token usage badge (commit b80*)

Surfaces total Claude tokens used in the current PR session as a compact pill in the header (◇ 38k), with input/output/cache breakdown in the tooltip. The badge hides until the first usage event arrives.

  • claudeRunner.ts: new optional onUsage callback fired from the stream-json result event
  • All 6 stream routes emit an event: usage SSE frame; send() widened to accept unknown so it can carry an object alongside the existing string payloads
  • New shared/usage.ts with TokenUsage, addUsage, totalTokens, formatTokens (1234 → "1.2k", 1500000 → "1.5M") + 7 unit tests
  • Zustand tokenUsage accumulates per PR session; a shared attachUsageListener plugs into every stream
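
For reviewers who want a feel for the shared helpers, here is a minimal sketch of what shared/usage.ts could look like. The field names on TokenUsage and the rounding rule (one decimal below 10 units, whole numbers above, so 38412 reads as "38k") are assumptions made for illustration; only the helper names and the two documented examples come from this PR.

```typescript
// Sketch of shared/usage.ts. Field names are assumptions, not the merged code.
interface TokenUsage {
  input: number;
  output: number;
  cacheRead: number;
  cacheCreation: number;
}

const EMPTY_USAGE: TokenUsage = { input: 0, output: 0, cacheRead: 0, cacheCreation: 0 };

// Field-wise sum. Returns a new object so existing state is never mutated.
function addUsage(a: TokenUsage, b: TokenUsage): TokenUsage {
  return {
    input: a.input + b.input,
    output: a.output + b.output,
    cacheRead: a.cacheRead + b.cacheRead,
    cacheCreation: a.cacheCreation + b.cacheCreation,
  };
}

// Single number shown in the badge.
function totalTokens(u: TokenUsage): number {
  return u.input + u.output + u.cacheRead + u.cacheCreation;
}

// Assumed rounding rule: one decimal below 10 units, whole numbers from 10 upward.
function scaled(value: number, suffix: string): string {
  const text = value < 10 ? value.toFixed(1) : String(Math.round(value));
  return text + suffix;
}

function formatTokens(n: number): string {
  if (n >= 1000000) return scaled(n / 1000000, "M");
  if (n >= 1000) return scaled(n / 1000, "k");
  return String(n);
}
```

Under these assumptions, a session total is just `formatTokens(totalTokens(addUsage(previous, incoming)))`.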

2. Model picker — Fast / Smart / Auto (commit 6412613)

Reviewers can now dial the AI:

  • Fast — Sonnet everywhere. ~3× faster, much cheaper, good enough for most PRs
  • Smart — Opus everywhere. Slower / pricier but better on dense logic
  • Auto (default) — keeps the per-route hardcoded defaults (Sonnet for short outputs, CLI default for the rest)

Persisted in localStorage. Mid-flight streams keep their model — the next call uses the new pick.

  • preferences.ts: ModelPreference type + setter
  • HeaderControls: new SegmentedControl with per-option tooltips
  • store.ts: modelParam() helper appends &model=... to every stream URL (or returns "" for auto)
  • claudeRunner: exported validateModelParam(raw) accepts only sonnet / opus / haiku — defends against URL tampering. 4 unit tests
  • Each /api/*/stream route honors ?model=, overriding its hardcoded default when set
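
A rough sketch of the client-side half of this wiring, for illustration only. The real modelParam() reads the preference from the store; here it takes the preference as an argument so the example is self-contained. The `pr` query parameter and the streamUrl helper are hypothetical names, and the preference values are inferred from the stored values mentioned elsewhere in this PR.

```typescript
// Assumed shape of the persisted preference at this commit.
type ModelPreference = "auto" | "sonnet" | "opus";

// Returns the query suffix appended to every stream URL, or "" for auto.
function modelParam(pref: ModelPreference): string {
  return pref === "auto" ? "" : "&model=" + pref;
}

// Hypothetical URL builder; the `pr` parameter name is an assumption.
function streamUrl(route: string, prRef: string, pref: ModelPreference): string {
  return "/api/" + route + "/stream?pr=" + encodeURIComponent(prRef) + modelParam(pref);
}
```

Because the suffix is computed when the URL is built, a stream that is already open keeps whatever model it started with.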

Verification

  • npm run typecheck clean after each feature
  • npm test — 65 tests pass (was 54)
  • Manual: loaded cli/cli#13510, watched badge tick to ◇ 38k after streams completed; flipped picker to Smart → all 6 stream URLs carried &model=opus; back to Auto → param disappeared

🤖 Generated with Claude Code

shacharPash and others added 5 commits June 1, 2026 15:27
Claude's stream-json `result` event carries an input/output/cache usage
block alongside the final text. We were dropping it. Capture it and
surface a running session total so the reviewer can see at a glance what
this PR cost in tokens — useful both for cost-awareness and for spotting
PRs that hit anomalous diff sizes.

- claudeRunner: new optional onUsage callback, fired once per run from
  the result event. Existing callers (aiComment, etc) unaffected.
- All 6 stream routes (tldr, headline, beforeAfter, complexity, diagram,
  explain) now emit an `event: usage` SSE frame with the normalized
  TokenUsage object. send() widened to accept unknown so we can pass an
  object alongside the existing string payloads (client decode() already
  JSON.parses, so no client format change needed).
- shared/usage.ts: TokenUsage type, EMPTY_USAGE, addUsage, totalTokens,
  formatTokens (1234 → "1.2k", 1500000 → "1.5M"). 7 unit tests.
- Zustand store: new tokenUsage field, reset on loadPR, accumulated via
  a shared attachUsageListener helper plugged into every stream.
- TokenBadge: compact "◇ 38k" pill in HeaderControls with the per-field
  breakdown in the tooltip. Hidden until the first usage event arrives
  so the empty state doesn't confuse first-time users.
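
To illustrate why widening send() needs no client format change, here is a small sketch of SSE framing where every payload goes through JSON. The function names formatSseFrame and decodeSseData are hypothetical stand-ins for the routes' send() and the client's decode(); the field names in the usage object are assumptions.

```typescript
// Serialize any JSON-compatible value into one SSE frame.
// JSON.stringify never emits raw newlines, so one data: line is enough.
function formatSseFrame(event: string, data: unknown): string {
  return "event: " + event + "\ndata: " + JSON.stringify(data) + "\n\n";
}

// Client side: strip the "data: " prefix and parse. Strings and objects
// take the same path, which is why the wire format did not change.
function decodeSseData(dataLine: string): unknown {
  return JSON.parse(dataLine.slice("data: ".length));
}
```

A string chunk is sent as `data: "hi"` and a usage object as `data: {"input":1200,...}`; both decode with the same JSON.parse call.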

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The team's "headline is slow" complaint was driven by Opus being the
user's CLI default. We already force Sonnet for short outputs server-side,
but reviewers couldn't dial the rest up or down. Now they can:

- Fast (Sonnet): force Sonnet for every AI call — cheap, ~3× faster
- Smart (Opus): force Opus everywhere — slower but better on dense logic
- Auto (default): per-route defaults stay in effect (Sonnet for short
  outputs, CLI default for the rest)

Wiring:
- preferences.ts: ModelPreference type + persisted modelPreference (default 'auto')
- HeaderControls: new SegmentedControl labeled "AI model" with tooltips on
  each option explaining the trade-off
- store.ts: modelParam() helper reads the preference and emits the
  `&model=...` query suffix (or "" for 'auto'). All 6 stream URLs append it.
  Captured at stream-open time, so flipping the picker mid-stream affects
  only NEW calls — the current run keeps its model.
- claudeRunner: new exported validateModelParam(q) — accepts 'sonnet' |
  'opus' | 'haiku', returns undefined for anything else (defends against
  URL tampering). 4 unit tests.
- Each route reads req.query.model via the validator and either overrides
  its hardcoded default (headline, explain) or applies the user's choice
  to a previously-defaultless route (tldr, beforeAfter, complexity, diagram).
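
A minimal sketch of the server-side validation and override described above. validateModelParam is the name used in this commit; resolveModel and the ModelName type are hypothetical helpers added for illustration, and the real routes may combine the two differently.

```typescript
type ModelName = "sonnet" | "opus" | "haiku";

// Accept only the three known aliases. Anything else, including arrays
// from repeated query params, yields undefined.
function validateModelParam(raw: unknown): ModelName | undefined {
  if (raw === "sonnet" || raw === "opus" || raw === "haiku") return raw;
  return undefined;
}

// Hypothetical per-route resolution: a valid user choice wins, otherwise
// the route's hardcoded default (if any), otherwise undefined, meaning
// "pass no --model flag and let the CLI default apply".
function resolveModel(raw: unknown, routeDefault?: ModelName): ModelName | undefined {
  return validateModelParam(raw) ?? routeDefault;
}
```

With this shape, a route like headline would call `resolveModel(req.query.model, "sonnet")`, while a defaultless route would call `resolveModel(req.query.model)`.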

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two team-feedback follow-ups:

1) Token badge now reads "◇ 45k tokens" (was just "◇ 45k"). The number
   alone wasn't legible as a unit — "tokens" makes the meaning obvious
   without growing the pill much.

2) The View (Split/Unified) and Theme (GitHub/VS Code/IntelliJ) pickers
   used to live in the global header, but they only affect the code
   viewer — they had nothing to do with the PR meta they sat next to.
   Pulled them out into a new DiffToolbar that sits above the diff
   column. The header keeps only the token + model controls, which is
   what you scan when triaging.

The toolbar uses small uppercase "VIEW" / "THEME" eyebrow labels so the
controls read clearly without crowding.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Smart used to promote every AI route to Opus — ~86k tokens on a routine
PR, with most of that spend going to short-output routes (headline,
before-after, complexity, persona tabs) where Opus adds zero quality
over Sonnet. Auto's behavior silently depended on each user's `claude`
CLI default, producing inconsistent token bills across teammates.

Collapse to two pills with a real meaning:

- Fast — Sonnet on every route. Cheap, snappy, fine for routine PRs.
- Smart — Opus on TL;DR + diagram (where reasoning genuinely moves
  quality); Sonnet on the short outputs.

Default for new and existing users is Smart, so the demo experience
shows the tool's strongest reasoning out of the box. Stored prefs
migrate: 'auto'/'opus' → 'smart', 'sonnet' → 'fast'.
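
The migration rule can be sketched as a pure function over the stored string. The function name is hypothetical, and treating unknown or missing values as 'smart' is an assumption that follows from Smart being the default for new and existing users.

```typescript
type Mode = "fast" | "smart";

// Map a previously stored preference (or null when nothing is stored)
// onto the new two-value scheme.
function migrateModelPreference(stored: string | null): Mode {
  if (stored === "fast" || stored === "sonnet") return "fast";
  // 'smart', 'auto', 'opus', missing, or unrecognized values all land on smart.
  return "smart";
}
```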

Per-route logic lives server-side: routes self-declare their tier
(heavy/light) and call a single pickModel(req.query.mode, tier) helper.
Client just sends &mode=fast|smart on every stream URL. The old
validateModelParam abstraction is gone — pickModel replaces it.

7 unit tests cover the heavy/light x fast/smart matrix plus invalid
input handling (defaults toward smart on hand-crafted demo URLs).
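
A minimal sketch of the tier-based selection, assuming pickModel returns a model alias string. The type names are illustrative; the fast/smart and heavy/light behavior follows the description in this commit message.

```typescript
type Tier = "heavy" | "light";
type Model = "sonnet" | "opus";

// mode comes straight from req.query.mode, so it is untrusted input.
function pickModel(mode: unknown, tier: Tier): Model {
  // Fast: Sonnet on every route.
  if (mode === "fast") return "sonnet";
  // Smart, and any invalid or missing mode: Opus only on heavy routes.
  return tier === "heavy" ? "opus" : "sonnet";
}
```

A heavy route such as TL;DR would call `pickModel(req.query.mode, "heavy")`, and a light route such as headline would pass `"light"`.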

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The token badge claimed to show "total tokens used in this PR session"
but the two inline-comment helpers were invisible to it. They live in
aiComment.ts, which spawns claude directly with plain -p (no stream-json),
so there's no usage block to parse — the route returns plain text and
exits without telling anyone what it cost. Result: reviewers writing
several "AI suggested fix" calls saw the meter stay flat while real
tokens were being burned.

Fix:
- aiComment.ts now invokes claude with --output-format stream-json
  --verbose (same as the streaming routes), parses the result event for
  usage, and returns { text, usage } from the API. Hardcoded --model
  sonnet because these helpers are short-output (a code suggestion or a
  1-3 sentence rewrite); Opus adds zero quality here.
- New shared helper normalizeClaudeUsage() in claudeRunner.ts converts
  Claude's snake_case usage block into the camelCase TokenUsage the
  client expects. Both the streaming runner and the one-shot aiComment
  use it, so the two paths can't drift on field naming.
- New store action recordUsage(u) for non-SSE callers to accumulate
  into the same tokenUsage total the meter reads.
- InlineCommentsLayer pipes the returned usage into recordUsage when
  the AI call comes back.
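
A sketch of how the normalization and the non-SSE accumulation could fit together. The snake_case keys are the ones Claude reports in its usage block; the camelCase field names, the defensive handling of missing fields, and the module-level sessionUsage variable (standing in for the Zustand store) are assumptions for illustration.

```typescript
interface TokenUsage {
  input: number;
  output: number;
  cacheRead: number;
  cacheCreation: number;
}

// Convert Claude's snake_case usage block into the client's camelCase shape.
// Missing or non-numeric fields count as zero.
function normalizeClaudeUsage(raw: unknown): TokenUsage {
  const u = (typeof raw === "object" && raw !== null ? raw : {}) as Record<string, unknown>;
  const num = (v: unknown): number => (typeof v === "number" && isFinite(v) ? v : 0);
  return {
    input: num(u["input_tokens"]),
    output: num(u["output_tokens"]),
    cacheRead: num(u["cache_read_input_tokens"]),
    cacheCreation: num(u["cache_creation_input_tokens"]),
  };
}

// Stand-in for the store's tokenUsage field.
let sessionUsage: TokenUsage = { input: 0, output: 0, cacheRead: 0, cacheCreation: 0 };

// Non-SSE callers add their usage to the same running total the meter reads.
function recordUsage(u: TokenUsage): void {
  sessionUsage = {
    input: sessionUsage.input + u.input,
    output: sessionUsage.output + u.output,
    cacheRead: sessionUsage.cacheRead + u.cacheRead,
    cacheCreation: sessionUsage.cacheCreation + u.cacheCreation,
  };
}
```

Routing both the streaming runner and aiComment through one normalizer keeps the field mapping in a single place.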

Net result: the meter now reflects every Claude call in the session,
not just the streaming ones. Honest cost visibility.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@shacharPash shacharPash merged commit 369ce53 into main Jun 4, 2026
1 check passed
@shacharPash shacharPash deleted the team-feedback-improvements branch June 4, 2026 08:42