Token usage badge + Fast/Smart/Auto model picker#6
Merged
Conversation
Claude's stream-json `result` event carries an input/output/cache usage block alongside the final text. We were dropping it. Capture it and surface a running session total so the reviewer can see at a glance what this PR cost in tokens — useful both for cost-awareness and for spotting PRs that hit anomalous diff sizes. - claudeRunner: new optional onUsage callback, fired once per run from the result event. Existing callers (aiComment, etc) unaffected. - All 6 stream routes (tldr, headline, beforeAfter, complexity, diagram, explain) now emit an `event: usage` SSE frame with the normalized TokenUsage object. send() widened to accept unknown so we can pass an object alongside the existing string payloads (client decode() already JSON.parses, so no client format change needed). - shared/usage.ts: TokenUsage type, EMPTY_USAGE, addUsage, totalTokens, formatTokens (1234 → "1.2k", 1500000 → "1.5M"). 7 unit tests. - Zustand store: new tokenUsage field, reset on loadPR, accumulated via a shared attachUsageListener helper plugged into every stream. - TokenBadge: compact "◇ 38k" pill in HeaderControls with the per-field breakdown in the tooltip. Hidden until the first usage event arrives so the empty state doesn't confuse first-time users. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The team's "headline is slow" complaint was driven by Opus being the user's CLI default. We already force Sonnet for short outputs server-side, but reviewers couldn't dial the rest up or down. Now they can: - Fast (Sonnet): force Sonnet for every AI call — cheap, ~3× faster - Smart (Opus): force Opus everywhere — slower but better on dense logic - Auto (default): per-route defaults stay in effect (Sonnet for short outputs, CLI default for the rest) Wiring: - preferences.ts: ModelPreference type + persisted modelPreference (default 'auto') - HeaderControls: new SegmentedControl labeled "AI model" with tooltips on each option explaining the trade-off - store.ts: modelParam() helper reads the preference and emits the `&model=...` query suffix (or "" for 'auto'). All 6 stream URLs append it. Captured at stream-open time, so flipping the picker mid-stream affects only NEW calls — the current run keeps its model. - claudeRunner: new exported validateModelParam(q) — accepts 'sonnet' | 'opus' | 'haiku', returns undefined for anything else (defends against URL tampering). 4 unit tests. - Each route reads req.query.model via the validator and either overrides its hardcoded default (headline, explain) or applies the user's choice to a previously-defaultless route (tldr, beforeAfter, complexity, diagram). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two team-feedback follow-ups: 1) Token badge now reads "◇ 45k tokens" (was just "◇ 45k"). The number alone wasn't legible as a unit — "tokens" makes the meaning obvious without growing the pill much. 2) The View (Split/Unified) and Theme (GitHub/VS Code/IntelliJ) pickers used to live in the global header, but they only affect the code viewer — they had nothing to do with the PR meta they sat next to. Pulled them out into a new DiffToolbar that sits above the diff column. The header keeps only the token + model controls, which is what you scan when triaging. The toolbar uses small uppercase "VIEW" / "THEME" eyebrow labels so the controls read clearly without crowding. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Smart used to promote every AI route to Opus — ~86k tokens on a routine PR, with most of that spend going to short-output routes (headline, before-after, complexity, persona tabs) where Opus adds zero quality over Sonnet. Auto's behavior silently depended on each user's `claude` CLI default, producing inconsistent token bills across teammates. Collapse to two pills with a real meaning: - Fast — Sonnet on every route. Cheap, snappy, fine for routine PRs. - Smart — Opus on TL;DR + diagram (where reasoning genuinely moves quality); Sonnet on the short outputs. Default for new and existing users is Smart, so the demo experience shows the tool's strongest reasoning out of the box. Stored prefs migrate: 'auto'/'opus' → 'smart', 'sonnet' → 'fast'. Per-route logic lives server-side: routes self-declare their tier (heavy/light) and call a single pickModel(req.query.mode, tier) helper. Client just sends &mode=fast|smart on every stream URL. The old validateModelParam abstraction is gone — pickModel replaces it. 7 unit tests cover the heavy/light x fast/smart matrix plus invalid input handling (defaults toward smart on hand-crafted demo URLs). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The token badge claimed to show "total tokens used in this PR session"
but the two inline-comment helpers were invisible to it. They live in
aiComment.ts, which spawns claude directly with plain -p (no stream-json),
so there's no usage block to parse — the route returns plain text and
exits without telling anyone what it cost. Result: reviewers writing
several "AI suggested fix" calls saw the meter stay flat while real
tokens were being burned.
Fix:
- aiComment.ts now invokes claude with --output-format stream-json
--verbose (same as the streaming routes), parses the result event for
usage, and returns { text, usage } from the API. Hardcoded --model
sonnet because these helpers are short-output (a code suggestion or a
1-3 sentence rewrite); Opus adds zero quality here.
- New shared helper normalizeClaudeUsage() in claudeRunner.ts converts
Claude's snake_case usage block into the camelCase TokenUsage the
client expects. Both the streaming runner and the one-shot aiComment
use it, so the two paths can't drift on field naming.
- New store action recordUsage(u) for non-SSE callers to accumulate
into the same tokenUsage total the meter reads.
- InlineCommentsLayer pipes the returned usage into recordUsage when
the AI call comes back.
Net result: the meter now reflects every Claude call in the session,
not just the streaming ones. Honest cost visibility.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Two improvements coming out of the team demo. Each is one commit.
1. Token usage badge (commit b80*)
Surfaces total Claude tokens used in the current PR session as a compact pill in the header (
◇ 38k), with input/output/cache breakdown in the tooltip. The badge hides until the first usage event arrives.claudeRunner.ts: new optionalonUsagecallback fired from the stream-jsonresulteventevent: usageSSE frame;send()widened to acceptunknownso it can carry an object alongside the existing string payloadsshared/usage.tswithTokenUsage,addUsage,totalTokens,formatTokens(1234 → "1.2k", 1.5M → "1.5M") + 7 unit teststokenUsageaccumulates per PR session; a sharedattachUsageListenerplugs into every stream2. Model picker — Fast / Smart / Auto (commit 6412613)
Reviewers can now dial the AI:
Persisted in
localStorage. Mid-flight streams keep their model — the next call uses the new pick.preferences.ts:ModelPreferencetype + setterHeaderControls: newSegmentedControlwith per-option tooltipsstore.ts:modelParam()helper appends&model=...to every stream URL (or returns "" forauto)claudeRunner: exportedvalidateModelParam(raw)accepts onlysonnet/opus/haiku— defends against URL tampering. 4 unit tests/api/*/streamroute honors?model=, overriding its hardcoded default when setVerification
npm run typecheckclean after each featurenpm test— 65 tests pass (was 54)cli/cli#13510, watched badge tick to◇ 38kafter streams completed; flipped picker to Smart → all 6 stream URLs carried&model=opus; back to Auto → param disappeared🤖 Generated with Claude Code