
Add live task usage/context telemetry and aggregate usage metrics#326

Merged
subsy merged 9 commits into main from fix/task-usage-indicators
Feb 20, 2026

Conversation

@subsy (Owner) commented Feb 20, 2026

Summary

  • add live per-task token usage tracking (input/output/total) from structured agent telemetry and output parsing
  • add in-flight context tracking (remaining % + tokens) when a context window can be resolved
  • resolve context windows via models.dev with shorthand fallbacks
  • detect runtime model from JSONL events when model is not explicitly configured
  • propagate runtime model through engine/remote state via agent:model updates
  • add dashboard aggregate input/output/total usage across all tasks in a run
  • move usage details into the existing timing panel and right-align I/O/T on the same row as start/end/duration
  • persist usage metadata in iteration logs and add parser coverage tests
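
Sketched minimally, the accumulator behind the live tracking above could look like this (an illustration only; the real module in src/plugins/agents/usage.ts also handles model extraction and context-window fields, and field names beyond inputTokens/outputTokens/totalTokens are assumptions):

```typescript
// Minimal sketch of a per-iteration token-usage accumulator.
interface TokenUsageSample {
  inputTokens?: number;
  outputTokens?: number;
  totalTokens?: number;
}

interface TokenUsageSummary {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
  events: number; // how many usage samples have been folded in
}

class TokenUsageAccumulator {
  private summary: TokenUsageSummary = {
    inputTokens: 0,
    outputTokens: 0,
    totalTokens: 0,
    events: 0,
  };

  add(sample: TokenUsageSample): void {
    const input = sample.inputTokens ?? 0;
    const output = sample.outputTokens ?? 0;
    this.summary.inputTokens += input;
    this.summary.outputTokens += output;
    // Fall back to input + output when the agent omits an explicit total.
    this.summary.totalTokens += sample.totalTokens ?? input + output;
    this.summary.events += 1;
  }

  getSummary(): TokenUsageSummary {
    return { ...this.summary };
  }

  reset(): void {
    this.summary = { inputTokens: 0, outputTokens: 0, totalTokens: 0, events: 0 };
  }
}
```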

Validation

  • bun test tests/plugins/agents/usage.test.ts
  • bun run typecheck
  • bun run build
  • UBS_MAX_DIR_SIZE_MB=0 ubs --staged

Summary by CodeRabbit

  • New Features

    • Live per-iteration token-usage tracking (input, output, total), incremental telemetry events and aggregate dashboard reporting.
    • Automatic detection and exposure of the active model for iterations and logs.
  • UI

    • Task list, timing/RightPanel and iteration details display token metrics, remaining context and compact token-count formatting.
    • Iteration logs and live views persist and show token-usage/context-window info.
  • API

    • Iteration outputs, logs and engine state include optional usage and current model fields.
  • Tests

    • Comprehensive tests for usage extraction, accumulation and metadata round-trips.

@vercel

vercel bot commented Feb 20, 2026

The latest updates on your projects.

1 Skipped Deployment: ralph-tui (Preview) was ignored; updated Feb 20, 2026 9:25pm (UTC).


@coderabbitai coderabbitai bot (Contributor) commented Feb 20, 2026

Walkthrough

Adds token-usage and model telemetry across runtime, persistence, remote API and TUI: a new usage extraction/accumulator module, per-iteration usage/model tracking in the engine, usage persisted in iteration logs and exposed via remote responses, and surfaced in the TUI with formatting and tests.

Changes

  • Token usage core (src/plugins/agents/usage.ts): new module with TokenUsageSample/TokenUsageSummary types, model extraction, token extraction from JSON/JSONL, summarizeTokenUsageFromOutput, TokenUsageAccumulator and helpers.
  • Engine & runtime types (src/engine/index.ts, src/engine/types.ts): per-iteration TokenUsageAccumulator; engine emits agent:usage and agent:model; EngineState.currentModel and IterationResult.usage added; iteration logs include the active model.
  • Logging persistence & types (src/logs/persistence.ts, src/logs/types.ts, tests/logs/persistence.test.ts): iteration metadata now carries usage; header formatting/parsing updated to render and parse usage fields; round-trip tests added.
  • Streaming output parser (src/tui/output-parser.ts, src/tui/output-parser.test.ts): StreamingOutputParser accumulates usage via TokenUsageAccumulator, exposes getUsage() and resets the accumulator on reset; tests added for JSONL usage extraction.
  • Remote layer (src/remote/types.ts, src/remote/client.ts, src/remote/instance-manager.ts, src/remote/server.ts): iteration output responses extended with optional usage?: TokenUsageSummary; server returns live/completed usage derived from engine output or stored iteration data; client/manager types updated.
  • TUI integration & components (src/tui/components/RunApp.tsx, src/tui/components/RightPanel.tsx, src/tui/components/LeftPanel.tsx, src/tui/components/ProgressDashboard.tsx, src/tui/types.ts, src/tui/utils/token-format.ts): UI state/props extended to carry per-task and aggregate usage; model-context resolution to normalise usage; new formatTokenCount utility; UI renders compact usage indicators and a combined timing and usage panel.
  • Tests for usage utilities (tests/plugins/agents/usage.test.ts): comprehensive unit tests for extraction, accumulator behaviour, summarisation, model extraction and percent normalisation.
  • Misc wiring (src/remote/server.ts, src/remote/client.ts, src/remote/instance-manager.ts): minor type/response wiring to include usage and currentModel in state and iteration outputs.

Sequence Diagram

sequenceDiagram
    participant Agent as Agent Output
    participant Parser as StreamingOutputParser
    participant Engine as ExecutionEngine
    participant Events as EventEmitter
    participant Logs as LogPersistence
    participant UI as TUI

    Agent->>Parser: stream JSONL lines
    Parser->>Parser: extractTokenUsageFromJsonLine() / TokenUsageAccumulator.add()
    Parser->>Engine: emit agent output (includes taskId, iteration)
    Engine->>Engine: detect model, update state.currentModel
    Engine->>Events: emit agent:model
    Engine->>Engine: accumulate iteration usage
    Engine->>Events: emit agent:usage (incremental)
    Engine->>Logs: saveIterationLog (include usage, model)
    Logs->>Logs: formatMetadataHeader() with usage fields
    Events->>UI: update per-task & aggregate usage displays
    UI->>UI: render timing + input/output/total tokens and context info

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~55 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 75.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The title accurately summarises the main changeset: adding live task usage/context telemetry and aggregate usage metrics, matching the extensive changes to tracking, persisting and displaying token usage data.


@coderabbitai coderabbitai bot (Contributor) left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/tui/components/LeftPanel.tsx (1)

79-111: ⚠️ Potential issue | 🟡 Minor

Guard against overflow when the panel is narrow.
If maxWidth is small, titleWidth can go negative; clamping to 5 still forces the row wider than the panel. Consider omitting the usage indicator when space is tight.

Suggested adjustment to avoid overflow
-  const usageIndicator = formatTaskUsageIndicator(task);
-  const usageIndicatorWidth = usageIndicator.length + 1;
-  const titleWidth = maxWidth - indentWidth - 3 - idDisplay.length - usageIndicatorWidth;
+  const baseWidth = maxWidth - indentWidth - 3 - idDisplay.length;
+  let usageIndicator = '';
+  let usageIndicatorWidth = 0;
+  if (baseWidth > 6) {
+    usageIndicator = formatTaskUsageIndicator(task);
+    usageIndicatorWidth = usageIndicator.length + 1;
+  }
+  const titleWidth = baseWidth - usageIndicatorWidth;
   const truncatedTitle = truncateText(task.title, Math.max(5, titleWidth));
-        <span fg={colors.fg.dim}> {usageIndicator}</span>
+        {usageIndicator && <span fg={colors.fg.dim}> {usageIndicator}</span>}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tui/components/LeftPanel.tsx` around lines 79 - 111, The titleWidth
calculation can be negative when maxWidth is small (causing overflow even with
Math.max(5,...)); change the layout logic in LeftPanel so you compute remaining
space after id and indent, then decide whether to render the usageIndicator
(formatTaskUsageIndicator) only if there is enough room: compute
availableForTitle = Math.max(0, maxWidth - indentWidth - 3 - idDisplay.length)
and if availableForTitle <= minimalTitlePlusIndicator (e.g., 5 +
usageIndicator.length+1) omit usageIndicator and set titleWidth = Math.max(5,
availableForTitle) otherwise include usageIndicatorWidth and set titleWidth
appropriately; update truncatedTitle via truncateText(title, titleWidth) and
only render the <span> for usageIndicator when you decided to include it.
src/tui/components/RunApp.tsx (1)

903-910: ⚠️ Potential issue | 🟠 Major

Remote agent:output event triggers a full remote state fetch for subagent tree on every stdout chunk.

Lines 904-910 call instanceManager.getRemoteState() on every agent:output event to refresh the subagent tree. This is a network round-trip per output chunk, which could be very frequent during active streaming. The getRemoteState() call fetches the entire remote state, not just the subagent tree.

This pattern could create significant network overhead and latency. Consider debouncing or throttling the subagent tree refresh, or using a dedicated lightweight API for subagent tree updates.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tui/components/RunApp.tsx` around lines 903 - 910, The handler for the
"agent:output" event currently calls instanceManager.getRemoteState() on every
stdout chunk which causes excessive network round-trips; update the agent:output
handler to avoid fetching full remote state per chunk by either
debouncing/throttling calls to instanceManager.getRemoteState() (e.g.,
accumulate chunks and refresh at a fixed interval) or replacing the call with a
lightweight dedicated API for just the subagent tree, and then call
setRemoteSubagentTree(state.subagentTree) only when the debounced/throttled
fetch or lightweight endpoint returns updated data; keep references to
instanceManager.getRemoteState and setRemoteSubagentTree so the change is
localized to the existing agent:output callback.
🧹 Nitpick comments (13)
src/plugins/agents/usage.ts (1)

188-197: Ambiguous percentage normalisation heuristic.

The value > 0 && value <= 1 check assumes fractional values represent percentages. A legitimate value of 0.5% (i.e., half a percent) would be incorrectly scaled to 50%. This is an inherent limitation of the heuristic, but worth documenting in a comment so future maintainers are aware.
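
A sketch of how the heuristic and its caveat could be documented in code (illustrative, not the PR's exact implementation):

```typescript
// Heuristic: values in (0, 1] are assumed to be fractions and are scaled to
// percent. A caller passing a literal 0.5 (meaning half a percent) would be
// misread as 50%; callers should pass either 0-1 fractions or 0-100 percents.
function normalizePercent(value: number): number {
  if (!Number.isFinite(value)) return 0;
  if (value > 0 && value <= 1) return value * 100; // treat as a fraction
  return Math.min(Math.max(value, 0), 100); // clamp to the percent range
}
```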

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/plugins/agents/usage.ts` around lines 188 - 197, The normalizePercent
function uses a heuristic (value > 0 && value <= 1) to treat fractional inputs
as proportions and multiply by 100, which misinterprets inputs like 0.5 meant to
represent 0.5% (becoming 50%); update the normalizePercent function by adding a
clear comment above it documenting this heuristic and its limitation (including
the 0.5% example), why it was chosen, and advising callers to pass values in the
intended unit (0–100 for percent or 0–1 for fraction) or to normalize before
calling normalizePercent; reference the function name normalizePercent in the
comment so maintainers can find it easily.
src/logs/persistence.ts (1)

565-587: events count is lost during parse round-trip.

The events field is hardcoded to 0 on line 585 because it's not persisted in the formatted header. This means round-tripping through formatMetadataHeader → parseMetadataHeader loses the original event count. Since events is primarily a diagnostic metric and not critical for downstream logic, this is acceptable, but consider adding a comment noting the intentional data loss.

📝 Optional: add a clarifying comment
             remainingContextPercent,
-            events: 0,
+            events: 0, // Not persisted in header; lost on round-trip
           }
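
The round-trip and its intentional loss can be illustrated with a toy header format (the real format in src/logs/persistence.ts likely differs; the key names here are assumptions):

```typescript
interface Usage {
  inputTokens: number;
  outputTokens: number;
  events: number;
}

// Toy header: only token counts are serialised; `events` is diagnostic-only.
function formatHeader(usage: Usage): string {
  return `tokens-in: ${usage.inputTokens}\ntokens-out: ${usage.outputTokens}`;
}

function parseHeader(header: string): Usage {
  const read = (key: string): number =>
    Number(header.match(new RegExp(`${key}: (\\d+)`))?.[1] ?? 0);
  return {
    inputTokens: read("tokens-in"),
    outputTokens: read("tokens-out"),
    events: 0, // not persisted in the header; lost on round-trip
  };
}
```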
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/logs/persistence.ts` around lines 565 - 587, The parsed `usage` object
currently sets `events` to 0 which intentionally drops the original events count
because the formatted header (via formatMetadataHeader) doesn't persist it;
update the code around the `usage` construction (the `usage` variable in
parseMetadataHeader/parseNumber/extractValue flow) to add a clarifying inline
comment stating this intentional data loss for `events` (e.g., "events not
persisted in header; default to 0") so future readers understand this is
deliberate rather than a bug.
tests/plugins/agents/usage.test.ts (1)

49-102: Consider adding edge-case tests for robustness.

The current tests cover the happy path well. For additional confidence, consider adding tests for:

  • Invalid/empty/non-JSON input to extractTokenUsageFromJsonLine
  • TokenUsageAccumulator.reset() behaviour
  • Samples with only totalTokens (no input/output breakdown)
  • normalizePercent with fractional vs. already-percentage values

These can be deferred if preferred.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/plugins/agents/usage.test.ts` around lines 49 - 102, Add edge-case
tests to improve robustness: add tests for extractTokenUsageFromJsonLine to
handle invalid/empty/non-JSON lines and verify it returns undefined or a safe
default; add tests for TokenUsageAccumulator.reset() to ensure totals
(inputTokens, outputTokens, totalTokens, contextWindowTokens,
remainingContextTokens, remainingContextPercent) return to initial state; add a
test for summarizeTokenUsageFromOutput handling lines that only contain
totalTokens without input/output breakdown; and add tests for normalizePercent
to assert behavior when given fractional values (e.g., 0.76) versus
already-percentage values (e.g., 76). Use the existing test file structure and
reference the functions TokenUsageAccumulator, extractTokenUsageFromJsonLine,
summarizeTokenUsageFromOutput, and normalizePercent when adding these cases.
src/engine/index.ts (1)

1137-1168: Droid agent path does not emit real-time agent:usage events.

The Droid agent's JSONL messages are parsed via droidJsonlParser for subagent tracing, but usage/model extraction is not performed on those parsed messages. This means Droid-based iterations won't emit incremental agent:usage or agent:model events — usage is only available post-iteration via summarizeTokenUsageFromOutput.

If live usage display matters for Droid, you could add usage/model extraction within the Droid parsing loop (lines 1156-1167). Low priority since the fallback works.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/engine/index.ts` around lines 1137 - 1168, The Droid JSONL parsing loop
currently only forwards messages to subagentParser (droidJsonlParser.push(...)
-> subagentParser.processMessage(...)) but does not extract or emit incremental
usage/model info; update the block inside the isDroidAgent && droidJsonlParser
condition to parse usage and model from each normalized or raw result.message
(reuse the same logic used by summarizeTokenUsageFromOutput or the existing
post-iteration extraction), and call this.emit with type 'agent:usage' and
'agent:model' (including timestamp, taskId, iteration and the extracted
usage/model payload) for each message that contains such info so Droid
iterations produce real-time usage/model events. Ensure to keep using
toClaudeJsonlMessages(result.message) for normalization and preserve existing
subagentParser.processMessage calls.
src/tui/components/ProgressDashboard.tsx (1)

75-86: Consider centralising token-count formatting to avoid drift.
formatTokenCount now exists here and in LeftPanel.tsx; a shared util would keep the UI consistent.
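
A shared util along these lines would do (the thresholds and suffixes are assumptions; the walkthrough places such a helper in src/tui/utils/token-format.ts):

```typescript
// Compact token-count formatting: 950 -> "950", 1234 -> "1.2k", 2500000 -> "2.5M".
export function formatTokenCount(count: number): string {
  if (!Number.isFinite(count) || count < 0) return "0";
  if (count >= 1_000_000) return `${(count / 1_000_000).toFixed(1)}M`;
  if (count >= 1_000) return `${(count / 1_000).toFixed(1)}k`;
  return String(count);
}
```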

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tui/components/ProgressDashboard.tsx` around lines 75 - 86, Duplicate
token formatting logic exists as formatTokenCount in ProgressDashboard.tsx and
in LeftPanel.tsx; extract this into a shared utility exported from a new module
(e.g., tokenFormat.ts or ui/utils/tokenFormat) and replace both local
implementations with imports of the single exported function; update
ProgressDashboard.tsx and LeftPanel.tsx to import formatTokenCount from the new
util and remove the local formatTokenCount definition so both components use the
same centralized implementation.
src/tui/components/RunApp.tsx (6)

2652-2658: Inline summarizeTokenUsageFromOutput in useMemo may be expensive for large parallel outputs.

In the parallel-mode branch, summarizeTokenUsageFromOutput(output) re-parses the full joined output string on every memo recomputation. Unlike the local/remote paths which use a StreamingOutputParser with incremental accumulation, this performs a full parse each time.

For tasks with large stdout (common in parallel workers), this could introduce noticeable lag. Consider using a per-worker TokenUsageAccumulator (or the taskUsageMap) to avoid repeated full-output parsing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tui/components/RunApp.tsx` around lines 2652 - 2658, The useMemo in the
parallel-mode branch currently calls summarizeTokenUsageFromOutput(output) which
reparses the entire joined output on every recompute; replace that full-parse
call with a per-worker incremental accumulator (e.g., TokenUsageAccumulator) or
read from the existing taskUsageMap for this worker/task so token usage is
updated incrementally as new segments arrive instead of re-parsing output.
Locate the memo where summarizeTokenUsageFromOutput is used (the return object
with iteration/output/segments/usage/timing) and wire it to the per-worker
accumulator or taskUsageMap entry, ensuring the accumulator is updated whenever
new output segments are appended to segments or when the StreamingOutputParser
emits tokens.

1660-1676: Usage extracted from parser on every agent:output event — verify frequency impact.

On each agent:output event (which fires per chunk of stdout), outputParserRef.current.getUsage() is called and, if non-null, setTaskUsageMap triggers a state update. If the agent emits frequent small chunks, this could produce a high rate of state updates for the usage map.

Given that getUsage() returns the accumulator's current summary (which only changes when a JSONL usage line is actually parsed), the state updates are likely infrequent in practice. However, the normalizeUsage call and new Map(prev) allocation happen on every event where usage is non-null, even if the values haven't changed. A shallow equality check before updating could reduce unnecessary re-renders.
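
A shallow guard like the following would skip the allocation when nothing changed (a sketch; the summary fields follow the PR's TokenUsageSummary, the other names are illustrative):

```typescript
interface TokenUsageSummary {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
}

function usageEquals(a: TokenUsageSummary | undefined, b: TokenUsageSummary): boolean {
  return (
    a !== undefined &&
    a.inputTokens === b.inputTokens &&
    a.outputTokens === b.outputTokens &&
    a.totalTokens === b.totalTokens
  );
}

// Functional updater: return the previous Map unchanged when the normalized
// usage is identical, so React can skip the re-render.
function updateUsageMap(
  prev: Map<string, TokenUsageSummary>,
  taskId: string,
  next: TokenUsageSummary
): Map<string, TokenUsageSummary> {
  if (usageEquals(prev.get(taskId), next)) return prev;
  const map = new Map(prev);
  map.set(taskId, next);
  return map;
}
```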

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tui/components/RunApp.tsx` around lines 1660 - 1676, The handler for
'agent:output' calls outputParserRef.current.getUsage() and unconditionally
computes normalizeUsage and new Map(prev) inside setTaskUsageMap, causing
allocations and possible re-renders even when usage hasn't changed; change the
update to first compute const newUsage = normalizeUsage(usage,
localContextWindowRef.current), read the existing value from the current task
usage map (using currentTaskIdRef.current or the taskId), compare shallow/deep
equality of newUsage vs existing, and only call setTaskUsageMap (creating a new
Map and setting the value) when the normalized usage actually differs; ensure
you reference outputParserRef.current.getUsage(), normalizeUsage,
setTaskUsageMap, and currentTaskIdRef.current/taskId in the implementation.

1057-1071: Provider-exhaustive fallback search could be slow on first invocation.

When a model string lacks a provider prefix, this code iterates every provider and fetches each provider's full model list sequentially. While the result is cached (so this only runs once per unique model string), the initial lookup could be noticeably slow if the models-dev data source has many providers.

Consider short-circuiting with a break after a match, or doing a single flat lookup if the data source supports it.

🔧 Suggested improvement
         const providers = await getProviders();
         for (const provider of providers) {
           const models = await getModelsForProvider(provider.id);
           const match = models.find(
             (m) =>
               m.id === key ||
               m.name.toLowerCase() === key.toLowerCase()
           );
           if (match?.contextLimit && Number.isFinite(match.contextLimit)) {
             modelContextCacheRef.current.set(cacheKey, match.contextLimit);
             return match.contextLimit;
           }
         }

The loop already returns on match, so the early exit is present. But consider parallelising with Promise.all or adding a bulk lookup API to models-dev if this becomes a bottleneck.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tui/components/RunApp.tsx` around lines 1057 - 1071, The sequential
provider loop that calls getModelsForProvider(provider.id) can be slow; change
RunApp.tsx to fetch models in parallel: call getProviders(), map providers to
getModelsForProvider(...) and await Promise.all to get all model arrays, then
search the flattened results for a match (compare m.id === key or
m.name.toLowerCase() === key.toLowerCase()), and when you find match with a
valid match.contextLimit, set modelContextCacheRef.current.set(cacheKey,
match.contextLimit) and return the value; this preserves the existing cacheKey
and match/contextLimit semantics (and avoids per-provider sequential latency).

886-901: Duplicate usage-update pattern in agent:output and agent:usage remote event handlers.

The logic to extract a task ID and update remoteTaskUsageMap with normalizeUsage is near-identical between the agent:output handler (lines 890-901) and the agent:usage handler (lines 912-926). Consider extracting a shared helper to reduce duplication — for example:

♻️ Proposed refactor
// Helper inside the event handler or at module level
function updateRemoteUsage(
  taskId: string | undefined,
  usage: TokenUsageSummary,
  contextWindow: number | undefined,
  setter: React.Dispatch<React.SetStateAction<Map<string, TokenUsageSummary>>>
) {
  if (!taskId) return;
  setter((prev) => {
    const next = new Map(prev);
    next.set(taskId, normalizeUsage(usage, contextWindow));
    return next;
  });
}

The same pattern appears in the local engine event handlers at lines 1667-1675 and 1683-1694.

Also applies to: 912-926

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tui/components/RunApp.tsx` around lines 886 - 901, The agent:output and
agent:usage handlers duplicate the same pattern for extracting taskId and
updating remoteTaskUsageMap with normalizeUsage; extract that logic into a small
helper (e.g., updateRemoteUsage) and call it from both handlers to remove
duplication. The helper should accept taskId (or use
remoteCurrentTaskIdRef.current when undefined), a TokenUsageSummary (from
remoteOutputParserRef.current.getUsage() or the agent:usage payload), the
context window (remoteContextWindowRef.current), and the setter
(setRemoteTaskUsageMap), then early-return on missing taskId and perform the Map
clone/set with normalizeUsage before calling the setter. Update both handlers to
invoke this helper and remove the inline Map update code.

1281-1286: Usage map spread into every task on each displayedTasks recomputation.

Lines 1282-1285 iterate over all sourceTasks and spread usage from the map into each task object, creating new object references for every task on every recomputation. Since taskUsageMap and remoteTaskUsageMap are in the dependency array (lines 1302-1303), any usage update for a single task causes all task objects to be recreated.

This is unlikely to cause visible performance issues given the typical number of tasks, but is worth noting as a potential optimisation point if task lists grow large.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tui/components/RunApp.tsx` around lines 1281 - 1286, The current mapping
always creates new task objects by spreading usage into each task (in the block
that sets sourceTasks using usageMap.get), causing all tasks to be recreated
whenever taskUsageMap or remoteTaskUsageMap changes; change the mapping to only
produce a new object when the usage actually differs (e.g., for each task in
sourceTasks compare task.usage to usageMap.get(task.id) and return the original
task object if equal, otherwise return a shallow-cloned task with the updated
usage), so updates to taskUsageMap/remoteTaskUsageMap only replace the affected
task(s) rather than all tasks during the displayedTasks recomputation.

1119-1162: Cascading state updates when localContextWindow resolves could cause multiple re-renders.

When localContextWindow changes, this effect updates three separate state values (taskUsageMap, iterations, historicalOutputCache) in sequence. React batches state updates within event handlers but may not batch across async boundaries in older React versions. In modern React 18+, automatic batching should handle this, but if the framework in use doesn't fully support it, this could trigger three consecutive re-renders.

Not a blocking concern given the functional updater pattern is correctly used, but worth noting for awareness.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tui/components/RunApp.tsx` around lines 1119 - 1162, When
localContextWindow changes, the three sequential state setters (setTaskUsageMap,
setIterations, setHistoricalOutputCache) cause cascading re-renders; wrap these
updates in React's startTransition to ensure they are batched (import
startTransition from 'react') and call startTransition(() => {
setTaskUsageMap(...); setIterations(...); setHistoricalOutputCache(...); })
while still using normalizeUsage to compute values so the updates remain
functional and localized to the effect.
src/tui/components/RightPanel.tsx (2)

598-663: Usage and context display logic is clear and well-structured.

The side-by-side layout (timing left, usage right) with fallback messaging for absent usage data is a clean UX approach. The context window info being gated on isRunning makes sense — it's only meaningful during live execution.

One minor observation: if usage is provided but all token counts are 0 (e.g., very early in streaming before the first usage event), the UI will display Input: 0 Output: 0 Total: 0 rather than the "waiting for live telemetry..." message. This is technically correct but could be slightly confusing. Consider whether you'd like to show the waiting message when usage.events === 0 as well.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tui/components/RightPanel.tsx` around lines 598 - 663, The UI shows
telemetry when a non-null usage object exists even if usage.events === 0; update
the rendering guards in RightPanel (the usage block and the isRunning context
block) to treat usage with zero events as "no telemetry yet" by checking usage
&& usage.events > 0 instead of just usage, and likewise gate
remainingPercent/remainingTokens/contextWindow displays on usage.events > 0;
update the conditional around the usage display and the sub-conditions that
render remainingPercent/remainingTokens so the "waiting for live telemetry..."
message appears until at least one usage event arrives.

570-578: Redundant totalTokens recalculation — already handled by normalizeUsage upstream.

The normalizeUsage function in RunApp.tsx (line 450-454) already ensures totalTokens is set to inputTokens + outputTokens when it's 0. This recalculation at line 572-575 is therefore defensive duplication. It's not harmful, but worth noting for future maintenance — if the normalisation logic changes in one place, the other could silently diverge.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tui/components/RightPanel.tsx` around lines 570 - 578, The code in
RightPanel.tsx redundantly recalculates totalTokens even though normalizeUsage
in RunApp.tsx already sets totalTokens; replace the ternary block that computes
totalTokens with a simple nullish-coalescing assignment (e.g., const totalTokens
= usage?.totalTokens ?? inputTokens + outputTokens) or otherwise just use
usage.totalTokens, remove the defensive duplication, and keep references to
inputTokens and outputTokens unchanged; this ensures totalTokens logic remains
single-sourced in normalizeUsage while preserving a safe fallback in
RightPanel.tsx.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/plugins/agents/usage.ts`:
- Around line 67-76: CONTEXT_WINDOW_KEYS currently includes
'maxTokens'/'max_tokens' which are often generation limits, not model context
sizes; update the constant or its usage to avoid misinterpreting generation
limits as context windows: either remove 'maxTokens' and 'max_tokens' from
CONTEXT_WINDOW_KEYS, or implement a heuristic in the code that reads
CONTEXT_WINDOW_KEYS (e.g., in the function that maps payload keys to
contextWindowTokens) to treat these keys as context-only when their numeric
value exceeds a large threshold (suggested >10000) before assigning to
contextWindowTokens; update references to CONTEXT_WINDOW_KEYS and the
value-parsing logic to reflect this change so remaining-context calculations no
longer use typical generation limits.

In `@src/tui/components/RunApp.tsx`:
- Around line 982-1012: The displayAggregateUsage memo currently computes and
returns a tasksWithUsage field that's never used by ProgressDashboard; remove
the tasksWithUsage variable (delete its declaration and the tasksWithUsage += 1
increment inside the loop) and stop returning it from the object returned by the
useMemo in RunApp.tsx (keep returning only inputTokens, outputTokens, and
totalTokens from displayAggregateUsage). Leave all other logic and dependencies
intact and ensure no other references to tasksWithUsage remain.

---

Outside diff comments:
In `@src/tui/components/LeftPanel.tsx`:
- Around line 79-111: The titleWidth calculation can be negative when maxWidth
is small (causing overflow even with Math.max(5,...)); change the layout logic
in LeftPanel so you compute remaining space after id and indent, then decide
whether to render the usageIndicator (formatTaskUsageIndicator) only if there is
enough room: compute availableForTitle = Math.max(0, maxWidth - indentWidth - 3
- idDisplay.length) and if availableForTitle <= minimalTitlePlusIndicator (e.g.,
5 + usageIndicator.length+1) omit usageIndicator and set titleWidth =
Math.max(5, availableForTitle) otherwise include usageIndicatorWidth and set
titleWidth appropriately; update truncatedTitle via truncateText(title,
titleWidth) and only render the <span> for usageIndicator when you decided to
include it.
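The width-budgeting logic described above can be sketched as a pure function; `maxWidth`, `indentWidth`, `idDisplay`, and `usageIndicator` mirror the names in the review text, and the constants (3-column separator, 5-char minimum title) are taken from it:

```typescript
function computeTitleLayout(
  maxWidth: number,
  indentWidth: number,
  idDisplay: string,
  usageIndicator: string,
): { titleWidth: number; showIndicator: boolean } {
  // Space left for the title after indent, id column, and separators.
  const availableForTitle = Math.max(0, maxWidth - indentWidth - 3 - idDisplay.length);
  // Minimum room needed to show a 5-char title plus the indicator and a space.
  const minimalTitlePlusIndicator = 5 + usageIndicator.length + 1;
  if (usageIndicator === '' || availableForTitle <= minimalTitlePlusIndicator) {
    return { titleWidth: Math.max(5, availableForTitle), showIndicator: false };
  }
  return {
    titleWidth: Math.max(5, availableForTitle - usageIndicator.length - 1),
    showIndicator: true,
  };
}
```

Because `availableForTitle` is clamped to zero first and the indicator is dropped before the title shrinks below its minimum, `titleWidth` can never go negative.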

In `@src/tui/components/RunApp.tsx`:
- Around line 903-910: The handler for the "agent:output" event currently calls
instanceManager.getRemoteState() on every stdout chunk which causes excessive
network round-trips; update the agent:output handler to avoid fetching full
remote state per chunk by either debouncing/throttling calls to
instanceManager.getRemoteState() (e.g., accumulate chunks and refresh at a fixed
interval) or replacing the call with a lightweight dedicated API for just the
subagent tree, and then call setRemoteSubagentTree(state.subagentTree) only when
the debounced/throttled fetch or lightweight endpoint returns updated data; keep
references to instanceManager.getRemoteState and setRemoteSubagentTree so the
change is localized to the existing agent:output callback.
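One way to realize the throttling option is a small factory around the fetch; `fetchState`, `onState`, and the 250 ms interval are illustrative stand-ins for `instanceManager.getRemoteState()` and `setRemoteSubagentTree`, not the real signatures:

```typescript
function createThrottledRefresher<T>(
  fetchState: () => Promise<T>,
  onState: (state: T) => void,
  intervalMs = 250,
): () => void {
  let pending = false;
  let lastRun = 0;
  return () => {
    const now = Date.now();
    // Skip this chunk if a fetch is in flight or one ran recently.
    if (pending || now - lastRun < intervalMs) return;
    pending = true;
    lastRun = now;
    void fetchState()
      .then(onState)
      .finally(() => {
        pending = false;
      });
  };
}
```

The agent:output handler would then call the returned function once per chunk; bursts of stdout collapse into at most one remote-state fetch per interval.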

---

Nitpick comments:
In `@src/engine/index.ts`:
- Around line 1137-1168: The Droid JSONL parsing loop currently only forwards
messages to subagentParser (droidJsonlParser.push(...) ->
subagentParser.processMessage(...)) but does not extract or emit incremental
usage/model info; update the block inside the isDroidAgent && droidJsonlParser
condition to parse usage and model from each normalized or raw result.message
(reuse the same logic used by summarizeTokenUsageFromOutput or the existing
post-iteration extraction), and call this.emit with type 'agent:usage' and
'agent:model' (including timestamp, taskId, iteration and the extracted
usage/model payload) for each message that contains such info so Droid
iterations produce real-time usage/model events. Ensure to keep using
toClaudeJsonlMessages(result.message) for normalization and preserve existing
subagentParser.processMessage calls.

In `@src/logs/persistence.ts`:
- Around line 565-587: The parsed `usage` object currently sets `events` to 0
which intentionally drops the original events count because the formatted header
(via formatMetadataHeader) doesn't persist it; update the code around the
`usage` construction (the `usage` variable in
parseMetadataHeader/parseNumber/extractValue flow) to add a clarifying inline
comment stating this intentional data loss for `events` (e.g., "events not
persisted in header; default to 0") so future readers understand this is
deliberate rather than a bug.

In `@src/plugins/agents/usage.ts`:
- Around line 188-197: The normalizePercent function uses a heuristic (value > 0
&& value <= 1) to treat fractional inputs as proportions and multiply by 100,
which misinterprets inputs like 0.5 meant to represent 0.5% (becoming 50%);
update the normalizePercent function by adding a clear comment above it
documenting this heuristic and its limitation (including the 0.5% example), why
it was chosen, and advising callers to pass values in the intended unit (0–100
for percent or 0–1 for fraction) or to normalize before calling
normalizePercent; reference the function name normalizePercent in the comment so
maintainers can find it easily.
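A sketch of the heuristic (and the comment the review asks for); the clamp and non-finite guard are safety additions here, not necessarily present in the real `normalizePercent`:

```typescript
/**
 * normalizePercent: values in (0, 1] are assumed to be fractions and scaled
 * to percent. Known limitation — an input of 0.5 meant as "0.5%" becomes 50.
 * Callers should pass values in one consistent unit: 0–100 for percent, or
 * 0–1 for fractions.
 */
function normalizePercent(value: number): number {
  if (!Number.isFinite(value) || value < 0) return 0;
  const percent = value > 0 && value <= 1 ? value * 100 : value;
  return Math.min(100, percent);
}
```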

In `@src/tui/components/ProgressDashboard.tsx`:
- Around line 75-86: Duplicate token formatting logic exists as formatTokenCount
in ProgressDashboard.tsx and in LeftPanel.tsx; extract this into a shared
utility exported from a new module (e.g., tokenFormat.ts or
ui/utils/tokenFormat) and replace both local implementations with imports of the
single exported function; update ProgressDashboard.tsx and LeftPanel.tsx to
import formatTokenCount from the new util and remove the local formatTokenCount
definition so both components use the same centralized implementation.
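The shared utility could look like this; the `k`/`m` thresholds are assumptions based on the compact token formatting the PR describes, and the `"0"` fallback folds in the input-sanitizing fix suggested later for `token-format.ts`:

```typescript
// Candidate body for a shared src/tui/utils/token-format.ts.
function formatTokenCount(tokens: number): string {
  const value = Number(tokens);
  // Guard against NaN/Infinity/negative inputs so callers never render "NaN".
  if (!Number.isFinite(value) || value < 0) return '0';
  const n = Math.floor(value);
  if (n >= 1_000_000) return `${(n / 1_000_000).toFixed(1)}m`;
  if (n >= 1_000) return `${(n / 1_000).toFixed(1)}k`;
  return String(n);
}
```

Both components would then import this single function instead of carrying their own copies.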

In `@src/tui/components/RightPanel.tsx`:
- Around line 598-663: The UI shows telemetry when a non-null usage object
exists even if usage.events === 0; update the rendering guards in RightPanel
(the usage block and the isRunning context block) to treat usage with zero
events as "no telemetry yet" by checking usage && usage.events > 0 instead of
just usage, and likewise gate remainingPercent/remainingTokens/contextWindow
displays on usage.events > 0; update the conditional around the usage display
and the sub-conditions that render remainingPercent/remainingTokens so the
"waiting for live telemetry..." message appears until at least one usage event
arrives.
- Around line 570-578: The code in RightPanel.tsx redundantly recalculates
totalTokens even though normalizeUsage in RunApp.tsx already sets totalTokens;
replace the ternary block that computes totalTokens with a simple
nullish-coalescing assignment (e.g., const totalTokens = usage?.totalTokens ??
inputTokens + outputTokens) or otherwise just use usage.totalTokens, remove the
defensive duplication, and keep references to inputTokens and outputTokens
unchanged; this ensures totalTokens logic remains single-sourced in
normalizeUsage while preserving a safe fallback in RightPanel.tsx.

In `@src/tui/components/RunApp.tsx`:
- Around line 2652-2658: The useMemo in the parallel-mode branch currently calls
summarizeTokenUsageFromOutput(output) which reparses the entire joined output on
every recompute; replace that full-parse call with a per-worker incremental
accumulator (e.g., TokenUsageAccumulator) or read from the existing taskUsageMap
for this worker/task so token usage is updated incrementally as new segments
arrive instead of re-parsing output. Locate the memo where
summarizeTokenUsageFromOutput is used (the return object with
iteration/output/segments/usage/timing) and wire it to the per-worker
accumulator or taskUsageMap entry, ensuring the accumulator is updated whenever
new output segments are appended to segments or when the StreamingOutputParser
emits tokens.
- Around line 1660-1676: The handler for 'agent:output' calls
outputParserRef.current.getUsage() and unconditionally computes normalizeUsage
and new Map(prev) inside setTaskUsageMap, causing allocations and possible
re-renders even when usage hasn't changed; change the update to first compute
const newUsage = normalizeUsage(usage, localContextWindowRef.current), read the
existing value from the current task usage map (using currentTaskIdRef.current
or the taskId), compare shallow/deep equality of newUsage vs existing, and only
call setTaskUsageMap (creating a new Map and setting the value) when the
normalized usage actually differs; ensure you reference
outputParserRef.current.getUsage(), normalizeUsage, setTaskUsageMap, and
currentTaskIdRef.current/taskId in the implementation.
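The equality gate can be isolated into a helper; `NormalizedUsage` is an assumed shape based on the fields this PR tracks, and `updateUsageMap` is a hypothetical name:

```typescript
interface NormalizedUsage {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
  remainingContextPercent?: number;
}

function usageEquals(a: NormalizedUsage | undefined, b: NormalizedUsage): boolean {
  return (
    a !== undefined &&
    a.inputTokens === b.inputTokens &&
    a.outputTokens === b.outputTokens &&
    a.totalTokens === b.totalTokens &&
    a.remainingContextPercent === b.remainingContextPercent
  );
}

// Only clone the Map when the normalized value actually changed, so React
// sees the same reference (and skips re-rendering) on no-op updates.
function updateUsageMap(
  prev: Map<string, NormalizedUsage>,
  taskId: string,
  next: NormalizedUsage,
): Map<string, NormalizedUsage> {
  if (usageEquals(prev.get(taskId), next)) return prev;
  const map = new Map(prev);
  map.set(taskId, next);
  return map;
}
```

Passed as the body of a functional `setTaskUsageMap` update, returning `prev` unchanged makes the state setter a no-op.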
- Around line 1057-1071: The sequential provider loop that calls
getModelsForProvider(provider.id) can be slow; change RunApp.tsx to fetch models
in parallel: call getProviders(), map providers to getModelsForProvider(...) and
await Promise.all to get all model arrays, then search the flattened results for
a match (compare m.id === key or m.name.toLowerCase() === key.toLowerCase()),
and when you find match with a valid match.contextLimit, set
modelContextCacheRef.current.set(cacheKey, match.contextLimit) and return the
value; this preserves the existing cacheKey and match/contextLimit semantics
(and avoids per-provider sequential latency).
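The parallel lookup could be shaped roughly as follows; `getProviders`, `getModelsForProvider`, and the `ModelInfo` fields are taken from the review text rather than the real API, so treat the signatures as assumptions:

```typescript
interface ModelInfo {
  id: string;
  name: string;
  contextLimit?: number;
}

async function findContextLimit(
  key: string,
  getProviders: () => Promise<{ id: string }[]>,
  getModelsForProvider: (id: string) => Promise<ModelInfo[]>,
): Promise<number | undefined> {
  const providers = await getProviders();
  // Fetch every provider's model list concurrently instead of one at a time.
  const modelLists = await Promise.all(providers.map((p) => getModelsForProvider(p.id)));
  const lower = key.toLowerCase();
  const match = modelLists
    .flat()
    .find((m) => m.id === key || m.name.toLowerCase() === lower);
  return match?.contextLimit;
}
```

Total latency drops from the sum of per-provider round-trips to roughly the slowest single one; the caller can still cache the result under the existing `cacheKey`.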
- Around line 886-901: The agent:output and agent:usage handlers duplicate the
same pattern for extracting taskId and updating remoteTaskUsageMap with
normalizeUsage; extract that logic into a small helper (e.g., updateRemoteUsage)
and call it from both handlers to remove duplication. The helper should accept
taskId (or use remoteCurrentTaskIdRef.current when undefined), a
TokenUsageSummary (from remoteOutputParserRef.current.getUsage() or the
agent:usage payload), the context window (remoteContextWindowRef.current), and
the setter (setRemoteTaskUsageMap), then early-return on missing taskId and
perform the Map clone/set with normalizeUsage before calling the setter. Update
both handlers to invoke this helper and remove the inline Map update code.
- Around line 1281-1286: The current mapping always creates new task objects by
spreading usage into each task (in the block that sets sourceTasks using
usageMap.get), causing all tasks to be recreated whenever taskUsageMap or
remoteTaskUsageMap changes; change the mapping to only produce a new object when
the usage actually differs (e.g., for each task in sourceTasks compare
task.usage to usageMap.get(task.id) and return the original task object if
equal, otherwise return a shallow-cloned task with the updated usage), so
updates to taskUsageMap/remoteTaskUsageMap only replace the affected task(s)
rather than all tasks during the displayedTasks recomputation.
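The reference-preserving map can be sketched like this; `TaskItem` is a placeholder shape and `withUsage` a hypothetical helper name:

```typescript
interface TaskItem {
  id: string;
  usage?: unknown;
}

function withUsage(tasks: TaskItem[], usageMap: Map<string, unknown>): TaskItem[] {
  let changed = false;
  const next = tasks.map((task) => {
    const usage = usageMap.get(task.id);
    // Keep the original object (and its identity) when usage is unchanged.
    if (usage === task.usage) return task;
    changed = true;
    return { ...task, usage };
  });
  // If nothing changed, return the original array so memoized consumers skip work.
  return changed ? next : tasks;
}
```

Only tasks whose usage actually changed get fresh objects, so memoized rows keyed on task identity stay stable across telemetry ticks.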
- Around line 1119-1162: When localContextWindow changes, the three sequential
state setters (setTaskUsageMap, setIterations, setHistoricalOutputCache) cause
cascading re-renders; wrap these updates in React's startTransition to ensure
they are batched (import startTransition from 'react') and call
startTransition(() => { setTaskUsageMap(...); setIterations(...);
setHistoricalOutputCache(...); }) while still using normalizeUsage to compute
values so the updates remain functional and localized to the effect.

In `@tests/plugins/agents/usage.test.ts`:
- Around line 49-102: Add edge-case tests to improve robustness: add tests for
extractTokenUsageFromJsonLine to handle invalid/empty/non-JSON lines and verify
it returns undefined or a safe default; add tests for
TokenUsageAccumulator.reset() to ensure totals (inputTokens, outputTokens,
totalTokens, contextWindowTokens, remainingContextTokens,
remainingContextPercent) return to initial state; add a test for
summarizeTokenUsageFromOutput handling lines that only contain totalTokens
without input/output breakdown; and add tests for normalizePercent to assert
behavior when given fractional values (e.g., 0.76) versus already-percentage
values (e.g., 76). Use the existing test file structure and reference the
functions TokenUsageAccumulator, extractTokenUsageFromJsonLine,
summarizeTokenUsageFromOutput, and normalizePercent when adding these cases.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (1)
src/engine/index.ts (1)

1356-1359: Simplify the usage ternary — getSummary() is never undefined.

TokenUsageAccumulator.getSummary() always returns a TokenUsageSummary object (never null/undefined), so the nullish-coalescing operator is dead code when hasData() is true. The intermediate undefined arm exists solely to make ?? fire on the false branch.

♻️ Suggested simplification
-      usage:
-        (iterationUsageAccumulator.hasData()
-          ? iterationUsageAccumulator.getSummary()
-          : undefined) ?? summarizeTokenUsageFromOutput(agentResult.stdout),
+      usage: iterationUsageAccumulator.hasData()
+        ? iterationUsageAccumulator.getSummary()
+        : summarizeTokenUsageFromOutput(agentResult.stdout),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/engine/index.ts` around lines 1356 - 1359, The usage expression currently
uses a ternary that returns undefined to force the nullish-coalescing operator
to run, but TokenUsageAccumulator.getSummary() never returns undefined; replace
the whole expression with a simple conditional: if
iterationUsageAccumulator.hasData() use iterationUsageAccumulator.getSummary(),
otherwise call summarizeTokenUsageFromOutput(agentResult.stdout) (remove the
unnecessary "?: undefined" and the "??" fallback).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/tui/components/LeftPanel.tsx`:
- Around line 26-30: formatTaskUsageIndicator currently returns the placeholder
string 'c-- t0' when task.usage is missing causing un-run tasks to display a
misleading "0 tokens" indicator; change formatTaskUsageIndicator(TaskItem) to
return an empty string ('') for the no-data case instead of 'c-- t0', and update
the rendering guard in shouldShowUsageIndicator (the function/component that
checks indicator visibility and row width) to treat an empty indicator as falsy
(skip rendering) so rows with enough width do not show a phantom usage
indicator; keep existing width checks intact so only non-empty indicators are
subject to the width threshold.

In `@src/tui/components/RightPanel.tsx`:
- Around line 570-576: The current gating in RightPanel.tsx uses
hasUsageTelemetry = usage !== undefined && usage.events > 0 which hides valid
token telemetry when metadata sets events: 0; update hasUsageTelemetry to treat
any present usage fields as telemetry (e.g., usage !== undefined &&
(usage.events > 0 || (usage.inputTokens ?? 0) > 0 || (usage.outputTokens ?? 0) >
0 || (usage.totalTokens ?? 0) > 0 || usage.remainingContextTokens !== undefined
|| usage.contextWindowTokens !== undefined || usage.remainingContextPercent !==
undefined)), and then compute inputTokens, outputTokens, totalTokens,
remainingTokens, contextWindow and remainingPercent from usage? directly (or
keep existing nullish defaults) so token counts render when available even if
events === 0; refer to the hasUsageTelemetry, inputTokens, outputTokens,
totalTokens, remainingTokens, contextWindow and remainingPercent variables in
RightPanel.tsx.

In `@src/tui/utils/token-format.ts`:
- Around line 10-17: formatTokenCount currently produces invalid outputs for
NaN/Infinity/negative inputs; update the function (formatTokenCount) to validate
and sanitize the input first by coercing tokens to a Number, checking
Number.isFinite(value) and value >= 0 (or default to 0), then use Math.floor or
Math.round as desired before formatting into 'm'/'k' suffixes; return a safe
fallback string like "0" for invalid inputs so callers (e.g., LeftPanel.tsx
deriving totalTokens) never see "NaN" or "Infinity".

---

Nitpick comments:
In `@src/engine/index.ts`:
- Around line 1356-1359: The usage expression currently uses a ternary that
returns undefined to force the nullish-coalescing operator to run, but
TokenUsageAccumulator.getSummary() never returns undefined; replace the whole
expression with a simple conditional: if iterationUsageAccumulator.hasData() use
iterationUsageAccumulator.getSummary(), otherwise call
summarizeTokenUsageFromOutput(agentResult.stdout) (remove the unnecessary "?:
undefined" and the "??" fallback).

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/tui/components/RightPanel.tsx (1)

600-605: ⚠️ Potential issue | 🟡 Minor

Model can appear twice in the output view simultaneously.

TimingSummary renders the model from timing.model (lines 600–605), while TaskOutputView's header independently renders the model from currentModel (lines 855–863). If both props are populated — which is the expected steady state after the runtime model detection lands — the model label appears once in the header and again inside the timing panel on every output view.

Consider suppressing the header's modelDisplay when timing.model is also set (or vice versa), so the model is shown in exactly one place.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tui/components/RightPanel.tsx` around lines 600 - 605, TimingSummary is
already rendering the model via timing.model, which causes a duplicate when
TaskOutputView also renders currentModel/modelDisplay in its header; update
TaskOutputView's header rendering logic to skip showing modelDisplay when the
TimingSummary/timing.model prop is present (or alternatively have TimingSummary
not render the model if currentModel is set), i.e., add a conditional around the
header's modelDisplay render so it only shows when timing.model is falsy (or
vice versa), referencing the TimingSummary component, timing.model,
TaskOutputView, currentModel and modelDisplay symbols to locate and change the
conditional rendering.
🧹 Nitpick comments (5)
src/tui/components/RightPanel.tsx (2)

563-568: Identical model-parsing IIFE is duplicated in TaskOutputView (lines 838–842).

Both blocks split a "provider/model" string into { provider, model, full, display } using the same logic. Extract it to a module-level helper to avoid drift.

♻️ Suggested extraction
+/**
+ * Parse a provider/model string into its components for display.
+ */
+function parseModelDisplay(modelString: string) {
+  const [provider, model] = modelString.includes('/')
+    ? modelString.split('/')
+    : ['', modelString];
+  return { provider, model, full: modelString, display: provider ? `${provider}/${model}` : model };
+}

Then in both TimingSummary and TaskOutputView:

-  const modelDisplay = timing.model
-    ? (() => {
-        const [provider, model] = timing.model!.includes('/') ? timing.model!.split('/') : ['', timing.model!];
-        return { provider, model, full: timing.model!, display: provider ? `${provider}/${model}` : model };
-      })()
-    : null;
+  const modelDisplay = timing.model ? parseModelDisplay(timing.model) : null;
-  const modelDisplay = currentModel
-    ? (() => {
-        const [provider, model] = currentModel.includes('/') ? currentModel.split('/') : ['', currentModel];
-        return { provider, model, full: currentModel, display: provider ? `${provider}/${model}` : model };
-      })()
-    : null;
+  const modelDisplay = currentModel ? parseModelDisplay(currentModel) : null;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tui/components/RightPanel.tsx` around lines 563 - 568, The model-parsing
IIFE used to compute modelDisplay is duplicated in TimingSummary and
TaskOutputView; extract that logic into a module-level helper (e.g., a function
named parseModel or formatModelString) that accepts the timing.model string and
returns { provider, model, full, display }, then replace the inline IIFE in
TimingSummary (where modelDisplay is computed) and the duplicated block in
TaskOutputView with calls to this new helper to avoid drift and centralize
parsing logic.

505-507: toLocaleString() without a fixed locale produces inconsistent output across environments.

In a TUI running across different user locales, the same token count may render as 15,234, 15.234, or 15 234. Passing 'en-US' makes the output deterministic, or the helper can be removed in favour of formatTokenCount from token-format.ts (already imported in LeftPanel.tsx).

♻️ Proposed fix
 function formatTokenNumber(tokens: number): string {
-  return tokens.toLocaleString();
+  return tokens.toLocaleString('en-US');
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tui/components/RightPanel.tsx` around lines 505 - 507, The
formatTokenNumber function uses tokens.toLocaleString() which yields
locale-dependent separators; change it to produce deterministic output by either
calling tokens.toLocaleString('en-US') or, better, replace formatTokenNumber
with the existing utility formatTokenCount from token-format.ts (import and use
formatTokenCount in RightPanel.tsx) so token rendering is consistent across
environments; update any references to formatTokenNumber accordingly.
src/tui/components/LeftPanel.tsx (1)

74-83: minimalTitlePlusIndicator restates what usageIndicatorWidth already captures.

usageIndicatorWidth is usageIndicator.length + 1 and minimalTitlePlusIndicator is 5 + usageIndicator.length + 1, which is just 5 + usageIndicatorWidth. Reusing the already-computed value makes the intent clearer.

♻️ Suggested simplification
-  const minimalTitlePlusIndicator = 5 + usageIndicator.length + 1;
+  const minimalTitlePlusIndicator = 5 + usageIndicatorWidth;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tui/components/LeftPanel.tsx` around lines 74 - 83, The
minimalTitlePlusIndicator calculation duplicates usageIndicatorWidth; update
minimalTitlePlusIndicator to use the precomputed usageIndicatorWidth (i.e., set
minimalTitlePlusIndicator = 5 + usageIndicatorWidth) to clarify intent and avoid
recomputing usageIndicator.length; keep the rest of the logic using
usageIndicatorWidth, shouldShowUsageIndicator, and titleWidth unchanged
(references: usageIndicator, usageIndicatorWidth, minimalTitlePlusIndicator,
shouldShowUsageIndicator, titleWidth).
src/engine/index.ts (2)

1394-1415: Consider preserving partial usage in failedResult.

iterationUsageAccumulator (defined in the enclosing scope) may already hold samples collected before the exception was thrown. Adding it to failedResult would surface partial token consumption for error cases rather than silently discarding it.

♻️ Proposed addition to failedResult
       const failedResult: IterationResult = {
         iteration,
         status: 'failed',
         task,
         taskCompleted: false,
         promiseComplete: false,
         durationMs: endedAt.getTime() - startedAt.getTime(),
         error: errorMessage,
+        usage: iterationUsageAccumulator.hasData()
+          ? iterationUsageAccumulator.getSummary()
+          : undefined,
         startedAt: startedAt.toISOString(),
         endedAt: endedAt.toISOString(),
       };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/engine/index.ts` around lines 1394 - 1415, The failedResult currently
omits partial token usage collected in iterationUsageAccumulator; update the
construction of failedResult inside the catch block to include the accumulator's
current value (e.g., a usage/usageAccumulator or iterationUsage field) so
partial usage is preserved on error, ensuring the IterationResult shape
(type/interface) accommodates this property and referencing
iterationUsageAccumulator and failedResult when making the change.

1158-1176: Extract shared Droid result-processing logic into a local helper.

The streaming loop body (lines 1158–1176) and the flush loop body (1204–1222) are now byte-for-byte identical. Extracting a shared helper eliminates the divergence risk if emitStreamingTelemetry or the subagent parser wiring ever needs updating again.

♻️ Proposed refactor — local helper before the outer try block
+    const processDroidParserResult = (
+      result: { success: boolean; message: unknown }
+    ): void => {
+      if (!result.success) return;
+      if (isDroidJsonlMessage(result.message)) {
+        emitStreamingTelemetry(result.message.raw);
+        for (const normalized of toClaudeJsonlMessages(result.message)) {
+          this.subagentParser.processMessage(normalized);
+        }
+      } else {
+        const parsed = result.message as unknown;
+        if (typeof parsed === 'object' && parsed !== null) {
+          emitStreamingTelemetry(parsed as Record<string, unknown>);
+        }
+        this.subagentParser.processMessage(result.message);
+      }
+    };
+
     try {
       ...
       onStdout: (data) => {
         ...
         if (droidJsonlParser && isDroidAgent) {
           const results = droidJsonlParser.push(data);
-          for (const result of results) {
-            if (result.success) {
-              if (isDroidJsonlMessage(result.message)) {
-                emitStreamingTelemetry(result.message.raw);
-                for (const normalized of toClaudeJsonlMessages(result.message)) {
-                  this.subagentParser.processMessage(normalized);
-                }
-              } else {
-                const parsed = result.message as unknown;
-                if (typeof parsed === 'object' && parsed !== null) {
-                  emitStreamingTelemetry(parsed as Record<string, unknown>);
-                }
-                this.subagentParser.processMessage(result.message);
-              }
-            }
-          }
+          for (const result of results) { processDroidParserResult(result); }
         }
       },
       ...
       // Flush block
       if (droidJsonlParser && isDroidAgent) {
         const remaining = droidJsonlParser.flush();
-        for (const result of remaining) {
-          if (result.success) {
-            if (isDroidJsonlMessage(result.message)) {
-              emitStreamingTelemetry(result.message.raw);
-              for (const normalized of toClaudeJsonlMessages(result.message)) {
-                this.subagentParser.processMessage(normalized);
-              }
-            } else {
-              const parsed = result.message as unknown;
-              if (typeof parsed === 'object' && parsed !== null) {
-                emitStreamingTelemetry(parsed as Record<string, unknown>);
-              }
-              this.subagentParser.processMessage(result.message);
-            }
-          }
-        }
+        for (const result of remaining) { processDroidParserResult(result); }
       }

Also applies to: 1204-1222

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/engine/index.ts` around lines 1158 - 1176, Extract the duplicate Droid
result-processing logic into a single local helper function (e.g.,
processDroidResults) declared before the outer try block and call it from both
the streaming loop and the flush loop; the helper should accept the results
array returned by droidJsonlParser.push/flush and implement the existing
behavior: for each result where result.success is true, if
isDroidJsonlMessage(result.message) call
emitStreamingTelemetry(result.message.raw) and for each message from
toClaudeJsonlMessages(result.message) call
this.subagentParser.processMessage(normalized), else if result.message is a
non-null object call emitStreamingTelemetry(parsed) and finally call
this.subagentParser.processMessage(result.message). Replace the duplicated
blocks with a single call to processDroidResults(results) in both places so
emitStreamingTelemetry, toClaudeJsonlMessages, isDroidJsonlMessage and
this.subagentParser.processMessage are used consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/tui/components/RightPanel.tsx`:
- Line 583: The totalTokens calculation in RightPanel.tsx uses nullish
coalescing (const totalTokens = usage?.totalTokens ?? inputTokens +
outputTokens) which incorrectly preserves 0; change the logic to mirror
LeftPanel's guard by using a conditional that prefers usage.totalTokens only
when it's greater than 0 (e.g., use usage?.totalTokens > 0 ? usage.totalTokens :
inputTokens + outputTokens) so totalTokens falls back to inputTokens +
outputTokens when usage.totalTokens is absent or zero; update the expression
referencing usage, totalTokens, inputTokens, and outputTokens accordingly.

---

Outside diff comments:
In `@src/tui/components/RightPanel.tsx`:
- Around line 600-605: TimingSummary is already rendering the model via
timing.model, which causes a duplicate when TaskOutputView also renders
currentModel/modelDisplay in its header; update TaskOutputView's header
rendering logic to skip showing modelDisplay when the TimingSummary/timing.model
prop is present (or alternatively have TimingSummary not render the model if
currentModel is set), i.e., add a conditional around the header's modelDisplay
render so it only shows when timing.model is falsy (or vice versa), referencing
the TimingSummary component, timing.model, TaskOutputView, currentModel and
modelDisplay symbols to locate and change the conditional rendering.

---

Nitpick comments:
In `@src/engine/index.ts`:
- Around line 1394-1415: The failedResult currently omits partial token usage
collected in iterationUsageAccumulator; update the construction of failedResult
inside the catch block to include the accumulator's current value (e.g., a
usage/usageAccumulator or iterationUsage field) so partial usage is preserved on
error, ensuring the IterationResult shape (type/interface) accommodates this
property and referencing iterationUsageAccumulator and failedResult when making
the change.
- Around line 1158-1176: Extract the duplicate Droid result-processing logic
into a single local helper function (e.g., processDroidResults) declared before
the outer try block and call it from both the streaming loop and the flush loop;
the helper should accept the results array returned by
droidJsonlParser.push/flush and implement the existing behavior: for each result
where result.success is true, if isDroidJsonlMessage(result.message) call
emitStreamingTelemetry(result.message.raw) and for each message from
toClaudeJsonlMessages(result.message) call
this.subagentParser.processMessage(normalized), else if result.message is a
non-null object call emitStreamingTelemetry(parsed) and finally call
this.subagentParser.processMessage(result.message). Replace the duplicated
blocks with a single call to processDroidResults(results) in both places so
emitStreamingTelemetry, toClaudeJsonlMessages, isDroidJsonlMessage and
this.subagentParser.processMessage are used consistently.

In `@src/tui/components/LeftPanel.tsx`:
- Around line 74-83: The minimalTitlePlusIndicator calculation duplicates
usageIndicatorWidth; update minimalTitlePlusIndicator to use the precomputed
usageIndicatorWidth (i.e., set minimalTitlePlusIndicator = 5 +
usageIndicatorWidth) to clarify intent and avoid recomputing
usageIndicator.length; keep the rest of the logic using usageIndicatorWidth,
shouldShowUsageIndicator, and titleWidth unchanged (references: usageIndicator,
usageIndicatorWidth, minimalTitlePlusIndicator, shouldShowUsageIndicator,
titleWidth).

In `@src/tui/components/RightPanel.tsx`:
- Around line 563-568: The model-parsing IIFE used to compute modelDisplay is
duplicated in TimingSummary and TaskOutputView; extract that logic into a
module-level helper (e.g., a function named parseModel or formatModelString)
that accepts the timing.model string and returns { provider, model, full,
display }, then replace the inline IIFE in TimingSummary (where modelDisplay is
computed) and the duplicated block in TaskOutputView with calls to this new
helper to avoid drift and centralize parsing logic.
- Around line 505-507: The formatTokenNumber function uses
tokens.toLocaleString() which yields locale-dependent separators; change it to
produce deterministic output by either calling tokens.toLocaleString('en-US')
or, better, replace formatTokenNumber with the existing utility formatTokenCount
from token-format.ts (import and use formatTokenCount in RightPanel.tsx) so
token rendering is consistent across environments; update any references to
formatTokenNumber accordingly.

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (2)
src/tui/components/RightPanel.tsx (2)

513-521: isRunning prop is redundant — timing?.isRunning is already accessible inside the component.

TimingSummary has two sources of truth for the same boolean:

  • The isRunning: boolean prop (used for layout margin, usage fallback message, and context-row gate).
  • timing?.isRunning (used for the useEffect guard and duration calculation).

Since the caller derives isLiveStreaming directly from iterationTiming?.isRunning === true and passes it as isRunning, these are always identical. If they ever drift (e.g., a caller passes a stale value), the timer and the context-row UI would behave inconsistently.

♻️ Suggested simplification
 function TimingSummary({
   timing,
   usage,
-  isRunning,
 }: {
   timing?: IterationTimingInfo;
   usage?: RightPanelProps['taskUsage'];
-  isRunning: boolean;
 }): ReactNode {
+  const isRunning = timing?.isRunning ?? false;

Remove the prop from the call site in TaskOutputView:

 <TimingSummary
   timing={iterationTiming}
   usage={taskUsage}
-  isRunning={isLiveStreaming}
 />

The isLiveStreaming variable can still remain in TaskOutputView for other local uses.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tui/components/RightPanel.tsx` around lines 513 - 521, TimingSummary
accepts an unnecessary isRunning prop that duplicates timing?.isRunning and
creates two sources of truth; remove the isRunning parameter from TimingSummary
and update its signature and internal uses to read timing?.isRunning directly
(including the useEffect guard, duration calculation, layout margin, usage
fallback message, and context-row gate). In the caller TaskOutputView, stop
passing isLiveStreaming into TimingSummary (you may keep isLiveStreaming locally
in TaskOutputView for other logic) so TimingSummary solely relies on the timing
prop (IterationTimingInfo) for its running state.

563-568: Duplicate model-parsing logic between TimingSummary and TaskOutputView.

The same split('/') + shape construction appears verbatim in both components (lines 563–568 and 837–842). Extracting it to a module-level helper prevents the two from drifting.

♻️ Suggested helper extraction
+function parseModelDisplay(model: string) {
+  const [provider, ...rest] = model.split('/');
+  const modelName = rest.join('/') || provider;
+  const display = rest.length > 0 ? `${provider}/${modelName}` : modelName;
+  return { provider: rest.length > 0 ? provider : '', model: modelName, full: model, display };
+}

Then replace both inlined IIFEs:

-  const modelDisplay = timing.model
-    ? (() => {
-        const [provider, model] = timing.model!.includes('/') ? timing.model!.split('/') : ['', timing.model!];
-        return { provider, model, full: timing.model!, display: provider ? `${provider}/${model}` : model };
-      })()
-    : null;
+  const modelDisplay = timing.model ? parseModelDisplay(timing.model) : null;
-  const modelDisplay = currentModel
-    ? (() => {
-        const [provider, model] = currentModel.includes('/') ? currentModel.split('/') : ['', currentModel];
-        return { provider, model, full: currentModel, display: provider ? `${provider}/${model}` : model };
-      })()
-    : null;
+  const modelDisplay = currentModel ? parseModelDisplay(currentModel) : null;

The helper also fixes silent truncation of model strings with more than one / separator (e.g., anthropic/claude/v3).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tui/components/RightPanel.tsx` around lines 563 - 568, Duplicate inline
model-parsing logic in TimingSummary and TaskOutputView should be extracted into
a module-level helper (e.g., parseModel(modelStr: string | undefined | null))
that returns an object { provider, model, full, display } or null for falsey
input; implement the helper to split only on the first '/' (so strings like
"anthropic/claude/v3" become provider="anthropic", model="claude/v3") and build
display as provider ? `${provider}/${model}` : model, then replace the IIFE
blocks that compute modelDisplay in both TimingSummary and TaskOutputView to
call parseModel(timing.model) (preserving null handling and types).
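To illustrate the truncation the suggested helper fixes, here is a small sketch (reproducing the parseModelDisplay helper proposed above) comparing it against the original two-element destructuring:

```typescript
// The suggested parseModelDisplay helper, reproduced for illustration.
function parseModelDisplay(model: string) {
  const [provider, ...rest] = model.split('/');
  const modelName = rest.join('/') || provider;
  const display = rest.length > 0 ? `${provider}/${modelName}` : modelName;
  return { provider: rest.length > 0 ? provider : '', model: modelName, full: model, display };
}

// The original inline destructuring silently drops everything after the
// second '/': here p === "anthropic" and m === "claude", so "v3" is lost.
const [p, m] = 'anthropic/claude/v3'.split('/');

// The helper splits only on the first '/' and keeps the remainder intact:
// model === "claude/v3", display === "anthropic/claude/v3".
const parsed = parseModelDisplay('anthropic/claude/v3');
```

A bare model string with no provider (e.g. "gpt-4") falls through to an empty provider and an unchanged display, preserving the existing behavior for single-segment models.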
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/tui/components/RightPanel.tsx`:
- Around line 570-583: The calculation for totalTokens repeats
usage?.totalTokens ?? 0; introduce a single intermediate variable (e.g.,
totalTokensRaw or usageTotalTokens) and use it in computing totalTokens,
updating the expressions around totalTokens, inputTokens, outputTokens and
hasUsageTelemetry to reference that variable; change references in the
RightPanel component to use this new symbol instead of evaluating
usage?.totalTokens ?? 0 twice so the logic remains identical but avoids
duplicate evaluation.


@subsy subsy merged commit df37772 into main Feb 20, 2026
7 checks passed
@subsy subsy deleted the fix/task-usage-indicators branch February 20, 2026 21:31
@codecov

codecov bot commented Feb 20, 2026

Codecov Report

❌ Patch coverage is 86.29893% with 77 lines in your changes missing coverage. Please review.
✅ Project coverage is 45.95%. Comparing base (b9714f3) to head (2c76486).
⚠️ Report is 13 commits behind head on main.

Files with missing lines Patch % Lines
src/plugins/agents/usage.ts 91.27% 39 Missing ⚠️
src/engine/index.ts 45.09% 28 Missing ⚠️
src/logs/persistence.ts 91.83% 4 Missing ⚠️
src/remote/server.ts 25.00% 3 Missing ⚠️
src/remote/client.ts 0.00% 2 Missing ⚠️
src/remote/instance-manager.ts 0.00% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #326      +/-   ##
==========================================
+ Coverage   45.47%   45.95%   +0.48%     
==========================================
  Files          98      100       +2     
  Lines       31497    32252     +755     
==========================================
+ Hits        14323    14823     +500     
- Misses      17174    17429     +255     
Files with missing lines Coverage Δ
src/engine/types.ts 100.00% <ø> (ø)
src/logs/types.ts 100.00% <ø> (ø)
src/remote/types.ts 100.00% <ø> (ø)
src/tui/output-parser.ts 71.84% <100.00%> (+0.23%) ⬆️
src/remote/instance-manager.ts 41.65% <0.00%> (-0.07%) ⬇️
src/remote/client.ts 27.63% <0.00%> (-0.06%) ⬇️
src/remote/server.ts 4.88% <25.00%> (+0.05%) ⬆️
src/logs/persistence.ts 54.90% <91.83%> (+2.15%) ⬆️
src/engine/index.ts 58.18% <45.09%> (-0.39%) ⬇️
src/plugins/agents/usage.ts 91.27% <91.27%> (ø)

... and 1 file with indirect coverage changes

