fix(analytics): avg duration handles both quoted and unquoted duration_s by RyanAlberts · Pull Request #1728 · garrytan/gstack

RyanAlberts · 2026-05-26T17:53:33Z

Summary

gstack-analytics (the local-usage dashboard) silently displayed (avg 0s) next to any /skill whose completion events emit "duration_s" as a quoted JSON string. The completion-status preamble that every skill runs (scripts/resolvers/preamble/generate-completion-status.ts:70) and bin/gstack-codex-probe:84 both emit "duration_s":"<N>", so the dominant in-the-wild case rendered as 0s and most per-skill averages in the dashboard have been wrong since this code path was added. bin/gstack-telemetry-log emits the unquoted numeric form, which is why the bug wasn't visible for every skill.

The companion Total time: line at the bottom of the dashboard was always correct — it uses a different parser, so the bug was confined to the per-/skill (avg X) column.

Current behavior on upstream main

Repro on origin/main (cf50443b):

mkdir -p /tmp/repro/.gstack/analytics
cat > /tmp/repro/.gstack/analytics/skill-usage.jsonl <<'JSONL'
{"skill":"review","duration_s":"120","outcome":"success","ts":"2026-05-25T10:00:00Z"}
{"skill":"review","duration_s":"180","outcome":"success","ts":"2026-05-25T11:00:00Z"}
{"skill":"review","duration_s":"60","outcome":"success","ts":"2026-05-25T12:00:00Z"}
{"skill":"qa","duration_s":300,"outcome":"success","ts":"2026-05-25T13:00:00Z"}
{"skill":"qa","duration_s":420,"outcome":"success","ts":"2026-05-25T14:00:00Z"}
JSONL
GSTACK_STATE_DIR=/tmp/repro/.gstack bin/gstack-analytics all

Before:

gstack usage (all time)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  /review               ████████████████████  3 runs  (avg 0s)   <-- WRONG, should be 2m
  /qa                   █████████████  2 runs  (avg 6m)
Total time: 18m

Total time: 18m matches the real sum (120 + 180 + 60 + 300 + 420 = 1080s = 18m), confirming the total-time parser is already correct. Only the per-skill (avg ...) column is broken.

Root cause

The AVG_DUR block in bin/gstack-analytics:165-177 splits each line on the literal "duration_s": prefix and then strips non-numeric characters from parts[2]:

n = split($0, parts, "\"duration_s\":")
if (n >= 2) {
  gsub(/[^0-9.].*/, "", parts[2])      # greedy .*
  if (parts[2]+0 > 0) { total += parts[2]; count++ }
}

Unquoted form 600,"ts":... → gsub matches at the comma → parts[2] becomes 600 → accumulates.
Quoted form "600","ts":... → gsub matches at the leading " (position 0), and the trailing .* consumes the rest of the line → parts[2] becomes the empty string → "" + 0 > 0 is false → never accumulates → count stays 0 → average renders as 0.

TOTAL_DURATION (bin/gstack-analytics:114-125) uses -F'[:,]' field splitting and a non-greedy gsub(/[^0-9.]/, "", val), which isolates the value before stripping. That parser handles both forms and isn't affected.

Fix

Strip an optional leading double-quote with sub(/^"/, "", parts[2]) before the existing gsub so quoted and unquoted forms take the same path through the existing digit-extraction:

sub(/^"/, "", parts[2])                # NEW
gsub(/[^0-9.].*/, "", parts[2])

Three lines in bin/gstack-analytics (one new statement + two-line comment update). No producer-side changes — the parser fix accommodates both emitter formats that are already in the wild, so existing JSONL files on users' disks render correctly without migration.

Edge cases verified by hand:

Input value	After `sub(/^"/)`	After `gsub(/[^0-9.].*/)`	Accumulated?
`600,"ts":...` (unquoted)	`600,"ts":...`	`600`	yes
`"600","ts":...` (quoted)	`600","ts":...`	`600`	yes (was: no)
`null,"ts":...`	`null,"ts":...`	`` (empty)	no — `"" + 0 > 0` is false
`"1.5","ts":...` (decimal)	`1.5","ts":...`	`1.5`	yes

Validation

After the fix, same repro renders correctly:

  /review               ████████████████████  3 runs  (avg 2m)   <-- (120+180+60)/3 = 120s = 2m
  /qa                   █████████████  2 runs  (avg 6m)
Total time: 18m

New regression test in test/gstack-analytics-avg-duration.test.ts exercises four cases (quoted-only, unquoted-only, mixed quoted+unquoted for the same skill, and total-time aggregation). I confirmed it fails on the pre-fix tree (expected "2m", got "0s") by stashing the patch and re-running, then passes after popping the stash.

$ bun test test/gstack-analytics-avg-duration.test.ts
 4 pass
 0 fail
 4 expect() calls

$ bun test test/telemetry.test.ts test/gstack-analytics-avg-duration.test.ts
 39 pass
 0 fail
 87 expect() calls

$ bun test
exit 0

Scope

bin/gstack-analytics (3 lines) + 1 new test file. No SKILL.md/template changes, no VERSION/CHANGELOG bumps (per the wave-process note in CONTRIBUTING.md). The producer-side emitters (generate-completion-status.ts, gstack-codex-probe, gstack-telemetry-log) are untouched — fixing the parser keeps existing JSONL on users' machines readable without a migration.

The per-skill (avg X) column in gstack-analytics silently rendered as "0s" for any /skill whose completion events emit "duration_s" as a quoted JSON string. The completion-status preamble that every skill runs (scripts/resolvers/preamble/generate-completion-status.ts:70) and gstack-codex-probe both emit "duration_s":"<N>", so the dominant case was wrong; gstack-telemetry-log emits the unquoted form which is why the bug didn't show up everywhere. The AVG_DUR awk block split each line on the literal "duration_s": prefix and then ran gsub(/[^0-9.].*/, "", parts[2]). For unquoted 600,"ts":... the gsub matches at the comma and yields "600". For quoted "600","ts":... it matches at the leading double-quote (position 0) and the trailing .* consumes the entire rest of the line, leaving an empty string that never accumulates. The companion TOTAL_DURATION calculation (line 114) uses a different parser keyed on -F'[:,]' field splitting + non-greedy gsub, so it isolates the value before stripping and was always correct. Only the per-skill average was broken. Fix: strip an optional leading double-quote with sub(/^"/, ...) before the existing gsub so quoted and unquoted forms take the same path. Adds test/gstack-analytics-avg-duration.test.ts covering quoted-only, unquoted-only, mixed, and total-time aggregation. The test fails on the pre-fix tree (expected "2m", got "0s") and passes after the fix.

jbetala7 · 2026-05-27T06:28:39Z

Independent validation — this looks correct.

Confirmed the bug on origin/main: the emit sites do write quoted strings — scripts/resolvers/preamble/generate-completion-status.ts:70 and bin/gstack-codex-probe:84 both emit "duration_s":"<N>", which is the dominant in-the-wild form. With the pre-fix awk, parts[2] for a quoted value is "120",...; the leading " is non-numeric, so gsub(/[^0-9.].*/, "", parts[2]) erases the whole token → 0, and those rows silently drop out of both the per-skill avg and the total.

Running this PR's test/gstack-analytics-avg-duration.test.ts against origin/main fails 2/4 (the mixed case yields 50s instead of 40s because the quoted "30" is dropped; the all-quoted review case averages to 0). On the PR branch all 4 pass. The sub(/^"/, "", parts[2]) strip is the minimal correct fix and leaves the unquoted numeric path (from gstack-telemetry-log) untouched.

No competing open PR touches bin/gstack-analytics for this. Nice catch and good regression coverage.

garrytan mentioned this pull request Jun 3, 2026

v1.56.1.0 fix(sync): staging-dir ownership guard + resume-correctness fixes (#1802) #1856

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(analytics): avg duration handles both quoted and unquoted duration_s#1728

fix(analytics): avg duration handles both quoted and unquoted duration_s#1728
RyanAlberts wants to merge 1 commit into
garrytan:mainfrom
RyanAlberts:fix-analytics-avg-duration-quoted

RyanAlberts commented May 26, 2026

Uh oh!

jbetala7 commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

RyanAlberts commented May 26, 2026

Summary

Current behavior on upstream main

Root cause

Fix

Validation

Scope

Uh oh!

jbetala7 commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants