gstack-global-discover: session counts conflate originator types and undercount CC by ~5x #1315

@Akagilnc

Summary

gstack-global-discover reports sessions.{claude_code,codex,gemini} counts that misrepresent actual development activity in two ways. /retro global (which builds narrative on these counts) inherits both errors.

Tested on gstack 1.26.2.0 with a 31-day window across 9 repos.

Problem 1 — codex sessions.codex conflates Codex Desktop with codex_exec subagent calls

The script counts every ~/.codex/sessions/.../rollout-*.jsonl whose payload.cwd resolves to a repo. But codex rollout files are produced by very different agents, and the originator field distinguishes them:

| originator | what it is | dev signal |
| --- | --- | --- |
| Codex Desktop | user driving codex CLI interactively | yes, real codex dev |
| codex_exec | codex exec - (cron / scripts / subagent invocations) | no, usually CC firing codex as cross-model review or external voice |
| Claude Code | codex via CC's MCP / subagent integration | no, codex is the tool, CC is the driver |

Real numbers from one user's 31d window:

ak-ai-vela:        codex_exec=97  Codex Desktop=1   ← reported as 'codex 98 sessions' but vela was 100% CC dev
ak-where-to-go:    codex_exec=2   Codex Desktop=92  ← real codex dev phase (matches user's recollection)
ai-blogger-lab:    codex_exec=88  Codex Desktop=10  ← codex_exec are CC subagents during dev/review, not codex dev
ak-fund-advisor:   codex_exec=49  Codex Desktop=1   ← all CMR subagents, not codex dev
ak-cc-wiki:        codex_exec=57  Codex Desktop=0   ← CC subagents during wiki writing

Effect: the /retro global narrative concluded "codex was the primary execution tool (414 sessions across 7 repos)" when in fact codex drove dev for only one repo's middle phase (~92 Desktop sessions in where-to-go). The other ~309 codex_exec entries were CC firing codex as a cross-model review subagent.
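To make the split concrete, here is a minimal Python sketch that buckets the per-repo numbers listed above by originator. The names `ORIGINATOR_BUCKETS`, `SAMPLE`, and `bucket_counts` are hypothetical and not part of gstack; the bucket labels are only illustrative, and the sample covers just the five repos shown above, so its totals will not add up to the 414 reported across 7 repos.

```python
from collections import Counter

# Hypothetical mapping from payload.originator to a reporting bucket.
# The originator strings come from the table above; the bucket names are illustrative.
ORIGINATOR_BUCKETS = {
    "Codex Desktop": "codex_desktop_sessions",
    "codex_exec": "codex_exec_invocations",
    "Claude Code": "claude_code_subagent_invocations",
}

# (repo, originator) -> rollout file count, copied from the 31d numbers above.
SAMPLE = {
    ("ak-ai-vela", "codex_exec"): 97,
    ("ak-ai-vela", "Codex Desktop"): 1,
    ("ak-where-to-go", "codex_exec"): 2,
    ("ak-where-to-go", "Codex Desktop"): 92,
    ("ai-blogger-lab", "codex_exec"): 88,
    ("ai-blogger-lab", "Codex Desktop"): 10,
    ("ak-fund-advisor", "codex_exec"): 49,
    ("ak-fund-advisor", "Codex Desktop"): 1,
    ("ak-cc-wiki", "codex_exec"): 57,
    ("ak-cc-wiki", "Codex Desktop"): 0,
}


def bucket_counts(counts):
    """Sum rollout counts per bucket; unknown originators land in 'unknown'."""
    totals = Counter()
    for (_repo, originator), n in counts.items():
        totals[ORIGINATOR_BUCKETS.get(originator, "unknown")] += n
    return dict(totals)


if __name__ == "__main__":
    for bucket, n in sorted(bucket_counts(SAMPLE).items()):
        print(f"{bucket:35s} {n:4d}")
```

Reported separately, the interactive bucket is dominated by ak-where-to-go, which matches the recollection that only that repo had a real codex dev phase.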

Problem 2 — CC session count under-reports by ~5x on cron-driven projects

Discovery reported claude_code: 129 total sessions in the 31d window. A direct scan of ~/.claude/projects/*/*.jsonl filtered by mtime >= since shows ~621 in the same window.

The biggest gap is ai-blogger-lab:

  • Discovery: 6 CC sessions
  • Direct scan: ~450 CC jsonl files (all CCR / cron-driven; no human in loop)

Plausible causes (didn't fully diagnose):

  • Discovery may skip CC project dirs that map to git worktree paths (-claude-worktrees-* / --worktrees-* show up as separate dirs)
  • Or it dedupes by some session_id key that collapses identical-cwd entries

Either way the count is wrong by enough to flip narrative conclusions.
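For reference, the direct scan described above can be approximated with a short Python sketch. It assumes one `.jsonl` file per CC session directly under each project dir in `~/.claude/projects`, which is how the files look on my machine but is not a documented contract; `count_cc_sessions` is a hypothetical helper name.

```python
import os
import time
from collections import Counter


def count_cc_sessions(projects_root, since):
    """Count *.jsonl files per project dir whose mtime is >= since (epoch seconds)."""
    counts = Counter()
    if not os.path.isdir(projects_root):
        return counts
    for project in os.scandir(projects_root):
        if not project.is_dir():
            continue
        for f in os.scandir(project.path):
            # Only top-level session files count; nested dirs are ignored.
            if f.is_file() and f.name.endswith(".jsonl") and f.stat().st_mtime >= since:
                counts[project.name] += 1
    return counts


if __name__ == "__main__":
    root = os.path.expanduser("~/.claude/projects")
    since = time.time() - 31 * 86400  # same 31-day window as the discovery run
    counts = count_cc_sessions(root, since)
    for name, n in counts.most_common():
        print(f"{name:60s} {n:4d}")
    print(f"total: {sum(counts.values())}")
```

Comparing the per-dir output with the discovery numbers should show which project dirs are being dropped or collapsed.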

Suggested fix

  1. Parse codex payload.originator in scanCodex() and bucket separately:

    • codex_desktop_sessions — interactive dev signal
    • codex_exec_invocations — subagent / cron / scripted (annotate as "called by another agent" in retro)
    • claude_code_subagent_invocations — codex via CC

    The retro narrative should then attribute Codex Desktop counts to "codex dev" and codex_exec counts to "another agent using codex as a subagent", rather than lumping them together.

  2. Investigate the CC undercount — likely worktree path normalization. Repro: any project whose CC sessions live in WorkSpace-<repo>--worktrees-<branch>/ style dirs.

  3. Annotate /retro global output that "sessions" = "tool invocations / file count", not "distinct dev sessions". Especially with CCR / cron drivers in the mix, the same repo can show 450 CC "sessions" with zero interactive work.
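If worktree path normalization does turn out to be the cause in item 2, the fix could look roughly like the Python sketch below. This is a guess at the shape of the fix, not gstack's code: the marker patterns are taken only from the dir names observed in this issue (`--worktrees-*` and `-claude-worktrees-*`), the helper names are hypothetical, and the heuristic would misfire on a repo whose own name contains one of those markers.

```python
import re
from collections import Counter

# Assumed worktree markers in CC project dir names, based only on the
# patterns seen in this issue: "--worktrees-<branch>" and "-claude-worktrees-<branch>".
_WORKTREE_SUFFIX = re.compile(r"(?:-+claude-worktrees-|--+worktrees-).*$")


def normalize_project_dir(name):
    """Strip a trailing worktree marker and branch so worktree dirs map to their repo dir."""
    return _WORKTREE_SUFFIX.sub("", name, count=1)


def merge_worktree_counts(counts):
    """Fold per-dir session counts into per-repo counts using the normalized name."""
    merged = Counter()
    for name, n in counts.items():
        merged[normalize_project_dir(name)] += n
    return dict(merged)


if __name__ == "__main__":
    # Hypothetical dir names and counts, for illustration only.
    sample = {
        "WorkSpace-demo-repo": 2,
        "WorkSpace-demo-repo--worktrees-feature-a": 3,
        "WorkSpace-demo-repo-claude-worktrees-feature-b": 4,
    }
    print(merge_worktree_counts(sample))
```

A single dash before `worktrees-` is deliberately not treated as a marker, so a repo called something like `my-worktrees-tool` keeps its own bucket.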

Repro

# Originator breakdown for a 31d window:
python3 -c "
import os, json, time
since = time.time() - 31*86400
root = os.path.expanduser('~/.codex/sessions')
counts = {}
for dirpath, _, files in os.walk(root):
    for f in files:
        if not (f.startswith('rollout-') and f.endswith('.jsonl')): continue
        fp = os.path.join(dirpath, f)
        if os.stat(fp).st_mtime < since: continue
        # The first line holds the session metadata; read at most 128 KiB of it.
        with open(fp,'rb') as fh: buf = fh.read(131072)
        line = buf.split(b'\n',1)[0].decode('utf-8','replace')
        try: d = json.loads(line)
        except ValueError: continue  # empty file, or first line longer than 128 KiB
        p = d.get('payload') or {}
        repo = os.path.basename(p.get('cwd') or '')
        orig = p.get('originator') or '?'
        counts[(repo,orig)] = counts.get((repo,orig),0)+1
for k,v in sorted(counts.items(), key=lambda x:-x[1]):
    print(f'{k[0]:35s} {k[1]:25s} {v:4d}')
"

# Compare against discovery output:
~/.claude/skills/gstack/bin/gstack-global-discover --since 31d

Why this matters

Without these fixes, /retro global produces confidently-wrong narratives. In my case it told me codex was my main dev tool (driving 7 repos) when codex actually drove dev for one repo's middle phase. CC was the primary driver everywhere — including powering blogger-lab's cron pipeline through CCR. The session counts were the entire foundation of the retro's "tool usage analysis" section.

Same shape as a class of bugs already documented in vault wiki: a metric that conflates distinct things will mislead any narrative built on it, regardless of how thorough the narrative is.
