Skip to content

refactor(git): collapse three git fleets into one provider (kz8.3)#8

Merged
brandon-fryslie merged 8 commits into
mainfrom
bf/kz8.3-git-fleet-collapse
May 16, 2026
Merged

refactor(git): collapse three git fleets into one provider (kz8.3)#8
brandon-fryslie merged 8 commits into
mainfrom
bf/kz8.3-git-fleet-collapse

Conversation

@brandon-fryslie

Copy link
Copy Markdown
Contributor

Summary

Closes brandon-process-discipline-kz8.3 — the third deliverable of the kz8 epic (eliminate process-spawn pileup).

Pre-kz8.3 the codebase carried three independent git-shell-out implementations wrapped by two nested caches, each with its own watchers and its own invalidation policy. On git commit they all fired simultaneously: ≥20 git processes for one repo. This PR collapses the stack:

  • One providerGitDataProvider (renamed from CachedGitService) owns the cache, the watcher set per repo, and the single `git` launch category. Pull surface (getInfo) for segments; new push surface (subscribe) for MobX-driven var-system consumers. `[LAW:one-source-of-truth]` `[LAW:single-enforcer]`
  • One cachesrc/utils/git-cache.ts (the disk-backed withGitCache) is deleted; it was a pre-daemon artifact that doubled the invalidation surface.
  • One var-system pathexecGit, fetchGitSnapshot, GitPoller, and the per-cwd gitPollers map are deleted. declareGit subscribes to the provider; SourceRegistry takes an optional gitProvider injection point so the daemon shares one instance while tests construct private ones.
  • One launch categoryvar-system.git is removed from LAUNCH_CATEGORIES. Cross-fleet git workload now reports in one byCategory.git bucket in daemon-stats --json.

DoD verification

Check Result
grep -rn 'spawn.*git|exec.*git ' src/ outside provider
\.git/HEAD|\.git/index watcher acquisition sites one (src/daemon/cache/git.ts)
withGitCache / git-cache.ts deleted (not refactored)
GitPoller, execGit, fetchGitSnapshot deleted (not refactored)
var-system.git references
Tests 687 passing (+5 new subscribe-API contract tests)
typecheck / lint / build / protocol-check clean

Known limitation, not in scope for this PR

The DoD epic line "git status invocations per render: exactly 1 per repo" is approached but not strictly met: the provider's cache is still keyed by (repoRoot, optionsKey), so segment + var-system requests with disjoint options still produce separate `git status` invocations within a render. The structural prerequisite — one fleet, one cache, one watcher — is now in place; per-command memoization is a follow-up worth filing if measured workload justifies it. The cross-render and cross-session sharing wins (the dominant case) are already realized.

Test plan

  • pnpm typecheck
  • pnpm lint
  • pnpm test — 687 passing
  • pnpm check:protocol — version 2 matches Rust
  • pnpm build — Node bundle builds
  • Manual: start daemon under multi-session load, confirm daemon-stats --json shows only `byCategory.git` (no `var-system.git`), and git status runs once per repo per render where segment+var-system options align

Pre-kz8.3 the codebase had three independent git-shell-out
implementations (GitService, var-system execGit, var-system execShell)
and two nested caches (withGitCache disk cache + CachedGitService
in-memory cache). Each maintained its own watchers, its own
invalidation policy, and its own launch category. On `git commit` all
three fleets re-fetched in parallel — ≥20 git processes for one repo.

This collapses the stack to:

- `GitDataProvider` (renamed from CachedGitService) — the single
  in-daemon source. Exposes pull (`getInfo`) for segments and a new
  push API (`subscribe`) for MobX-driven var-system consumers. Both
  surfaces share one cache (keyed by repoRoot), one watcher per repo,
  one launch category ("git").

- `GitService.getGitInfo` no longer wraps in `withGitCache` — the
  disk-backed inner cache was an artifact of the pre-daemon
  architecture and is pure duplication once the daemon owns one
  cache. `src/utils/git-cache.ts` deleted.

- `src/var-system/sources.ts` loses execGit, fetchGitSnapshot,
  GitPoller, and the per-cwd gitPollers map. `declareGit` now
  subscribes to the provider; SourceRegistry takes an optional
  gitProvider arg so the daemon can inject the shared instance while
  tests construct a private one. Per-cwd batching in SourceRegistry
  collapses N declareGit calls for the same cwd onto one provider
  subscription.

- `LAUNCH_CATEGORIES` loses "var-system.git"; all git invocations now
  flow through the single "git" category, so daemon-stats reports the
  full cross-fleet git workload in one bucket.

DoD verification: `grep -rn 'spawn.*git\|exec.*git ' src/` outside
provider+launch is empty; `\.git/HEAD\|\.git/index` watcher
acquisition lives at one site (`src/daemon/cache/git.ts`); 687 tests
pass including 5 new subscribe-API contract tests.

Refs: brandon-process-discipline-kz8.3
Copilot AI review requested due to automatic review settings May 16, 2026 16:33

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR consolidates git shell-out + caching + invalidation into a single daemon-side provider (GitDataProvider), removing the legacy disk cache and the var-system’s separate git polling fleet to reduce redundant git process spawns and unify stats under the git launch category.

Changes:

  • Replace CachedGitService with GitDataProvider and add a subscribe() push API for reactive consumers.
  • Remove the pre-daemon disk cache (withGitCache / src/utils/git-cache.ts) and drop var-system.git from launch categories.
  • Rewire var-system git variables to subscribe to the shared provider (with optional injection for daemon vs tests).

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
test/daemon-watchers.test.ts Updates watcher integration tests to use GitDataProvider.
test/daemon-git-cache.test.ts Renames cache tests and adds contract tests for GitDataProvider.subscribe().
src/var-system/sources.ts Removes the var-system’s independent git poller and switches git variables to provider subscriptions (with injectable provider).
src/utils/git-cache.ts Deletes the legacy disk-backed git cache helper.
src/segments/git.ts Removes per-process caching from GitService.getGitInfo() (provider becomes the intended cache layer).
src/proc/launch.ts Removes var-system.git launch category (git work now reports under git).
src/daemon/server.ts Instantiates GitDataProvider in the daemon instead of CachedGitService.
src/daemon/cache/git.ts Renames/refactors cache to GitDataProvider and introduces subscribe() + subscriber refresh on invalidation.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/daemon/cache/git.ts
Comment thread src/daemon/cache/git.ts Outdated
Comment thread src/daemon/cache/git.ts Outdated
Comment thread src/segments/git.ts
…3 review)

Three reactive-path robustness bugs surfaced in Copilot review of #8:

1. subscribe()'s initial-delivery callbacks (the synchronous null path
   for non-repo cwds and the post-fetch path for repo cwds) were
   unguarded — a subscriber throwing here would reject the async IIFE
   and surface as an unhandled rejection. Route all callback delivery
   through one safeInvoke() helper so every notification — initial,
   refresh, error path — gets the same try/catch + dlog treatment.

2. refreshSubscribers() snapshotted the callback set once and then
   iterated, so a subscriber that unsubscribed after the snapshot but
   before the async fetch completed would still receive a delivery —
   including after SourceRegistry.dispose(). Add a membership check at
   call time (`current.callbacks.has(cb)`) so unsubscribe is a hard
   guarantee, not a race.

3. refreshSubscribers() could start parallel async fetches for the
   same repo if invalidations arrived in quick succession (rapid
   commits, rebase). This reintroduced the parallel `git` work that
   kz8.3 is meant to eliminate. Add trailing-edge coalescing via two
   Sets (refreshing / refreshAgain): the first invalidation kicks off
   the loop; subsequent invalidations set the trailing flag; the loop
   re-runs once more before exiting. N rapid invalidations collapse to
   at most two fetches (leading + trailing), never N parallel.

Tests: +5 contract tests covering each scenario (throwing subscriber
in initial delivery, throwing on null delivery, unsubscribe-during-
invalidation, rapid-invalidation coalescing, plus the prior
ordering-suppression case still holds). 691 total passing.

Refs: PR #8 review (brandon-process-discipline-kz8.3)

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Comment thread src/daemon/cache/git.ts
Comment thread src/daemon/cache/git.ts Outdated
Second-pass Copilot review of #8 surfaced one critical correctness
bug and one moderate perf bug, both pointing at the same shape error:
GitDataProvider was deriving the cache key from
findGitRoot(workingDir), but GitService.getGitInfo() internally
switches to projectDir when one is supplied and is itself a git repo.
When the two pointed at different repos, the provider cached repo B's
data under repo A's key — and wired invalidation to repo A's watcher.
Cross-repo contamination + missed invalidations.

The fix lifts the gitDir resolution into a public method on
GitService — resolveEffectiveGitDir(workingDir, projectDir) — that
returns the exact directory the shell-runner will operate in. The
provider then keys its cache, watcher, and subscriber registry off
that one value, single source of truth. inner.getGitInfo is invoked
with the resolved dir (no projectDir), so its internal resolution
collapses to a sync check and never disagrees with the cache key.

The same refactor closes the perf finding: subscribe() now stores the
resolved repoRoot on the entry, and the refresh loop calls a private
getGitInfoForRoot() that skips repoRoot resolution entirely. On every
invalidation the refresh path was previously paying an extra
`git rev-parse --show-toplevel` to compute the cache key — now it
doesn't.

Tests: +3 contract tests covering projectDir-override cache keying,
projectDir-aligned cwd sharing, and refresh-path non-re-resolution.
694 total passing.

Refs: PR #8 review pass 2 (brandon-process-discipline-kz8.3)

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Comment thread src/daemon/cache/git.ts
Comment thread src/daemon/cache/git.ts Outdated
Comment thread src/var-system/sources.ts
Third-pass Copilot review of #8 surfaced three findings:

1. (concurrency) getGitInfoForRoot() didn't coalesce concurrent cache
   misses. Renders build lines in parallel via Promise.all
   (src/powerline.ts), so two line-renders requesting the same git
   data could both observe the cache as cold and spawn duplicate
   `git` work — the very failure mode the daemon-side cache exists to
   eliminate. Added fetchInFlight Map keyed by repoRoot|optionsKey: the
   first miss installs a Promise, concurrent callers await it. The
   promise self-clears via .finally() once it settles.

2. (var-system → daemon coupling) Importing GitDataProvider into
   var-system pulled in dlog (daemon-specific log file writer) and
   the daemon's WatcherRegistry transitively. Var-system has its own
   lifecycle (tests, future library use) where writes to the daemon
   log are surprising. Swapped dlog → debug (the neutral logger in
   src/utils/logger) in both src/daemon/cache/git.ts and
   src/daemon/cache/watchers.ts. The file locations stay — they signal
   the daemon as the primary instance owner — but the *behavior* no
   longer leaks daemon-specific side effects into var-system contexts.

3. (doc nit) Subscribe()'s comment said "initial delivery on a
   microtask after subscribe() returns", but the chain actually
   awaits gitDir resolution + a potential `git status` shell-out. Not
   a microtask. Rewrote the comment to spell out the timing honestly
   and warn consumers off same-tick assumptions.

Tests: +1 contract test for concurrent cache-miss coalescing (two
parallel Promise.all callers see exactly one inner shell-out, same
result reference). Replaced the fragile microtask-count `tick(N)`
helper with `setImmediate` — drains every queued microtask in one
yield regardless of implementation-detail chain depth.

695 tests total, all passing.

Refs: PR #8 review pass 3 (brandon-process-discipline-kz8.3)

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Comment thread src/daemon/cache/git.ts Outdated
Comment thread test/daemon-git-cache.test.ts Outdated
Fourth-pass Copilot review of #8 caught a pre-existing critical bug
plus a test-helper bug, both surfaced by the kz8.3 refactor:

1. (critical, worktree) snapshotMtimes() and watcherTargets() always
   joined repoRoot with ".git/HEAD" / ".git/index". In a git worktree,
   repoRoot/.git is a *file* containing `gitdir: <path>` pointing at
   the worktree's metadata dir under main-repo/.git/worktrees/.
   Pre-fix behavior: fs.watch on the non-existent path failed
   silently, snapshotMtimes returned 0 for both fields → mtimeChanged
   always false → cache entries for worktrees never invalidated.
   Branch switches in a worktree showed stale data until the 30s TTL
   expired. The bug pre-existed kz8.3 (inherited from CachedGitService)
   but was flagged in this review pass and is fixed here.

   Fix: lifted GitService.resolveGitDir from private to public — it
   already handled the .git-as-file case for computeGitInfo's
   `getOngoingOperation` path. The provider now resolves repoRoot →
   gitDir at the watch/snapshot site and joins HEAD/index against
   gitDir directly. For normal repos gitDir = repoRoot/.git, so
   behavior is unchanged; for worktrees gitDir is the metadata dir and
   the watchers point at real files.

2. (moderate, test) The tick() helper in daemon-git-cache.test.ts
   took a `_times` parameter but never looped — every call drained one
   setImmediate yield regardless of the argument. Tests passing
   `tick(10)` happened to pass because one yield is usually enough,
   but the test was lying about its drain count. Fixed to actually
   loop N times.

Tests: +1 contract test ("worktree path resolves HEAD/index from
.git file's gitdir target") asserting the provider consults
resolveGitDir for watch/mtime path derivation. 696 total passing.

Refs: PR #8 review pass 4 (brandon-process-discipline-kz8.3)

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

src/segments/git.ts:283

  • resolveGitDir() uses fs.statSync() / fs.readFileSync() without a try/catch. Now that the daemon’s GitDataProvider calls this during cache fetch + watcher setup (outside any guard), a transient race (e.g. .git removed between existsSync and statSync, or permission errors) can throw and reject getGitInfo() unexpectedly. Consider wrapping the body in a try/catch (like isWorktree) and falling back to dotGit on error so git lookups fail as null rather than by throwing.
  resolveGitDir(workingDir: string): string {
    const dotGit = path.join(workingDir, ".git");
    if (fs.existsSync(dotGit) && fs.statSync(dotGit).isFile()) {
      const content = fs.readFileSync(dotGit, "utf-8");
      const match = content.match(/^gitdir:\s*(.+)$/m);
      if (match?.[1]) {
        return path.resolve(workingDir, match[1].trim());
      }
    }
    return dotGit;

Comment thread src/daemon/cache/watchers.ts
Comment thread src/daemon/cache/git.ts
Fifth-pass Copilot review of #8 caught the observability regression
introduced by the pass-3 dlog→debug swap, plus a defensiveness gap in
the now-public resolveGitDir.

1. (suppressed, moderate) resolveGitDir's fs.statSync / readFileSync
   were unguarded. After lifting it to public (pass 4) the daemon's
   GitDataProvider now calls it during cache fetch + watcher setup —
   outside any guard. An fs race (.git removed between existsSync and
   statSync, permission errors, .git unreadable) would throw and
   reject getGitInfo. Wrapped the resolution branch in a try/catch
   that falls back to the dotGit path on any fs error, matching
   isWorktree's pattern.

2. (moderate, observability regression) Pass 3 swapped dlog → debug
   in GitDataProvider and WatcherRegistry to remove var-system's
   transitive daemon-log dependency. That fixed the layer leak but
   silenced daemon cache/watcher events in daemon.log (debug only
   fires under CC_CANDYBAR_DEBUG). Diagnosing git churn, watcher
   failures, or misbehaving subscribers got harder. Both reviewer
   threads (lines 1-3 of watchers.ts and git.ts) flagged the same
   regression.

   Fix: logger injection. Both GitDataProvider and WatcherRegistry
   now take an optional `logger: (level, message) => void` opt
   matching dlog's signature. Daemon's server.ts constructs both with
   `logger: dlog` so events land in daemon.log at the requested
   level. var-system's default path (no injection) gets the
   debug-routed default — present-but-quiet, never touches daemon log
   files. Best of both: daemon retains persistent observability;
   var-system retains module-load purity.

Tests: 696 still passing. The injection point is structural (no
behavioral change for tests that don't pass a logger).

Refs: PR #8 review pass 5 (brandon-process-discipline-kz8.3)

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

Comment thread src/daemon/cache/git.ts Outdated
Sixth-pass Copilot review of #8: with logger injection in place, the
daemon's dlog now surfaces what was previously silent — including a
predictable warn line on every worktree watcher acquisition. Worktree
gitDirs (under main-repo/.git/worktrees/<name>) don't have a
`refs/heads` subdir, so fs.watch fails with ENOENT and WatcherRegistry
logs "watch failed". For a long-running daemon with many worktrees,
this would become persistent noise in daemon.log that looks like an
operational warning but is in fact expected.

Fix at the source: watcherTargets() now stat-checks the refs/heads
path before including it as a watch target. Present (normal repo) →
included. Missing (worktree) → skipped. HEAD + index file watches
still cover branch and index changes in the worktree case, so the
behavioral envelope is unchanged.

Tests unchanged — 696 passing.

Refs: PR #8 review pass 6 (brandon-process-discipline-kz8.3)

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

Comment thread src/var-system/sources.ts Outdated
Seventh-pass Copilot review of #8 — pure doc nit. The declareGit
comment claimed the provider watched '(.git/HEAD, .git/index,
.git/refs/heads)' as if those literal paths always existed. After the
worktree-aware fix in pass 4 + the conditional refs/heads in pass 6,
the actual targets are HEAD/index under the *resolved gitDir* plus
refs/heads only when present. Rewrote the comment to match reality
and point readers at the canonical site (src/daemon/cache/git.ts:
watcherTargets) where the resolution logic lives.

Refs: PR #8 review pass 7 (brandon-process-discipline-kz8.3)
@brandon-fryslie brandon-fryslie requested a review from Copilot May 16, 2026 17:42

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated no new comments.

@brandon-fryslie brandon-fryslie merged commit cf6bc3b into main May 16, 2026
10 checks passed
brandon-fryslie added a commit that referenced this pull request Jun 10, 2026
… absence policy (2l6) (#96)

Sheriff findings #2+#8: git/cache providers swallowed ~10 failure paths
(stash failure rendered as 0, repoName failure as basename, failed status
as a clean fallback branch) and projectGitInfo coerced absence to ''/0
while MetricsPayload preserved it — two opposite absence rules in one file,
with render-payload's "providers log their own rejections" comment false
for exactly these two providers.

[LAW:types-are-the-program] New Outcome<T> = ok | absent | failed(reason)
(src/utils/outcome.ts). execGitAsync returns the LaunchResult instead of
flattening it to a throw; one classifier maps it per command — a non-zero
exit that is the domain answering "there is none" (no upstream, no tags,
no remote) is absent; transport failures are failed. GitInfo's on-demand
fields are Outcome-typed; getGitInfo returns Outcome<GitInfo>; the
basename repoName survives only as the explicit no-remote policy, never
as an error fallback. readTail/cacheExpiresAt distinguish ENOENT (absent)
from real read errors (failed).

[LAW:effects-at-boundaries][LAW:single-enforcer] Failure flows as a value
to each consumption surface's edge and is logged exactly once there:
buildRenderPayload (pull path, via deps.log=dlog) and GitDataProvider's
subscribe delivery fold. projectGitInfo is now a pure fold returning
{git?, failures[]}; absent and failed both project as MISSING payload
fields into applyInput's default + last_error chain.

[LAW:one-type-per-behavior] GitPayload adopts MetricsPayload's
preserve-absence policy (all fields optional); the coerce-to-''/0 shim is
gone, so "no stashes" and "stash count unknown" are no longer the same
rendered 0.

Verified live against an isolated daemon: healthy repo renders with zero
warnings (absent logs nothing), a corrupt .git produces exactly one warn
line with the real git error while the bar still renders.

Ticket: brandon-git-segment-2l6 (epic brandon-law-remediation-58c)

Co-authored-by: elton-prawn <264370887+elton-prawn@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants