
v0.41.32.0 fix(staleness): commit-relative sync staleness (supersedes #1623) #1656

Merged
garrytan merged 5 commits into master from garrytan/supersede-pr-1623
May 30, 2026

@garrytan

Owner

Summary

Supersedes #1623 (from garrytan-agents), re-implemented in the base repo with a real correctness fix and the v0.41.27.0 trust boundary preserved.

A quiet, fully-caught-up source no longer false-alarms as SEVERELY STALE in gbrain doctor / sources status. Staleness now means "is there committed content the sync hasn't ingested?" — not raw wall-clock since the last sync ran.

The bug: checkSyncFreshness's v0.41.27.0 short-circuit required a fully clean working tree. git status --porcelain counts untracked dirs (?? companies/, ?? media/) as dirty, so a quiet repo with stray untracked folders failed the gate, fell through to wall-clock, and escalated to SEVERE — even though the sync had everything the repo committed.
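
The failure mode above comes down to how porcelain output is read. Below is a minimal sketch, assuming a hypothetical `isCleanPorcelain` helper (not the actual git-head.ts code) that inspects porcelain text in-process. The PR itself passes `--untracked-files=no` to git rather than filtering lines, so treat this only as an illustration of why `??` entries tripped the gate.

```typescript
// Hypothetical illustration only; the real gate lives in git-head.ts.
type CleanMode = "strict" | "ignore-untracked";

// Returns true when the porcelain output reports no changes that matter
// under the given mode. Untracked entries start with "??".
function isCleanPorcelain(porcelain: string, mode: CleanMode): boolean {
  const entries = porcelain.split("\n").filter((line) => line.trim() !== "");
  const relevant =
    mode === "ignore-untracked"
      ? entries.filter((line) => !line.startsWith("??"))
      : entries;
  return relevant.length === 0;
}

// A quiet repo whose only "changes" are stray untracked folders.
const quietRepo = "?? companies/\n?? media/\n";

const strictResult = isCleanPorcelain(quietRepo, "strict"); // false: gate fails
const relaxedResult = isCleanPorcelain(quietRepo, "ignore-untracked"); // true
```

A tracked modification (for example ` M src/a.ts`) still counts as dirty in both modes, which matches the intent of ignoring only untracked files.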

The fix (by layer):

  • git-head.ts: requireCleanWorkingTree gains an 'ignore-untracked' mode (git status --porcelain --untracked-files=no). Untracked files no longer defeat the short-circuit. Sync's incremental path keys off the commit diff and never imports untracked files, so doctor now agrees with sync. (One-line headline fix.)
  • source-health.ts: newestCommitMs (HEAD committer time, git log -1 --format=%ct) + pure lagFromContentMs comparator. computeAllSourceMetrics({probeContent}) routes LOCAL → live commit hash (isSourceUnchangedSinceSync, robust against HEAD moving to an old-dated commit) and REMOTE → stored column. Dead isSourceStale removed.
  • migration v108: sources.newest_content_at + fresh-schema blobs (schema.sql, pglite-schema.ts, regenerated schema-embedded.ts).
  • sync.ts: writeSyncAnchor stamps newest_content_at atomically with last_commit/last_sync_at; buildSyncStatusReport (the remote get_status_snapshot MCP op) reads the column, with no git subprocess on a DB-supplied local_path.
  • doctor.ts: checkSyncFreshness short-circuit ignores untracked files; the remote (non-localOnly) path reads the column; the < 0 clock-skew check stays on raw wall-clock.
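
The two git invocations named in the list can be written down as argument vectors, with the `%ct` output (Unix seconds) converted to milliseconds. This is a sketch under assumptions: the constant and function names are hypothetical, and the real newestCommitMs may handle errors differently.

```typescript
// Argument vectors for the two git calls named above (passed to git as argv,
// so no shell quoting is involved).
const STATUS_IGNORE_UNTRACKED_ARGS = ["status", "--porcelain", "--untracked-files=no"];
const HEAD_COMMITTER_TIME_ARGS = ["log", "-1", "--format=%ct"];

// Hypothetical parser: %ct prints the committer date as Unix seconds.
// Returns milliseconds, or null when the output is not a plain integer
// (for example, an empty repo or a failed command).
function parseCommitterTimeMs(stdout: string): number | null {
  const text = stdout.trim();
  if (!/^\d+$/.test(text)) return null;
  return Number(text) * 1000;
}

const headMs = parseCommitterTimeMs("1780131840\n"); // 1780131840000
const noHead = parseCommitterTimeMs(""); // null
```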

Trust boundary: local consumers (which have the checkouts) probe live git and catch post-sync commits — the authoritative signal. Remote-callable surfaces (gbrain remote doctor, federation_health, get_status_snapshot) read the durable column, never shelling out to a path an OAuth client could influence (v0.41.27.0 codex P0-1 preserved).
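
One way to picture the trust boundary is a single routing function with an injected git probe. The sketch below is hypothetical: the row shape, the `isCaughtUp` name, and the remote-side comparison are assumptions for illustration, not the signatures in source-health.ts.

```typescript
// Hypothetical shapes; field names mirror the columns mentioned in the PR,
// but the real row type and function signatures may differ.
interface SourceRow {
  localPath: string;
  lastCommit: string | null;
  lastSyncAtMs: number | null;
  newestContentAtMs: number | null;
}

// Injected so the sketch stays pure: returns the HEAD hash, or null if unknown.
type HeadProbe = (localPath: string) => string | null;

// true = caught up, false = unsynced content, null = cannot tell.
function isCaughtUp(
  row: SourceRow,
  opts: { probeContent: boolean; probeHead: HeadProbe },
): boolean | null {
  if (opts.probeContent) {
    // Local consumer: it owns the checkout, so probing live git is safe.
    if (row.lastCommit === null) return null;
    const head = opts.probeHead(row.localPath);
    return head === null ? null : head === row.lastCommit;
  }
  // Remote-callable surface: read stored columns only and never touch
  // localPath. The comparison below is an assumption for illustration.
  if (row.newestContentAtMs === null || row.lastSyncAtMs === null) return null;
  return row.newestContentAtMs <= row.lastSyncAtMs;
}
```

Injecting the probe makes the boundary testable: a test can pass a probe that counts its calls and check that the count stays at zero on the remote path.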

Outside-voice (codex) caught: the original timestamp comparison (newest content <= last sync) false-reports "caught up" when HEAD moves to an older-dated commit (rebase preserving dates, branch rewind). Switching to the commit hash fixes it and drops a fragile porcelain-mtime parse.
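
The rewind case can be reproduced with two small comparators. This is a sketch with hypothetical types and function names, intended only to show why the date comparison misreports and the hash comparison does not.

```typescript
// Hypothetical shapes for illustration; not the project's actual types.
interface Head {
  hash: string;
  committerTimeMs: number;
}
interface SyncAnchor {
  lastCommit: string;
  lastSyncAtMs: number;
}

// Original approach: "newest content <= last sync" means caught up.
function caughtUpByTimestamp(head: Head, anchor: SyncAnchor): boolean {
  return head.committerTimeMs <= anchor.lastSyncAtMs;
}

// Replacement: caught up only if HEAD is the exact commit the sync recorded.
function caughtUpByHash(head: Head, anchor: SyncAnchor): boolean {
  return head.hash === anchor.lastCommit;
}

// Sync ingested commit "bbb" at t=5000. The branch is then rewound to "aaa",
// a commit dated t=1000, which the sync anchor does not point at.
const anchor: SyncAnchor = { lastCommit: "bbb", lastSyncAtMs: 5000 };
const rewoundHead: Head = { hash: "aaa", committerTimeMs: 1000 };

const byTimestamp = caughtUpByTimestamp(rewoundHead, anchor); // true (wrong)
const byHash = caughtUpByHash(rewoundHead, anchor); // false (stale)
```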

CI tooling (this PR also adds)

  • scripts/ship-remote-tests.sh — offload the suite to GitHub's on-demand runners and block on gh run watch --exit-status. Built because the local machine is regularly saturated by sibling Conductor agents (load avg 120 on 16 cores → PGLite OOM/crawl).
  • workflow_dispatch on test.yml so the suite can be triggered from any branch.

Behavior matrix

| Scenario | Old | New |
| --- | --- | --- |
| Quiet repo, caught up | grows → SEVERE | 0 → fresh |
| New unsynced commits | wall-clock | wall-clock → stale |
| HEAD → older-dated commit | could read "caught up" | stale (hash, not date) |
| Non-git / never synced | wall-clock | wall-clock (unchanged) |
| Future last_sync_at | warns | warns (unchanged) |
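
The "New" column of the matrix can be approximated by a small pure function. The sketch below is hypothetical: the names and shapes are assumptions, the never-synced case is left out, and the thresholds that turn a lag into "stale" or "SEVERE" are not modeled because the PR does not state them.

```typescript
// Hypothetical sketch of the "New" column; names and shapes are assumptions.
interface FreshnessInput {
  nowMs: number;
  lastSyncAtMs: number;
  // true = sync has every commit, false = unsynced commits, null = unknown
  // (for example a non-git source).
  caughtUp: boolean | null;
}

interface FreshnessResult {
  lagMs: number;
  futureLastSync: boolean;
}

function freshnessSketch(input: FreshnessInput): FreshnessResult {
  const wallClockMs = input.nowMs - input.lastSyncAtMs;
  // Clock-skew check stays on raw wall-clock, ahead of any short-circuit.
  if (wallClockMs < 0) return { lagMs: wallClockMs, futureLastSync: true };
  // Caught up: nothing committed is waiting, so the lag is zero.
  if (input.caughtUp === true) return { lagMs: 0, futureLastSync: false };
  // Unsynced commits or unknown state: fall back to wall-clock.
  return { lagMs: wallClockMs, futureLastSync: false };
}
```

Under this sketch, a quiet caught-up repo reports a lag of zero no matter how long ago the last sync ran, while a repo with unsynced commits keeps accruing wall-clock lag.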

Test Coverage

  • test/source-health.test.ts: newestCommitMs, lagFromContentMs matrix, commit-hash caught-up incl. old-dated-commit regression, probeContent local vs remote/column (27 cases).
  • test/doctor.test.ts: T1 untracked-folders headline bug → unchanged; T2/T2b/T2c remote path reads column with zero git subprocess (trust-boundary regression).
  • test/sync-all-parallel.test.ts: buildSyncStatusReport column-path staleness.
  • Migration v108 round-trip + schema-bootstrap-coverage parity (both engines).

Every touched file passes in isolation; typecheck clean; bun run verify 29/29. Full local parallel suite was gated by host memory pressure (sibling agents) — CI runs it clean here.

Plan Completion

All plan items DONE: A1 (lazy) · A2 (durable column) · A3 (local-live + remote-column) · CM1 (HEAD-hash, drop mtime parsing) · CM2 (scope out checkCycleFreshness) · C1 (shared comparator) · TODO-1 (filed) · TODO-2 (dead isSourceStale removed).

Documentation

CLAUDE.md annotations (git-head.ts, source-health.ts, doctor.ts) + bun run build:llms regenerated. CHANGELOG v0.41.32.0 with "To take advantage" block. TODOS.md: probe-phase follow-up filed.

🤖 Generated with Claude Code

garrytan and others added 3 commits May 30, 2026 09:04
…ble column remote)

Quiet, fully-caught-up repos no longer false-alarm as SEVERELY STALE in
gbrain doctor / sources status. Staleness now means "is there committed
content the sync hasn't ingested?" not raw wall-clock since the last sync.

- git-head.ts: requireCleanWorkingTree gains 'ignore-untracked' mode (git
  status --porcelain --untracked-files=no). Untracked dirs no longer defeat
  the freshness short-circuit — sync's incremental path keys off the commit
  diff and never imports untracked files, so doctor agrees with sync.
- source-health.ts: newestCommitMs (HEAD committer time) + pure
  lagFromContentMs comparator; computeAllSourceMetrics {probeContent} routes
  local→live commit-hash, remote→stored column. Dead isSourceStale removed.
- migration v108 sources.newest_content_at + fresh-schema blobs.
- sync.ts: writeSyncAnchor stamps newest_content_at atomically with
  last_commit/last_sync_at; buildSyncStatusReport (remote get_status_snapshot)
  reads the column — no git subprocess (v0.41.27.0 trust boundary intact).
- doctor.ts: checkSyncFreshness short-circuit ignores untracked; remote path
  reads the column; clock-skew check stays on raw wall-clock.

Local consumers probe live git (catch HEAD moving to an old-dated commit, which
a timestamp compare would miss); remote consumers read the durable column so a
remote-callable endpoint never shells out to a DB-supplied local_path.

Supersedes #1623 (re-implemented in base repo with the trust boundary preserved).

Co-Authored-By: t <t@t>
scripts/ship-remote-tests.sh pushes the branch, dispatches the test workflow,
and blocks on `gh run watch --exit-status` — a local caller (human or agent)
awaits the GitHub run exactly like a local `bun run test`, with a real pass/fail
exit code. Frees a load-saturated local machine (many Conductor agents running
their own bun-test suites at once → load avg 120 on 16 cores → PGLite OOM/crawl).

test.yml gains workflow_dispatch so the suite can be triggered from any branch.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
garrytan added 2 commits May 30, 2026 09:16
…pr-1623

# Conflicts:
#	CHANGELOG.md
#	TODOS.md
#	VERSION
#	package.json
…pr-1623

# Conflicts:
#	CHANGELOG.md
#	CLAUDE.md
#	VERSION
#	llms-full.txt
#	package.json
#	src/core/migrate.ts
@garrytan garrytan merged commit f79c130 into master May 30, 2026
21 checks passed
mgunnin added a commit to mgunnin/gbrain that referenced this pull request Jun 3, 2026
* upstream/master:
  v0.41.36.0 feat(mcp): publish agent skills (list_skills / get_skill) for thin clients (garrytan#1661)
  v0.41.35.0 feat(guardrails): vendor-neutral content guardrail seams (supersedes garrytan#1652) (garrytan#1660)
  v0.41.34.0 feat(search): retrieval cathedral — max-pool + title + alias + evidence (garrytan#1657)
  v0.41.33.0 feat(search): intent-aware adaptive return-sizing + agent-facing query param (garrytan#1640)
  v0.41.32.0 fix(staleness): commit-relative sync staleness (supersedes garrytan#1623) (garrytan#1656)
  v0.41.31.0 feat(embed): delta-aware sync --all cost gate + real stale-embedding semantics (garrytan#1632)
  v0.41.30.0 fix(brainstorm/lsd): --save writes the advertised .md file via canonical ingestion path (garrytan#1655)

# Conflicts:
#	src/core/operations.ts