v0.41.6.0 feat(ci): CI test speedup — 23min → ~9min via matrix 4→6 + weight-aware sharding + auto SHA cache + parallel verify #1444
Merged
Fans out the 21 pre-test grep guards via & + wait, captures per-check exit codes in a tempdir, aggregates failures with named check + log tail to stderr on miss. Wallclock 27s sequential → 13s parallel locally (2x). Bigger CI win is shard 1 deload (workflow restructure in a later commit). Pinned by test/scripts/run-verify-parallel.test.ts (6 cases: CLI contract + synthetic dispatcher failure-surfacing).
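The fan-out and aggregation pattern described above can be sketched in TypeScript. This is only an analogue under stated assumptions: the real dispatcher is a shell script using `&` + `wait`, and the `Check` and `Failure` shapes, the `runChecks` name, and the 20-line default tail are hypothetical.

```typescript
// Hypothetical shapes for illustration; the real dispatcher is a shell script.
interface Check {
  name: string;
  run: () => Promise<{ exitCode: number; log: string }>;
}

interface Failure {
  name: string;
  exitCode: number;
  logTail: string;
}

// Start every check at once, wait for all of them, then report only the failures.
async function runChecks(checks: Check[], tailLines = 20): Promise<Failure[]> {
  const results = await Promise.all(
    checks.map(async (check) => {
      try {
        const { exitCode, log } = await check.run();
        return { name: check.name, exitCode, log };
      } catch (err) {
        // A check that throws counts as a failed check, not a dispatcher crash.
        return { name: check.name, exitCode: 1, log: String(err) };
      }
    })
  );
  return results
    .filter((r) => r.exitCode !== 0)
    .map((r) => ({
      name: r.name,
      exitCode: r.exitCode,
      // Keep only the last lines of the log so the failure report stays short.
      logTail: r.log.split("\n").slice(-tailLines).join("\n"),
    }));
}
```

Because every check starts before any is awaited, wallclock approaches the slowest single check rather than the sum, provided the checks do not contend for the same resource.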
scripts/sharding.ts (NEW): pure TypeScript LPT bin-packer. Sort weights desc, assign each file to the shard with current minimum total. Worst-case makespan within 4/3 of optimal, O(n log n). Missing weights fall back to corpus median (not 0). New test file → ships immediately without regenerating weights. Pinned by test/scripts/sharding.test.ts (23 cases).

scripts/mine-shard-weights.ts (NEW): scrapes per-file timing from gh run view --log via timestamp delta between ##[group]test/foo.test.ts: headers within a shard. Three input modes: --run <ID>, --from-file <PATH>, stdin. Stable JSON output (sorted keys). Initial weights mined from run 26398061007. Pinned by test/scripts/mine-shard-weights.test.ts (15 cases).

scripts/ci-cache-hash.sh (NEW): deterministic 16-char sha256 over git ls-files -s minus deny-list (CHANGELOG/TODOS/README/LICENSE/docs/**/*.md). CLAUDE.md, AGENTS.md, skills/**/* deliberately INCLUDED (8+ test files read them; deny-listing would create false-pass holes). ~40ms on 1891 files. Pinned by test/scripts/ci-cache-hash.test.ts (24 cases: 8 CRITICAL false-pass guards + 7 SAFE deny-list invariants + 9 edge cases).

scripts/test-weights.json (NEW): 712 weights. Total 3306s observed runtime; median 30ms; max 6 min outlier.
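As a rough illustration of the LPT (longest processing time first) rule with the median fallback, here is a minimal sketch. It is not the contents of scripts/sharding.ts: the `assignShards` and `median` names, the `Map` input, and the path tie-break are assumptions made for the example.

```typescript
// Median of the known weights; used as the fallback for files with no weight.
function median(values: number[]): number {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Hypothetical signature: returns one list of file paths per shard.
function assignShards(files: string[], weights: Map<string, number>, shardCount: number): string[][] {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new Error("shardCount must be a positive integer");
  }
  const fallback = median(Array.from(weights.values()));
  const weighted = files.map((file) => ({ file, weight: weights.get(file) ?? fallback }));
  // Heaviest first; ties broken by path (an assumption) so the output is deterministic.
  weighted.sort(
    (a, b) => b.weight - a.weight || (a.file < b.file ? -1 : a.file > b.file ? 1 : 0)
  );
  const shards: string[][] = Array.from({ length: shardCount }, () => []);
  const totals: number[] = new Array(shardCount).fill(0);
  for (const { file, weight } of weighted) {
    // Linear scan for the lightest shard; cheap for a handful of shards.
    let lightest = 0;
    for (let i = 1; i < shardCount; i++) {
      if (totals[i] < totals[lightest]) lightest = i;
    }
    shards[lightest].push(file);
    totals[lightest] += weight;
  }
  return shards;
}
```

With weights a=5, b=4, c=3, d=3, e=2 and two shards, the sketch yields [a, d] (total 8) and [b, c, e] (total 9). A file absent from the weight map is treated as a median-weight file, so it is placed among the mid-weight files instead of being packed last as if it were free.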
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
garrytan added a commit that referenced this pull request on May 25, 2026
Master shipped v0.41.6.0 (CI speedup: 23min → ~9min via matrix 4→6 + weight-aware sharding + auto SHA cache + parallel verify, #1444). Master now holds the v0.41.6.0 slot that our branch claimed before the v0.41.9.0 retarget. Resolved VERSION + package.json + CHANGELOG conflicts. Our v0.41.9.0 remains correct: it deliberately skipped past master's allocator to avoid collision.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mgunnin added a commit to mgunnin/gbrain that referenced this pull request on May 28, 2026
* upstream/master:
  v0.41.10.1 fix-wave: dream.* config + batch retry + extract_atoms idempotency + ze-switch env-gate (garrytan#1445)
  v0.41.10.0 feat: orphan reduction via --by-mention + UTF-16 surrogate-pair fix (garrytan#1442)
  v0.41.9.0 — UX/reliability fix wave (5 defects from production report) (garrytan#1440)
  v0.41.8.0 fix(pglite): search/query/get exit cleanly + garrytan#1340 hint + garrytan#1342 breadcrumbs (garrytan#1405)
  v0.41.7.0 feat: compact list-format resolver + 300-skill scaling tutorial (garrytan#1407)
  v0.41.6.0 feat(ci): CI test speedup — 23min → ~9min via matrix 4→6 + weight-aware sharding + auto SHA cache + parallel verify (garrytan#1444)
  v0.41.5.0 fix-wave: warm-narwhal — 6 community PRs + E2E reliability (garrytan#1374)

# Conflicts:
#	src/core/ai/recipes/openai.ts
garrytan-agents pushed a commit to garrytan-agents/gbrain that referenced this pull request on Jun 13, 2026
…weight-aware sharding + auto SHA cache + parallel verify (garrytan#1444)

* feat(ci): scripts/run-verify-parallel.sh — parallel verify dispatcher
* feat(ci): weight-aware LPT bin-packer + auto SHA cache hash
* chore: bump version and changelog (v0.41.6.0)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Summary
CI wallclock on every PR drops from ~23 min to ~9-10 min, with ≤2 min on cache-hit PRs. Five orthogonal levers, one cathedral PR with atomic bisect-friendly commits.
Performance levers
* test.yml jobs: verify + serial-tests extracted from shard 1 into their own runners. Shard 1 stops carrying ~3 min of verify + serial overhead on top of its matrix work.
* scripts/sharding.ts replaces the FNV-1a path-hash partition. Real-weight 6-shard projection: every shard estimated at 534s = 8.9 min wallclock (compare to the current p100 of 23-26 min on the unlucky shard).
* scripts/mine-shard-weights.ts scrapes per-file wallclock from gh run view --log (timestamp delta between ##[group]test/foo.test.ts: headers within a shard). Free, real-world data, methodologically right (measures CI shard runtime, not isolated cold-start).
* scripts/ci-cache-hash.sh produces a 16-char sha256 over every git-tracked file EXCEPT the deny-list (CHANGELOG, TODOS, README, LICENSE, docs/*.md/*.txt).
* actions/cache/restore@v4.2.3 in lookup-only mode probes the cache key first; on hit, every gated job skips and the test-status aggregator goes green. Cache write is post-all-pass (if: success() && ...), so it never blesses a bad state.
Plan + reviews
Plan + 11 captured decisions at ~/.claude/plans/system-instruction-you-are-working-graceful-platypus.md. Eng review (cleared) + Codex outside-voice review produced four material plan changes baked into this ship: (a) verified e2e.yml is 3-5 min (NOT the critical path), confirming test.yml targeting is right; (b) corrected deny-list to keep CLAUDE.md and AGENTS.md IN the hash (8+ test files read them, deny-listing would create false-pass holes); (c) replaced original draft's isolated per-file profiling with log-mining; (d) added the job restructure missing from original plan.
Test Coverage
8 CRITICAL false-pass guards in test/scripts/ci-cache-hash.test.ts. 7 SAFE deny-list invariants (CHANGELOG, README, TODOS, LICENSE, docs/*.md, docs/sub/*.md, docs/*.txt → SAME hash). 9 edge cases (symlinks, rename detection, untracked-file-excluded, new-file-type-discovery defaults to include, deny-list typo guard, locale-stable sort, determinism, usage errors).
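The difference between a SAFE deny-list invariant and a false-pass guard can be sketched as a pure function that builds the text a digest would be computed over. This is an illustration only: the real script is shell, the `DENY_LIST` patterns (including the `.md` file names) and the `cacheKeyInput` name are assumptions, and the sha256 step is left out. The only external fact relied on is that `git ls-files -s` prints `<mode> <object> <stage>`, a tab, then the path.

```typescript
// Assumed deny-list; the real script's exact file names and globs may differ.
const DENY_LIST: RegExp[] = [
  /^CHANGELOG\.md$/,
  /^TODOS\.md$/,
  /^README\.md$/,
  /^LICENSE$/,
  /^docs\/.*\.(md|txt)$/,
];

interface IndexEntry {
  mode: string;
  blob: string;
  path: string;
}

// Parse `git ls-files -s` output: "<mode> <object> <stage>\t<path>" per line.
function parseLsFiles(output: string): IndexEntry[] {
  const entries: IndexEntry[] = [];
  for (const line of output.split("\n")) {
    if (line === "") continue;
    const tab = line.indexOf("\t");
    if (tab < 0) throw new Error(`unparseable ls-files line: ${line}`);
    const [mode, blob] = line.slice(0, tab).split(" ");
    entries.push({ mode, blob, path: line.slice(tab + 1) });
  }
  return entries;
}

// Canonical text to feed to the digest: deny-listed paths dropped, rest sorted by path.
function cacheKeyInput(lsFilesOutput: string): string {
  return parseLsFiles(lsFilesOutput)
    .filter((e) => !DENY_LIST.some((re) => re.test(e.path)))
    // Code-unit comparison rather than localeCompare, so the order is locale-stable.
    .sort((a, b) => (a.path < b.path ? -1 : a.path > b.path ? 1 : 0))
    .map((e) => `${e.mode} ${e.blob} ${e.path}`)
    .join("\n");
}
```

Editing README.md changes its blob id but not the key input, so the cache still hits. Editing CLAUDE.md changes the key input, so the cache misses and the tests that read that file run again.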
LPT bin-packer: 23 cases (happy path, fallback semantics, full coverage, determinism, balance ratio ≤1.5 on synthetic Zipf corpus, N=1 trivial, malformed weights). Verify dispatcher: 6 cases. Mine-shard-weights: 15 cases. Extended test-shard.slow.test.ts with LPT balance + slow-file inclusion regression.
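For the mine-shard-weights cases, the timestamp-delta idea can be sketched as below. Several things here are assumptions rather than facts from the PR: that each log line carries an ISO-8601 UTC timestamp directly before the `##[group]` marker, that weights are in milliseconds, that the last file in a shard (which has no following header) is skipped, and the `mineWeights` name itself.

```typescript
// Mine per-file durations from ONE shard's log text (assumed line format, see above).
function mineWeights(shardLog: string): Record<string, number> {
  // Captures: timestamp to the second, optional fractional digits, file path.
  const header = /(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z\s+##\[group\](\S+):\s*$/;
  const marks: { file: string; atMs: number }[] = [];
  for (const line of shardLog.split("\n")) {
    const m = header.exec(line);
    if (!m) continue;
    // Add the fraction separately so sub-millisecond digits never reach Date.parse.
    const fracMs = m[2] ? Number(`0.${m[2]}`) * 1000 : 0;
    marks.push({ file: m[3], atMs: Date.parse(`${m[1]}Z`) + fracMs });
  }
  const weights: Record<string, number> = {};
  // A file's weight is the gap to the next header; the last file has no gap and is skipped.
  for (let i = 0; i + 1 < marks.length; i++) {
    weights[marks[i].file] = Math.round(marks[i + 1].atMs - marks[i].atMs);
  }
  // Rebuild with sorted keys so serialized output is stable across runs.
  const sorted: Record<string, number> = {};
  for (const key of Object.keys(weights).sort()) sorted[key] = weights[key];
  return sorted;
}
```

Measuring the gap between headers captures whatever the file costs inside a real shard, including any setup it shares with its neighbours, which is the property the PR describes as measuring CI shard runtime rather than isolated cold-start.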
Tests: 120/120 green across test/scripts/. Full local fast loop: 10,195 unit pass + 475 serial pass. Verify chain: 21/21 parallel checks pass in 13s (was 27s sequential).
Pre-Landing Review
Eng review CLEARED via /plan-eng-review. Codex outside-voice ran (4 findings produced material plan changes; rest considered and noted). Plan document carries the full ## GSTACK REVIEW REPORT block.
Eval Results
No prompt-related files changed — evals skipped.
Plan Completion
All 8 implementation tasks DONE per ~/.claude/plans/system-instruction-you-are-working-graceful-platypus.md.
Test plan
* bun run verify (parallel): 21/21 green in 13s
* bun test test/scripts/: 120/120 green
* bun run test full fast loop: 10,195 unit pass + 475 serial pass

🤖 Generated with Claude Code