chore: refresh benchmarks for v1.18.2#851

Merged
jdx merged 1 commit into main from bench-refresh on Jun 9, 2026

@jdx jdx commented Jun 9, 2026

Owner

🤖 Refreshed benchmarks

benchmarks/results.json was pinned to aube 1.18.0; the workspace is now 1.18.2. Re-ran mise run bench:bump on the hermetic Verdaccio registry (500mbit / 50ms per the mise task) and regenerated benchmarks/results.json plus the README BENCH_RATIOS block. The benchmark matrix pins aube's GVS mode via npm_config_enable_global_virtual_store=true|false (the auto-synthesized env alias for the enableGlobalVirtualStore setting), so GitHub Actions' inherited CI=true environment does not change whether aube runs with GVS enabled or disabled.
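
As a rough illustration of the pinning described above, the sketch below derives the env alias from the setting name and builds the environment for one benchmark cell. The camelCase-to-snake_case rule and the helper names `env_alias` and `bench_env` are assumptions for illustration, inferred only from the single alias quoted above; they are not aube's actual implementation.

```python
import re


def env_alias(setting: str) -> str:
    """Map a camelCase setting name to an npm_config_* env variable name.

    Assumption: the alias is the snake_case form of the setting with an
    npm_config_ prefix, matching the one example in the PR description.
    """
    snake = re.sub(r"(?<!^)(?=[A-Z])", "_", setting).lower()
    return f"npm_config_{snake}"


def bench_env(base: dict, gvs: bool) -> dict:
    """Return a copy of the base environment with the GVS mode pinned."""
    env = dict(base)
    env[env_alias("enableGlobalVirtualStore")] = "true" if gvs else "false"
    return env


# CI=true is inherited from GitHub Actions and left untouched; the explicit
# variable is what selects the GVS mode for each benchmark cell.
ci_env = {"CI": "true"}
for gvs in (True, False):
    print(bench_env(ci_env, gvs))
```

Because the variable is always set explicitly, each cell of the matrix runs in a known GVS mode regardless of what the runner's environment would otherwise imply.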

Benchmark changes

Versions:

  • aube: 1.18.0 -> 1.18.2

Public ratios: warm installs vs Bun 7x -> 7x; warm installs vs pnpm 9x -> 6x; repeat test vs Bun 5x -> 5x; repeat test vs pnpm 37x -> 29x.

| Benchmark | aube | bun | deno | pnpm | npm | yarn |
| --- | --- | --- | --- | --- | --- | --- |
| Fresh install (warm cache) | 272ms -> 383ms (+41%) | 2003ms -> 2728ms (+36%) | 1327ms -> 1329ms (+0%) | 2341ms -> 2364ms (+1%) | 7141ms -> 7196ms (+1%) | 8856ms -> 8382ms (-5%) |
| Fresh install (cold cache) | 7950ms -> 8132ms (+2%) | 5777ms -> 5747ms (-1%) | 8153ms -> 8815ms (+8%) | 15870ms -> 23602ms (+49%) | 9507ms -> 31075ms (+227%) | 13192ms -> 18714ms (+42%) |
| npm install && npm run test | 9ms -> 13ms (+44%) | 41ms -> 69ms (+68%) | 84ms -> 83ms (-1%) | 335ms -> 375ms (+12%) | 745ms -> 1005ms (+35%) | 1175ms -> 1595ms (+36%) |
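
The percentage deltas and the public ratios can be re-derived from the rounded millisecond values in the table. A minimal sketch, assuming each ratio is the other manager's mean divided by aube's mean, rounded to the nearest integer. The helper names are hypothetical, and the workflow itself presumably works from the unrounded means in benchmarks/results.json.

```python
def pct_change(old_ms: float, new_ms: float) -> int:
    """Relative change from old to new, as a whole-number percentage."""
    return round((new_ms - old_ms) / old_ms * 100)


def speedup(aube_ms: float, other_ms: float) -> int:
    """How many times faster aube is than another manager (rounded)."""
    return round(other_ms / aube_ms)


# (old, new) means in milliseconds, copied from the table above.
warm = {"aube": (272, 383), "bun": (2003, 2728), "pnpm": (2341, 2364)}
repeat = {"aube": (9, 13), "bun": (41, 69), "pnpm": (335, 375)}

for label, cells in (("warm installs", warm), ("repeat test", repeat)):
    aube_old, aube_new = cells["aube"]
    print(f"{label}: aube {pct_change(aube_old, aube_new):+d}%")
    for rival in ("bun", "pnpm"):
        old, new = cells[rival]
        print(f"  vs {rival}: {speedup(aube_old, old)}x -> {speedup(aube_new, new)}x")
```

Under these assumptions the pnpm ratios drop mainly because aube's own times rose while pnpm's warm and repeat times moved comparatively little.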

Review the numbers before merging — if anything looks wildly off vs. the previous release, investigate before landing. Hermetic proxy jitter or an npmjs uplink hiccup can occasionally skew results.

Once merged to main, the updated bench results flow into the next release-plz-pr run automatically.


Generated by the bench-refresh workflow.


Note

Low Risk
Documentation and benchmark artifact updates only; no runtime or install logic changes.

Overview
Re-runs the hermetic benchmark suite for aube 1.18.2 and commits the refreshed benchmarks/results.json (timestamp, pinned versions.aube, and timing stats for warm/cold install and repeat install && test scenarios).

Updates the README BENCH_RATIOS marketing copy to match the new data: warm install vs pnpm 9x → 6x, repeat test vs pnpm 37x → 29x (vs Bun ratios unchanged at 7x and 5x). Absolute aube times are somewhat higher than the prior pin (e.g. warm 272ms → 383ms); cold-cache numbers for other managers also shifted, which the workflow flags for human review before merge.

Reviewed by Cursor Bugbot for commit 20460a7. Bugbot is set up for automated code reviews on this repo. Configure here.

Summary by CodeRabbit

  • Documentation
    • Updated benchmark results with new performance measurements. Warm install and repeat-command speedup metrics have been refreshed to reflect current performance data.

coderabbitai Bot commented Jun 9, 2026

Walkthrough

Benchmark results are refreshed with new data collected on 2026-06-09, aube toolchain updated to v1.18.2, and all warm/cold/install-test metrics recalculated. README documentation is updated to reflect the new benchmark ratios.

Changes

Benchmark Results Refresh

| Layer / File(s) | Summary |
| --- | --- |
| Benchmark metrics and version refresh (benchmarks/results.json) | Timestamp, aube version (1.18.0 → 1.18.2), and all benchmark metrics (gvs-warm, gvs-cold, install-test) with updated values and statistical summaries (mean, stddev, min, max) for each package manager. |
| Documentation update (README.md) | Fast installs benchmark ratio numbers updated to match the refreshed metrics. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

🐰 Fresh metrics hop into the light,
New benchmarks shining, tight and right!
From aube 1.18.2 we've measured the way—
Faster installs reported today!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The pull request title clearly and specifically describes the main change: refreshing benchmark data for aube version 1.18.2, which aligns directly with the file modifications. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

greptile-apps Bot commented Jun 9, 2026

Contributor

Greptile Summary

Refreshes benchmark data from aube 1.18.0 → 1.18.2 and updates the README BENCH_RATIOS blurb to reflect the new ratios. The values fields in results.json are internally consistent with the per-cell stats.mean values, and the README ratios (6x/7x warm, 29x/5x repeat) correctly round from the new means.

  • aube warm-install regression (+41%): 272 ms → 383 ms; the warm-install advantage over Bun and pnpm is preserved, but the absolute number increased noticeably.
  • Cold-cache outliers: pnpm cold jumped +49% (15.9 s → 23.6 s) with a very wide ±6.55 s stddev, and npm cold jumped +227% (9.5 s → 31.1 s); the PR description flags these as possible proxy-jitter artefacts and asks reviewers to investigate before merging.

Confidence Score: 4/5

Safe to merge as-is; the only open question is whether the pnpm/npm cold-cache spikes reflect real regressions or a noisy benchmark run.

All changes are data-only (benchmark JSON + README copy). The README ratios are arithmetically correct against the new means. The warm-install and install-test numbers look reasonable and consistent. The cold-cache numbers for pnpm and npm are the outliers — they changed dramatically with high variance — which the PR description itself flags as worth investigating before landing.

benchmarks/results.json — specifically the gvs-cold pnpm and npm entries with high stddev and large absolute regressions.

Important Files Changed

| Filename | Overview |
| --- | --- |
| README.md | Updates the BENCH_RATIOS marketing blurb (pnpm 9x→6x, pnpm repeat 37x→29x); ratios are consistent with the new results.json means |
| benchmarks/results.json | Bumps aube to 1.18.2 and refreshes all benchmark values; internal consistency (values == round(mean*1000) ms) checks out, but the pnpm and npm cold-cache entries show unusually high variance / large regressions that warrant a re-run |
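
The internal-consistency check mentioned for benchmarks/results.json can be sketched as follows. The numbers are the new cold-cache entries for bun, deno, and pnpm quoted in the comment thread on that file. The flat `cells` layout and the `value_ms` key are assumptions, since the file's full schema is not shown in this conversation.

```python
# New cold-cache cells for three managers; value_ms is the figure reported
# in the PR description's table (hypothetical key name).
cells = {
    "bun": {"text": "5.747s ± 0.202s", "mean": 5.747125209120001,
            "stddev": 0.20228616147832887, "value_ms": 5747},
    "deno": {"text": "8.815s ± 0.324s", "mean": 8.81505870046,
             "stddev": 0.3240567157409274, "value_ms": 8815},
    "pnpm": {"text": "23.602s ± 6.550s", "mean": 23.60153016548,
             "stddev": 6.549877553224676, "value_ms": 23602},
}


def is_consistent(cell: dict) -> bool:
    """Check that the ms value and the text summary both follow from the stats."""
    ms_ok = cell["value_ms"] == round(cell["mean"] * 1000)
    text_ok = cell["text"] == f"{cell['mean']:.3f}s ± {cell['stddev']:.3f}s"
    return ms_ok and text_ok


for name, cell in cells.items():
    print(name, "consistent" if is_consistent(cell) else "MISMATCH")
```

A check like this only confirms that the derived fields agree with the raw statistics; it says nothing about whether the underlying run was noisy.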

Reviews (1): Last reviewed commit: "chore: refresh benchmarks" | Re-trigger Greptile

Comment thread benchmarks/results.json
Comment on lines 97 to +115
      "bun": {
    -   "text": "5.777s ± 0.442s",
    -   "mean": 5.776975449539999,
    -   "stddev": 0.44177328485860695,
    -   "min": 4.96066754994,
    -   "max": 6.36779450894
    +   "text": "5.747s ± 0.202s",
    +   "mean": 5.747125209120001,
    +   "stddev": 0.20228616147832887,
    +   "min": 5.458690011120001,
    +   "max": 6.09460248712
      },
      "deno": {
    -   "text": "8.153s ± 0.385s",
    -   "mean": 8.152680589700001,
    -   "stddev": 0.38522163142190247,
    -   "min": 7.2359744523,
    -   "max": 8.5316405233
    +   "text": "8.815s ± 0.324s",
    +   "mean": 8.81505870046,
    +   "stddev": 0.3240567157409274,
    +   "min": 8.30310792076,
    +   "max": 9.37489252876
      },
      "pnpm": {
    -   "text": "15.870s ± 0.392s",
    -   "mean": 15.87049447584,
    -   "stddev": 0.39198277953927113,
    -   "min": 15.39932277324,
    -   "max": 16.208181022239998
    +   "text": "23.602s ± 6.550s",
    +   "mean": 23.60153016548,
    +   "stddev": 6.549877553224676,
    +   "min": 18.20302976388,

P2 High variance / extreme regression in cold-cache results

The cold-cache numbers for npm (+227%, 9.5 s → 31.1 s) and pnpm (+49%, 15.9 s → 23.6 s) look unusually large, and pnpm's stddev of ±6.55 s on a 23.6 s mean gives a coefficient of variation of ~28% — far higher than any other cell in the file. This level of spread strongly suggests at least one outlier run (pnpm max was 33.6 s vs. min 18.2 s). Per the PR description's own guidance to "investigate before landing" if anything "looks wildly off," these two cells are the most obvious candidates for a re-run before the results get rolled into the next release-plz-pr.
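
The coefficient-of-variation argument can be reproduced from the new cold-cache statistics quoted in the diff. A minimal sketch: the 10% cutoff is a hypothetical threshold chosen for illustration, not one used by the benchmark workflow, and npm is omitted because its stddev is not shown in this thread.

```python
def coefficient_of_variation(mean: float, stddev: float) -> float:
    """Relative spread of a benchmark cell: stddev divided by mean."""
    return stddev / mean


# (mean, stddev) in seconds for the new cold-cache runs.
cold = {
    "bun": (5.747125209120001, 0.20228616147832887),
    "deno": (8.81505870046, 0.3240567157409274),
    "pnpm": (23.60153016548, 6.549877553224676),
}

CV_THRESHOLD = 0.10  # hypothetical cutoff for "needs a re-run"

flagged = []
for name, (mean, stddev) in cold.items():
    cv = coefficient_of_variation(mean, stddev)
    print(f"{name}: CV = {cv:.1%}")
    if cv > CV_THRESHOLD:
        flagged.append(name)

print("re-run candidates:", flagged)
```

With these inputs only pnpm exceeds the cutoff, which matches the comment's reading that its spread is far wider than the other cells shown.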

@jdx jdx merged commit 19275a1 into main Jun 9, 2026
18 checks passed
@jdx jdx deleted the bench-refresh branch June 9, 2026 07:16
@cursor cursor Bot mentioned this pull request Jun 9, 2026