Conversation
Tracks both stacks in one Bencher project (`pnpm`), under separate testbeds (`pnpm`, `pacquet`). - `benchmarks/bench.sh` emits a hyperfine-shaped `bencher-results.json` combining the six pnpm scenarios. - `benchmark.yml` adds `push: branches: [main]` so each merge updates the `pnpm` baseline, then uploads via the Bencher CLI. - `pacquet-integrated-benchmark.yml` adds the same main-push baseline for `pacquet`, combines the four scenario JSONs into a Bencher report, and stages it into the existing artifact. - `pacquet-integrated-benchmark-comment.yml` uploads PR results from the trusted `workflow_run` context so fork PRs are covered too. Requires `BENCHER_API_TOKEN` in repo secrets; workflows no-op with a `::notice::` if it's missing.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (1)
📜 Recent review details⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
🧰 Additional context used📓 Path-based instructions (1)pacquet/**/*.rs📄 CodeRabbit inference engine (pacquet/AGENTS.md)
Files:
🧠 Learnings (1)📚 Learning: 2026-05-20T23:07:58.444ZApplied to files:
🔇 Additional comments (1)
📝 WalkthroughWalkthroughAdds generation of a Bencher-compatible ChangesBencher Integration for Continuous Benchmark Tracking
Sequence Diagram(s)sequenceDiagram
participant BenchScript as benchmarks/bench.sh
participant PacquetWF as .github/workflows/pacquet-integrated-benchmark.yml
participant ArtifactStore as benchmark-artifact
participant BencherCLI as bencherdev/bencher
participant Bencher as Bencher
BenchScript->>PacquetWF: produce per-scenario BENCHMARK_REPORT*.json
PacquetWF->>PacquetWF: transform (jq) -> bencher-results.json
PacquetWF->>ArtifactStore: stage SUMMARY.md + bencher-results.json
PacquetWF->>BencherCLI: install (if artifact present)
BencherCLI->>Bencher: bencher run --file bench-work-env/bencher-results.json
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Integrated-Benchmark Report (Linux)Scenario: Isolated linker: fresh restore, cold cache + cold store
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 2.45578310486,
"stddev": 0.09326222936858554,
"median": 2.4691350727600003,
"user": 2.72747364,
"system": 3.64591852,
"min": 2.3349557407600003,
"max": 2.60107321976,
"times": [
2.58224209276,
2.3777718727600003,
2.5043210297600003,
2.4853639977600004,
2.60107321976,
2.3904305587600003,
2.46731795576,
2.34340239076,
2.47095218976,
2.3349557407600003
]
},
{
"command": "pacquet@main",
"mean": 2.40877509376,
"stddev": 0.055796526661784515,
"median": 2.39292858376,
"user": 2.7326028399999993,
"system": 3.64037572,
"min": 2.32833523776,
"max": 2.50067259976,
"times": [
2.4387470277600003,
2.3781797747600004,
2.32833523776,
2.40174072876,
2.50067259976,
2.4442199487600003,
2.37892379076,
2.3518024467600003,
2.48101294376,
2.38411643876
]
},
{
"command": "pnpm",
"mean": 4.98704360176,
"stddev": 0.059346966314228956,
"median": 4.97784088926,
"user": 8.512679539999999,
"system": 4.29128902,
"min": 4.89777282276,
"max": 5.0618745587600005,
"times": [
5.05509064976,
4.97504024076,
4.90549900376,
4.97434922776,
5.03395664576,
5.0618745587600005,
4.9471211557600006,
4.98064153776,
5.03909017476,
4.89777282276
]
}
]
}Scenario: Isolated linker: fresh restore, hot cache + hot store
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 0.66057326798,
"stddev": 0.026764163366911724,
"median": 0.65100267388,
"user": 0.376959,
"system": 1.4552125200000001,
"min": 0.64185736538,
"max": 0.73143110438,
"times": [
0.73143110438,
0.66071642138,
0.64645952638,
0.64426809138,
0.6600363273800001,
0.64780142738,
0.67369751838,
0.64526097738,
0.65420392038,
0.64185736538
]
},
{
"command": "pacquet@main",
"mean": 0.6874608495800001,
"stddev": 0.07271920949142308,
"median": 0.66199984288,
"user": 0.3794958,
"system": 1.4577622200000002,
"min": 0.64412904038,
"max": 0.8845756193800001,
"times": [
0.70797076638,
0.67519814038,
0.64613644638,
0.69687002338,
0.65126717638,
0.64412904038,
0.65763179238,
0.8845756193800001,
0.66636789338,
0.64446159738
]
},
{
"command": "pnpm",
"mean": 2.67635808238,
"stddev": 0.14141326031949564,
"median": 2.62726667838,
"user": 3.3351839999999995,
"system": 2.24173212,
"min": 2.54452487938,
"max": 3.0054958903799998,
"times": [
2.73232054538,
2.54974553338,
2.54452487938,
3.0054958903799998,
2.59206930038,
2.65830230738,
2.59623104938,
2.59308456538,
2.79410421238,
2.69770254038
]
}
]
}Scenario: Isolated linker: fresh install, cold cache + cold store
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 5.00442922534,
"stddev": 0.13605322345806645,
"median": 4.980987213840001,
"user": 6.715868559999999,
"system": 3.6474317,
"min": 4.81486797634,
"max": 5.192435731340001,
"times": [
5.17766849834,
4.95867423334,
4.89625747134,
5.043606731340001,
4.91101108334,
5.192435731340001,
4.81486797634,
5.16560349534,
5.00330019434,
4.88086683834
]
},
{
"command": "pacquet@main",
"mean": 4.96670029294,
"stddev": 0.15456930609280436,
"median": 4.92934746484,
"user": 6.7043497599999995,
"system": 3.6502285,
"min": 4.77396755034,
"max": 5.23156078634,
"times": [
4.78459658734,
5.1156350143400005,
4.89885866034,
5.23156078634,
4.77396755034,
5.164178032340001,
4.89176830134,
4.93252076234,
4.94774306734,
4.92617416734
]
},
{
"command": "pnpm",
"mean": 6.893589930940001,
"stddev": 0.10379715535190981,
"median": 6.90533459134,
"user": 11.09385476,
"system": 4.555295499999999,
"min": 6.70726206434,
"max": 7.02219315634,
"times": [
6.70726206434,
6.90192451334,
6.98618751434,
7.02219315634,
6.80668045134,
6.79960062434,
6.90874466934,
7.01445588634,
6.94844889434,
6.840401535340001
]
}
]
}Scenario: Isolated linker: fresh install, hot cache + hot store
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 3.8645455069000008,
"stddev": 0.08898497919121567,
"median": 3.8569782523000002,
"user": 4.27743982,
"system": 2.1574152399999997,
"min": 3.6960852417999996,
"max": 3.9747887118,
"times": [
3.8632898108,
3.9659548748,
3.9147713887999998,
3.8403342228,
3.9502861528,
3.7914508678,
3.8506666938,
3.6960852417999996,
3.9747887118,
3.7978271038
]
},
{
"command": "pacquet@main",
"mean": 3.7919680630999997,
"stddev": 0.06678050341733613,
"median": 3.7881918138,
"user": 4.21596012,
"system": 2.13757864,
"min": 3.7022157768,
"max": 3.8866727817999998,
"times": [
3.7957109328,
3.8374044158,
3.7022157768,
3.7404925928,
3.8466309308,
3.8866727817999998,
3.7556964408,
3.7037095628,
3.7806726948,
3.8704745018
]
},
{
"command": "pnpm",
"mean": 4.3982595759,
"stddev": 0.05226865875660416,
"median": 4.4051863218000005,
"user": 5.58554892,
"system": 2.62874224,
"min": 4.2989558068,
"max": 4.472200594799999,
"times": [
4.3991476718,
4.2989558068,
4.3710678307999995,
4.4439152048,
4.4410585798,
4.4200532868,
4.472200594799999,
4.388949580799999,
4.3360222308,
4.4112249718
]
}
]
} |
The inline Bencher upload was gated to `event_name == 'push'`, which meant manual dispatch from a feature branch ran the bench but skipped the upload. Both push and workflow_dispatch execute in the base-repo privilege context, so it's safe to upload from both — the fork-safety gate only needs to keep `pull_request` runs out. On dispatch from a non-main branch we record into the ref name with `--start-point main --start-point-reset`, matching the pnpm bench's branch policy.
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more → On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/benchmark.yml:
- Around line 126-131: The branch-selection logic treats workflow_dispatch on
main like a non-main ref; update the conditional so manual dispatches whose
REF_NAME is "main" are handled like push: add an explicit check (e.g., elif [
"$REF_NAME" = "main" ]; then) that sets args+=(--branch main) before the
non-main else branch that currently appends --start-point and
--start-point-reset; reference the existing EVENT_NAME check and the args+=(...)
lines to locate where to insert this new branch.
In `@benchmarks/bench.sh`:
- Around line 128-139: The current if block that builds bencher inputs using jq
silently skips generating bencher-results.json when jq is missing; update the
script around the existing "if command -v jq >/dev/null; then" block so that the
else branch prints a clear error (e.g., "jq is required to produce
bencher-results.json") and exits non‑zero instead of doing nothing. Reference
the same symbols used in the diff (jq, SCENARIOS, bencher_inputs, BENCH_DIR,
bencher-results.json) and ensure the script fails early when jq is not available
so missing tooling is surfaced.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 13ac2fe6-5f83-4b07-a9e7-72c445c80cfe
📒 Files selected for processing (5)
.github/workflows/benchmark.yml.github/workflows/pacquet-integrated-benchmark-comment.yml.github/workflows/pacquet-integrated-benchmark.ymlbenchmarks/README.mdbenchmarks/bench.sh
Two small fixes from PR review: - `benchmark.yml`: manual dispatch from `main` was falling into the non-main branch arm, recording `--branch main --start-point main --start-point-reset` — equivalent to forking main from itself. Treat it like a `push` event so a manual run on main updates the baseline directly. Matches the pacquet workflow's branch policy. - `bench.sh`: emit a stderr warning when `jq` is missing instead of silently skipping `bencher-results.json`. Keeps behaviour optional for local users but makes the skip discoverable.
Replaces the hyperfine-leaking names (`clean-install`, `frozen-lockfile`, `peek`, `gvs-warm`, …) with a consistent grid that spells out every state the benchmark depends on: - "Fresh" — node_modules wiped at start (future variants will start with a populated node_modules). - "Install" vs "Restore" vs "Add new dep" — the work being measured. - "hot/cold cache + hot/cold store" — both pnpm directories, spelled out separately because they're distinct on disk. - "isolated linker" — nodeLinker mode (future variants will cover `hoisted` and `pnp`). The slugs map directly from the clap-derived kebab-case names, so `--scenario=fresh-restore-cold-cache-cold-store-isolated` is the new CLI surface. Updates land across the Rust orchestrator (`BenchmarkScenario`), `benchmarks/bench.sh`, the pacquet workflow, `benchmarks/README.md`, and `pacquet/CONTRIBUTING.md` so the names agree end-to-end. Adds a justified `#[allow(clippy::enum_variant_names)]` on the enum because every variant currently shares the `Fresh` prefix; the lint will stop firing once `Filled*`/`Resynced*` counterparts land. Bencher's stored history for the old benchmark names will become orphaned and can be archived in the UI.
Micro-Benchmark ResultsLinux |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #11875 +/- ##
==========================================
- Coverage 87.81% 87.80% -0.02%
==========================================
Files 205 205
Lines 24429 24424 -5
==========================================
- Hits 21453 21445 -8
- Misses 2976 2979 +3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Reshapes the scenario identifiers so the linker mode is the leading group: `<linker>.<action>.<cache state>.<store state>`. Dots separate the four axes the bench varies, and `isolated-linker.*` / `gvs-linker.*` sort together in any dashboard that groups by prefix. Future buckets (`hoisted-linker.*`, `pnp-linker.*`) will slot in without disturbing the existing names. GVS is its own top-level bucket rather than a sub-variant of isolated — its perf profile differs enough to chart separately. Renames: - `clean-install` → `isolated-linker.fresh-install.cold-cache.cold-store` - `full-resolution` → `isolated-linker.fresh-install.hot-cache.hot-store` - `frozen-lockfile` → `isolated-linker.fresh-restore.cold-cache.cold-store` - `frozen-lockfile-hot-cache` → `isolated-linker.fresh-restore.hot-cache.hot-store` - `peek` → `isolated-linker.fresh-add-dep.hot-cache.hot-store` - `gvs-warm` → `gvs-linker.fresh-restore.hot-cache.hot-store` Each Rust variant now carries `#[value(name = "…")]` so clap accepts the dotted CLI form (`--scenario=isolated-linker.fresh-install.cold-cache.cold-store`). Display labels follow the slug structure: `Isolated linker: fresh install, cold cache + cold store` and `GVS linker: fresh restore, hot cache + hot store`. The `#[allow(clippy::enum_variant_names)]` is renewed; 5 of 6 variants share the `Isolated` prefix today. Once `Hoisted*` / `Pnp*` buckets land the lint will stop firing on its own.
The longer match-arm pattern produced by the linker-first rename exceeded the rustfmt width budget. Auto-format breaks the `&["install", "--frozen-lockfile"]` body onto its own line so the arm stays within the limit.
Summary
Wires both benchmark workflows to Bencher so install-perf history is tracked continuously instead of only being visible in per-PR comments. Both stacks live in one Bencher project (slug:
pnpm) under separate testbeds (pnpm,pacquet) so the scenario charts are directly comparable.benchmarks/bench.shemits a hyperfine-shapedbencher-results.jsonthat combines the six pnpm scenarios —@HEADresult only,commandrenamed to the scenario name so Bencher uses the scenario as the benchmark identifier..github/workflows/benchmark.yml(pnpm TS bench): addspush: branches: [main]so each merge updates thepnpmbaseline, then uploads viabencher run --testbed pnpm. Branch policy:mainon push,pr/<n>on manual PR dispatch,ref_nameotherwise (all forked from main)..github/workflows/pacquet-integrated-benchmark.yml: adds the same main-push baseline for thepacquettestbed, builds the Bencher report from the four scenario JSONs, and stages it into the existing artifact. Inlinebencher runonly fires onpushtomain(where secrets are available)..github/workflows/pacquet-integrated-benchmark-comment.yml: after the existing summary comment, downloadsbencher-results.jsonfrom the artifact and runsbencher run --branch pr/<n> --start-point main --start-point-reset --ci-number <n>. Runs in the trustedworkflow_runprivilege context, so fork PRs are covered too.The existing hand-rolled PR comments are kept alongside Bencher's auto-comment during rollout; we can drop them once thresholds are tuned.
Required out-of-band setup
pnpmat bencher.dev. Testbeds (pnpm,pacquet), themainbranch, and scenario benchmarks will auto-create on the first push.BENCHER_API_TOKENto repo secrets (Actions). Without it, the new steps no-op with a::notice::— so this PR is safe to merge before the token lands, but Bencher won't actually record anything until both are in place.Out of scope
pacquet-micro-benchmark.ymlis unchanged. It uses criterion, not hyperfine, and would need Bencher'srust_criterionadapter. Easy follow-up if useful.Test plan
benchmark.ymlandpacquet-integrated-benchmark.yml; both should attempt their Bencher upload steps. With the token unset they should emit::notice::BENCHER_API_TOKEN not setand exit 0.benchmark.ymlmanually with a recent PR number and verify Bencher comments on the PR.pacquet/**) and confirm both the existing summary comment and Bencher's comment appear.github.event_name == 'push', so fork PRs skip it and rely entirely on the comment workflow.Written by an agent (Claude Code, claude-opus-4-7).
Summary by CodeRabbit
New Features
Documentation