ci(bencher): enforce PR thresholds and grant checks: write by zkochan · Pull Request #11883 · pnpm/pnpm

zkochan · 2026-05-23T16:23:01Z

Summary

- Add `--start-point-clone-thresholds` to the non-main upload arms in `benchmark.yml`, `pacquet-integrated-benchmark.yml`, and `pacquet-integrated-benchmark-comment.yml`, so PR / feature-branch records inherit the thresholds configured on main in the Bencher UI. Pair it with `--err` so the workflow fails when a sample breaches the upper boundary. Without `--err`, a regression is recorded but the GitHub check stays green.
- Add `checks: write` to all three workflows. On `push: main` (no `--ci-number`, not a `pull_request` event) Bencher falls back to creating a GitHub Check on the commit. Without the permission, the upload step exits 1 with `Failed to create GitHub Check`, which is what currently happens on main.

Main-branch uploads still skip the threshold/--err flags on purpose: by the time main fails, the regression has already landed.
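
The branch-dependent flag selection described above could be sketched roughly as follows. This is an illustration in Python, not the workflows' actual shell: the function name `bencher_args` and the project, branch, and testbed values are hypothetical. Only `--start-point-clone-thresholds` and `--err` come from this PR; `--project`, `--branch`, `--testbed`, and `--start-point` are assumed to be the usual `bencher run` options.

```python
def bencher_args(project: str, branch: str, testbed: str) -> list[str]:
    """Build a hypothetical `bencher run` argument list for one upload."""
    args = [
        "bencher", "run",
        "--project", project,
        "--branch", branch,
        "--testbed", testbed,
    ]
    if branch != "main":
        # Non-main arm: start from main, copy its thresholds,
        # and fail the job when a sample breaches a boundary.
        args += ["--start-point", "main", "--start-point-clone-thresholds", "--err"]
    # Main arm: record only. Failing here would come after the
    # regression has already landed.
    return args


pr_args = bencher_args("pnpm", "pr/11883", "pacquet")
main_args = bencher_args("pnpm", "main", "pacquet")
```

With this shape, `pr_args` carries both new flags while `main_args` carries neither, which mirrors the intent stated in the summary.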

This branch was forked from main, so once the workflows run, its own benchmark results can be checked against the thresholds and compared with the main baseline.

Test plan

- Configure Percentage thresholds in the Bencher UI for `main/pnpm/Latency` and `main/pacquet/Latency` (upper boundary 0.20, min samples 10, max samples 30).
- After merge (or via `workflow_dispatch` from this branch): confirm that the next `push: main` run completes without the `Failed to create GitHub Check` error and shows a Check on the commit.
- Dispatch Benchmarks and Pacquet integrated benchmark against this branch (or open a follow-up PR with an intentional perf regression) and confirm that Bencher reports the upper-boundary breach and the job exits non-zero.
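
To make the threshold settings in the test plan concrete, here is a rough sketch of a percentage check. It assumes the upper limit is the mean of recent historical samples scaled by `1 + 0.20`, that no limit applies below the minimum sample count, and that only the most recent `max_samples` values are considered. The function names are hypothetical, and Bencher's real implementation may differ in detail.

```python
from statistics import mean


def upper_boundary(history, percentage=0.20, min_samples=10, max_samples=30):
    """Return the upper limit for a new sample, or None with too little history."""
    window = history[-max_samples:]  # keep only the most recent samples
    if len(window) < min_samples:
        return None  # not enough data to judge a new sample
    return mean(window) * (1 + percentage)


def breaches(sample, history, **kwargs):
    """True when the sample exceeds the upper limit derived from history."""
    limit = upper_boundary(history, **kwargs)
    return limit is not None and sample > limit


history = [2.0] * 10          # ten baseline samples of 2.0 s
print(breaches(2.5, history))  # above 2.0 * 1.2
print(breaches(2.3, history))  # below 2.0 * 1.2
```

Under these assumptions, a 2.0 s baseline tolerates samples up to about 2.4 s, and nothing is flagged until ten samples exist.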

Written by an agent (Claude Code, claude-opus-4-7).

Summary by CodeRabbit

Chores
- Improved benchmark regression reporting: workflow now surfaces regressions as errors for better visibility during pull request reviews
- Enhanced threshold management: pull requests automatically inherit benchmark thresholds from the main branch for consistent comparison
- Upgraded GitHub integration: benchmark results now fully integrate with GitHub's check system for improved workflow visibility

- Add `--start-point-clone-thresholds` to the non-main upload arms so PR/feature branches inherit thresholds configured on main; pair it with `--err` so a sample over the upper boundary fails the job.
- Add `checks: write` to the three workflows that call `bencher run`. On main pushes (no `--ci-number`, not a PR event) Bencher falls back to creating a GitHub Check on the commit; without the permission it exits 1 with "Failed to create GitHub Check".

qodo-code-review · 2026-05-23T16:23:05Z

Qodo reviews are paused for this user.

coderabbitai · 2026-05-23T16:23:16Z

No actionable comments were generated in the recent review. 🎉

📥 Commits

Reviewing files that changed from the base of the PR and between 4088de0 and 42bb020.

📒 Files selected for processing (3)

.github/workflows/benchmark.yml
.github/workflows/pacquet-integrated-benchmark-comment.yml
.github/workflows/pacquet-integrated-benchmark.yml

📜 Recent review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: Run benchmark on ubuntu-latest
GitHub Check: Analyze (javascript)
GitHub Check: Compile & Lint

🔇 Additional comments (3)

.github/workflows/benchmark.yml (1)

31-31: LGTM!

Also applies to: 127-149

.github/workflows/pacquet-integrated-benchmark-comment.yml (1)

29-29: LGTM!

Also applies to: 158-172

.github/workflows/pacquet-integrated-benchmark.yml (1)

51-51: LGTM!

Also applies to: 338-353

📝 Walkthrough

Walkthrough

Three GitHub Actions benchmark workflows are enhanced with checks: write permission and extended Bencher CLI flags. The --start-point-clone-thresholds flag enables PR branches to inherit performance thresholds from main, while --err surfaces regressions as workflow errors. Two workflows are reformatted to use multi-line argument construction.

Changes

Benchmark workflow enhancements

Layer	File(s)	Summary
Permission expansion for GitHub checks writing	`.github/workflows/benchmark.yml`, `.github/workflows/pacquet-integrated-benchmark-comment.yml`, `.github/workflows/pacquet-integrated-benchmark.yml`	All three workflows are granted `checks: write` permission alongside existing permissions, enabling Bencher to write GitHub Checks output for regression reporting.
Bencher regression detection and threshold inheritance	`.github/workflows/benchmark.yml`, `.github/workflows/pacquet-integrated-benchmark-comment.yml`, `.github/workflows/pacquet-integrated-benchmark.yml`	All three `bencher run` invocations are extended with `--start-point-clone-thresholds` (to inherit configured thresholds from main for PR branches) and `--err` (to surface regressions as errors). Two workflows are reformatted into multi-line argument arrays for clarity.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

pnpm/pnpm#11875: Modifies the same Bencher-upload sections in benchmark workflows and bencher command argument construction for branch and start-point handling.

Poem

🐰 Three workflows, one mission so clear,
Benchmarks now check with fresh-found cheer!
Thresholds inherited, regressions exposed,
Main's wisdom shared when PR branches proposed!
checks: write grants the power to report,
A rabbit's delight—performance support! 🏃‍♂️✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly and specifically summarizes the main changes: adding threshold enforcement via --start-point-clone-thresholds and --err, and granting checks: write permission.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

github-actions · 2026-05-23T16:46:19Z

Integrated-Benchmark Report (Linux)

Scenario: Isolated linker: fresh restore, cold cache + cold store

Command	Mean [s]	Min [s]	Max [s]	Relative
`pacquet@HEAD`	2.350 ± 0.103	2.238	2.502	1.01 ± 0.05
`pacquet@main`	2.322 ± 0.065	2.230	2.413	1.00
`pnpm`	4.537 ± 0.075	4.437	4.655	1.95 ± 0.06

BENCHMARK_REPORT.json

{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 2.3502830936600008,
      "stddev": 0.10279984077340087,
      "median": 2.3222429619600002,
      "user": 2.81672868,
      "system": 3.46245154,
      "min": 2.2375798329600003,
      "max": 2.5023931049600003,
      "times": [
        2.35498251396,
        2.5023931049600003,
        2.2895034099600005,
        2.42350979896,
        2.4890570699600003,
        2.2720729099600003,
        2.4263596299600003,
        2.26724772896,
        2.24012493696,
        2.2375798329600003
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 2.32187790036,
      "stddev": 0.06474385030324165,
      "median": 2.33550308646,
      "user": 2.73499808,
      "system": 3.4453162400000004,
      "min": 2.22973320696,
      "max": 2.41264889396,
      "times": [
        2.41264889396,
        2.23445012696,
        2.33499297096,
        2.3410116329600004,
        2.3991390689600003,
        2.2745589619600004,
        2.33601320196,
        2.22973320696,
        2.3746351739600002,
        2.28159576496
      ]
    },
    {
      "command": "pnpm",
      "mean": 4.537254470360001,
      "stddev": 0.07459892171558272,
      "median": 4.54326139346,
      "user": 7.65196818,
      "system": 4.002413839999999,
      "min": 4.4372579089599995,
      "max": 4.65468278096,
      "times": [
        4.65468278096,
        4.59426766196,
        4.51839921496,
        4.5984458759599995,
        4.4372579089599995,
        4.50407990096,
        4.568123571959999,
        4.44252524696,
        4.58952553596,
        4.46523700496
      ]
    }
  ]
}
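
The Relative column in the table above can be cross-checked against the JSON. The sketch below assumes the report follows the hyperfine JSON export layout and that the uncertainty on each ratio comes from standard error propagation for a quotient. The function name `relative_table` is hypothetical, and the input is trimmed to the `mean` and `stddev` fields of this scenario.

```python
import math


def relative_table(report: dict) -> dict:
    """Map each command to (ratio to fastest mean, propagated uncertainty)."""
    results = report["results"]
    ref = min(results, key=lambda r: r["mean"])  # fastest command is the 1.00 row
    table = {}
    for r in results:
        ratio = r["mean"] / ref["mean"]
        if r is ref:
            table[r["command"]] = (ratio, 0.0)
            continue
        # Relative errors of numerator and denominator add in quadrature.
        rel_err = math.hypot(r["stddev"] / r["mean"], ref["stddev"] / ref["mean"])
        table[r["command"]] = (ratio, ratio * rel_err)
    return table


report = {"results": [
    {"command": "pacquet@HEAD", "mean": 2.3502830936600008, "stddev": 0.10279984077340087},
    {"command": "pacquet@main", "mean": 2.32187790036, "stddev": 0.06474385030324165},
    {"command": "pnpm", "mean": 4.537254470360001, "stddev": 0.07459892171558272},
]}

for command, (ratio, err) in relative_table(report).items():
    print(f"{command}: {ratio:.2f} ± {err:.2f}")
```

For this scenario the difference between `pacquet@HEAD` and `pacquet@main` is about 1%, well inside the propagated uncertainty, which is consistent with a CI-only change that does not touch the benchmarked code.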

Scenario: Isolated linker: fresh restore, hot cache + hot store

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`pacquet@HEAD`	670.8 ± 47.2	637.6	802.5	1.01 ± 0.08
`pacquet@main`	664.8 ± 20.4	639.0	697.4	1.00
`pnpm`	2435.3 ± 148.1	2270.4	2753.7	3.66 ± 0.25

BENCHMARK_REPORT.json

{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 0.6708084764400001,
      "stddev": 0.047240236745938545,
      "median": 0.66072281024,
      "user": 0.37954557999999994,
      "system": 1.43517426,
      "min": 0.63758168024,
      "max": 0.80246104524,
      "times": [
        0.80246104524,
        0.65856663124,
        0.66187004524,
        0.66215016824,
        0.64658823624,
        0.66977004224,
        0.65957557524,
        0.66233284324,
        0.6471884972399999,
        0.63758168024
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 0.6648497021399999,
      "stddev": 0.02043169834858673,
      "median": 0.65836854374,
      "user": 0.37647668,
      "system": 1.4398314600000002,
      "min": 0.6389551072399999,
      "max": 0.69737859524,
      "times": [
        0.68593009624,
        0.6389551072399999,
        0.69737859524,
        0.6468575022399999,
        0.65901990924,
        0.65771717824,
        0.69342408324,
        0.66596846724,
        0.65360194024,
        0.64964414224
      ]
    },
    {
      "command": "pnpm",
      "mean": 2.43530926094,
      "stddev": 0.1481373263485777,
      "median": 2.41473290724,
      "user": 2.9190334799999995,
      "system": 2.19278166,
      "min": 2.2703722862399998,
      "max": 2.75374672424,
      "times": [
        2.3187324022399998,
        2.2703722862399998,
        2.28225344724,
        2.75374672424,
        2.4605422622399997,
        2.35298852124,
        2.5019534962399996,
        2.36892355224,
        2.50363245024,
        2.5399474672399998
      ]
    }
  ]
}

Scenario: Isolated linker: fresh install, cold cache + cold store

Command	Mean [s]	Min [s]	Max [s]	Relative
`pacquet@HEAD`	4.981 ± 0.121	4.811	5.231	1.00
`pacquet@main`	5.052 ± 0.205	4.752	5.389	1.01 ± 0.05
`pnpm`	6.352 ± 0.152	6.042	6.617	1.28 ± 0.04

BENCHMARK_REPORT.json

{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 4.9810691868,
      "stddev": 0.121280682407395,
      "median": 4.9621373679000005,
      "user": 6.77160982,
      "system": 3.50315648,
      "min": 4.8108650464,
      "max": 5.2308265784,
      "times": [
        4.9211465104,
        4.9691413284,
        4.8108650464,
        4.9712631984,
        4.9250876744,
        4.909152966400001,
        4.9610340454,
        5.148933829400001,
        5.2308265784,
        4.9632406904
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 5.051945107400001,
      "stddev": 0.20454059186307916,
      "median": 5.0785759509,
      "user": 6.808686019999999,
      "system": 3.5192217799999996,
      "min": 4.7519255404,
      "max": 5.3894233194000005,
      "times": [
        5.3894233194000005,
        5.1070102154,
        5.0036677544,
        5.068471183400001,
        5.3129088354,
        5.0886807184,
        4.8634214264,
        5.1211762354,
        4.7519255404,
        4.8127658454
      ]
    },
    {
      "command": "pnpm",
      "mean": 6.352407902400001,
      "stddev": 0.15205407973003585,
      "median": 6.362504315900001,
      "user": 10.49427922,
      "system": 4.390154379999999,
      "min": 6.0421095224000005,
      "max": 6.6169474554,
      "times": [
        6.3489738944,
        6.206308331400001,
        6.0421095224000005,
        6.3760347374,
        6.346234394400001,
        6.4068275374,
        6.3973043044,
        6.3148436644000006,
        6.4684951824,
        6.6169474554
      ]
    }
  ]
}

Scenario: Isolated linker: fresh install, hot cache + hot store

Command	Mean [s]	Min [s]	Max [s]	Relative
`pacquet@HEAD`	4.040 ± 0.110	3.842	4.217	1.00
`pacquet@main`	4.121 ± 0.118	3.963	4.380	1.02 ± 0.04
`pnpm`	4.233 ± 0.107	4.154	4.513	1.05 ± 0.04

BENCHMARK_REPORT.json

{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 4.040380751980001,
      "stddev": 0.10956843588089388,
      "median": 4.03451855718,
      "user": 4.388032920000001,
      "system": 2.2139571399999998,
      "min": 3.84191133968,
      "max": 4.21692350968,
      "times": [
        4.028736975679999,
        4.21692350968,
        4.05177583068,
        3.9509969266800002,
        4.02812827968,
        4.15463901568,
        3.95580159568,
        3.84191133968,
        4.04030013868,
        4.134593907679999
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 4.12069823848,
      "stddev": 0.11819397151077456,
      "median": 4.1203162961799995,
      "user": 4.48911702,
      "system": 2.22456244,
      "min": 3.96288766968,
      "max": 4.37951931868,
      "times": [
        4.08028023068,
        4.37951931868,
        4.13095970068,
        3.96288766968,
        4.170711992679999,
        4.00918251568,
        4.12580481568,
        4.208768527679999,
        4.02403983668,
        4.114827776679999
      ]
    },
    {
      "command": "pnpm",
      "mean": 4.233272924979999,
      "stddev": 0.1065494973898752,
      "median": 4.20804441368,
      "user": 5.18660462,
      "system": 2.61768664,
      "min": 4.154485870679999,
      "max": 4.51251677268,
      "times": [
        4.27431497768,
        4.26310997568,
        4.20751348868,
        4.154485870679999,
        4.16655527868,
        4.51251677268,
        4.21559589768,
        4.208575338679999,
        4.15934210768,
        4.17071954168
      ]
    }
  ]
}

github-actions · 2026-05-23T16:46:21Z

Bencher Report

Branch	pr/11883
Testbed	pacquet

⚠️ WARNING: No Threshold found for Latency (nanoseconds (ns)).
Without a Threshold, no Alerts will ever be generated.
To only post results if a Threshold exists, set the --ci-only-thresholds flag.


Benchmark	Latency	milliseconds (ms)
isolated-linker.fresh-install.cold-cache.cold-store	📈 view plot ⚠️ NO THRESHOLD	4,981.07 ms
isolated-linker.fresh-install.hot-cache.hot-store	📈 view plot ⚠️ NO THRESHOLD	4,040.38 ms
isolated-linker.fresh-restore.cold-cache.cold-store	📈 view plot ⚠️ NO THRESHOLD	2,350.28 ms
isolated-linker.fresh-restore.hot-cache.hot-store	📈 view plot ⚠️ NO THRESHOLD	670.81 ms
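
The latencies in this table line up with the `pacquet@HEAD` means from the Integrated-Benchmark Report once seconds are converted to milliseconds. The snippet below is only a consistency check between the two reports. How the workflow actually maps benchmark output to Bencher metrics is not shown in this PR, and the helper name `latency_ms_label` is hypothetical.

```python
def latency_ms_label(mean_seconds: float) -> str:
    """Format a mean in seconds the way the Bencher table shows it."""
    return f"{mean_seconds * 1_000:,.2f} ms"


# pacquet@HEAD means taken from the BENCHMARK_REPORT.json blocks
head_means = {
    "isolated-linker.fresh-install.cold-cache.cold-store": 4.9810691868,
    "isolated-linker.fresh-install.hot-cache.hot-store": 4.040380751980001,
    "isolated-linker.fresh-restore.cold-cache.cold-store": 2.3502830936600008,
    "isolated-linker.fresh-restore.hot-cache.hot-store": 0.6708084764400001,
}

for name, seconds in head_means.items():
    print(name, latency_ms_label(seconds))
```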

coderabbitai Bot approved these changes May 23, 2026

zkochan merged commit bf581bb into main May 23, 2026
16 checks passed

zkochan deleted the bencher-pr-thresholds branch May 23, 2026 16:49

coderabbitai Bot mentioned this pull request Jun 2, 2026

ci(pnpr): benchmark the install accelerator (new Bencher pnpr testbed) #12154

Merged
