Skip to content

ci: track test suite durations in Bencher#12404

Merged
zkochan merged 1 commit into
mainfrom
bench-tests
Jun 14, 2026
Merged

ci: track test suite durations in Bencher#12404
zkochan merged 1 commit into
mainfrom
bench-tests

Conversation

@zkochan

@zkochan zkochan commented Jun 14, 2026

Copy link
Copy Markdown
Member

Summary

Adds CI duration tracking for the pnpm-ci-performance Bencher project.

Tracked Rust testbeds and benchmarks:

  • pacquet.ubuntu, pacquet.windows, pacquet.macos -> tests.all
  • pnpr.ubuntu, pnpr.windows, pnpr.macos -> tests.all

Tracked pnpm testbeds and benchmarks for full test runs:

  • pnpm.ubuntu.node22, pnpm.ubuntu.node24, pnpm.ubuntu.node26 -> tests.all, tests.cli
  • pnpm.windows.node22, pnpm.windows.node24, pnpm.windows.node26 -> tests.all, tests.cli

The test workflows produce Bencher-compatible JSON artifacts without receiving BENCHER_API_TOKEN. A separate workflow_run workflow downloads those artifacts only for same-repository runs, validates their metadata, and uploads from trusted workflow code using the existing BENCHER_API_TOKEN secret. The pnpm CLI e2e duration is extracted from pnpm run --report-summary output during the same full-test execution, so the CLI e2e suite is not run a second time.

Validation

  • actionlint .github/workflows/ci.yml .github/workflows/test.yml .github/workflows/pacquet-ci.yml .github/workflows/ci-performance-bencher-upload.yml
  • git diff --check
  • node --check .github/scripts/bench-output-path.mjs && node --check .github/scripts/measure-command.mjs && node --check .github/scripts/bencher-result-from-pnpm-summary.mjs && node --check .github/scripts/merge-bencher-results.mjs
  • synthetic pnpm artifact smoke test for tests.all + tests.cli
  • synthetic Rust metadata smoke test for valid, missing, partial, and duplicate .reports
  • output path boundary smoke test for lexical escapes and symlink escapes under .bench/**
  • strict missing-input smoke test for .github/scripts/merge-bencher-results.mjs
  • just --dry-run test-pacquet
  • just --dry-run test-pnpr
  • pre-push hook passed during git push

Written by an agent (Codex, GPT-5).

Summary by CodeRabbit

Chores

  • Added CI benchmark measurement tooling that produces standardized benchmark JSON artifacts and merged summary/metadata outputs for later reporting.
  • Introduced automatic upload of performance results to Bencher after successful CI runs, including stricter validation and support for multiple report formats/testbeds.
  • Updated CI test execution to use consistent measurement outputs and improved test scope selection (full vs affected), while also adding dedicated pacquet vs pnpr nextest targets.
  • Enhanced CI reliability for artifact creation/packaging and added safer benchmark output path handling to prevent invalid output locations.

@coderabbitai

coderabbitai Bot commented Jun 14, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: a4d7cace-d9b7-4e2e-8a13-8483806174ed

📥 Commits

Reviewing files that changed from the base of the PR and between b9653f6 and e138d22.

📒 Files selected for processing (10)
  • .github/scripts/bench-output-path.mjs
  • .github/scripts/bencher-result-from-pnpm-summary.mjs
  • .github/scripts/measure-command.mjs
  • .github/scripts/merge-bencher-results.mjs
  • .github/workflows/ci-performance-bencher-upload.yml
  • .github/workflows/ci.yml
  • .github/workflows/pacquet-ci.yml
  • .github/workflows/test.yml
  • justfile
  • package.json
🚧 Files skipped from review as they are similar to previous changes (9)
  • justfile
  • .github/workflows/pacquet-ci.yml
  • .github/workflows/ci.yml
  • package.json
  • .github/scripts/merge-bencher-results.mjs
  • .github/workflows/test.yml
  • .github/scripts/measure-command.mjs
  • .github/workflows/ci-performance-bencher-upload.yml
  • .github/scripts/bencher-result-from-pnpm-summary.mjs

📝 Walkthrough

Walkthrough

Three new Node.js CLI scripts measure command timing, extract metrics from pnpm summaries, and merge Bencher result files. The justfile gains separate test-pacquet and test-pnpr targets; package.json adds a --report-summary variant. CI workflows wrap test runs with the timing script to produce benchmark artifacts. A new orchestration workflow downloads those artifacts and uploads them to Bencher on CI completion.

Changes

Bencher Benchmark Integration

Layer / File(s) Summary
Benchmark output path validation
.github/scripts/bench-output-path.mjs
Shared utility exports resolveBenchOutputPath(output) that resolves paths to absolute form and validates they remain under .bench/, rejecting upward traversal and absolute paths outside the directory, with symlink safety checks across all path ancestors.
Benchmark measurement and conversion scripts
.github/scripts/measure-command.mjs, .github/scripts/bencher-result-from-pnpm-summary.mjs, .github/scripts/merge-bencher-results.mjs
measure-command.mjs spawns a command, records wall-clock duration via performance.now(), and writes a Bencher-schema JSON with exit code and duration stats; validates Windows shell metacharacters. bencher-result-from-pnpm-summary.mjs reads a pnpm exec-summary JSON, validates the package entry status/duration, converts milliseconds to seconds, and writes the Bencher output. merge-bencher-results.mjs concatenates multiple Bencher result files into a single output. All three include argument parsing with usage error reporting and validated output paths.
Per-subset test commands and reporting script
justfile, package.json
justfile adds test-pacquet (workspace nextest excluding pnpr packages) and test-pnpr (nextest limited to pnpr packages). package.json adds test-pkgs-all-report-summary running .test with --report-summary to enable pnpm execution summary JSON generation; ci:test-all is switched to use the reporting variant.
Test workflow instrumentation with measurement
.github/workflows/test.yml, .github/workflows/pacquet-ci.yml, .github/workflows/ci.yml
test.yml emits full_tests and benchmark outputs from scope determination, wraps test execution with measure-command.mjs to produce .bench/tests.all.json, and conditionally adds steps to extract CLI e2e duration and merge results into pnpm artifacts with computed testbed metadata. pacquet-ci.yml splits the single just test step into two separate measured runs (test-pacquet and test-pnpr), each producing per-crate .bench/*.json files, then stages them into a rust artifact directory with metadata.json and uploads with platform-keyed artifact names. ci.yml "Package compiled artifacts" step now uses explicit bash shell with mapfile arrays to safely collect matching lib directories and tsconfig.tsbuildinfo files.
Bencher upload orchestration workflow
.github/workflows/ci-performance-bencher-upload.yml
New workflow triggers on workflow_run completion from CI and Pacquet CI jobs. Resolves Bencher branch context (PR: pr/<number> with start_point=true; non-PR: head_branch with start_point=false on main). Counts and downloads ci-performance-* artifacts; skips upload steps when none found. Validates metadata structure (testbed patterns, allowed Rust result filenames) and routes pnpm and rust report formats to appropriate Bencher run invocations with optional start-point flags. Exits early if BENCHER_API_TOKEN is not set.

Sequence Diagram(s)

sequenceDiagram
  participant TestCI as Test/Pacquet CI
  participant measure as measure-command.mjs
  participant pnpmExtract as bencher-result-from-pnpm-summary.mjs
  participant merge as merge-bencher-results.mjs
  participant artifacts as GitHub Artifacts
  participant BencherUpload as ci-performance-bencher-upload
  participant BencherAPI as Bencher API

  TestCI->>measure: wrap test execution
  measure-->>TestCI: .bench/tests.all.json
  
  alt full_tests enabled
    TestCI->>pnpmExtract: extract CLI e2e duration
    pnpmExtract-->>TestCI: .bench/tests.cli.json
    TestCI->>merge: concatenate results
    merge-->>TestCI: merged results + metadata
  end
  
  TestCI->>artifacts: upload ci-performance-pnpm artifact
  
  artifacts-->>BencherUpload: workflow_run trigger on CI success
  BencherUpload->>artifacts: download ci-performance-* artifacts
  BencherUpload->>BencherUpload: validate metadata + testbed + filenames
  BencherUpload->>BencherAPI: bencher run --upload (if BENCHER_API_TOKEN set)
  BencherAPI-->>BencherUpload: results published to Bencher
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Possibly related PRs

  • pnpm/pnpm#11659: Both PRs modify the GitHub Actions .github/workflows/test.yml "Run tests" step to change how the test command is executed, making them directly related in the same workflow step logic.

Poem

🐇 Hippity-hop, the timers now run,
Each test measured under the CI sun.
Bencher receives our millisecond tale,
merge-bencher-results stitches without fail.
With workflow triggers and API tokens set,
The benchmark bunny's the best yet! 🌙✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely summarizes the main change: adding CI duration tracking for test suites in Bencher, which is the primary objective of this multi-file changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch bench-tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@zkochan zkochan force-pushed the bench-tests branch 6 times, most recently from 8049b11 to 635910c Compare June 14, 2026 16:01
@zkochan zkochan marked this pull request as ready for review June 14, 2026 16:02
@qodo-free-for-open-source-projects

qodo-free-for-open-source-projects Bot commented Jun 14, 2026

Copy link
Copy Markdown

Code Review by Qodo

🐞 Bugs (5) 📘 Rule violations (0)

Grey Divider


Action required

1. Untrusted PR results uploaded 🐞 Bug ⛨ Security
Description
.github/workflows/ci-performance-bencher-upload.yml uploads Bencher results from any successful
run of workflows that themselves run on pull_request, without checking that the head repository is
this repo. This allows fork PRs to submit arbitrary benchmark JSON that gets uploaded with
BENCHER_API_TOKEN, polluting the pnpm-ci-performance project’s performance history/thresholds.
Code

.github/workflows/ci-performance-bencher-upload.yml[R18-22]

+  upload:
+    name: Upload CI performance results
+    runs-on: ubuntu-latest
+    if: github.event.workflow_run.conclusion == 'success'
+    steps:
Evidence
The CI workflow runs on pull_request, and the new uploader is triggered by those runs via
workflow_run and performs authenticated Bencher uploads with BENCHER_API_TOKEN but has no
same-repo/fork restriction.

.github/workflows/ci.yml[1-4]
.github/workflows/ci-performance-bencher-upload.yml[9-22]
.github/workflows/ci-performance-bencher-upload.yml[110-116]
.github/workflows/ci-performance-bencher-upload.yml[112-123]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The `CI Performance Bencher Upload` workflow runs on `workflow_run` for workflows triggered by `pull_request`, and it uploads artifacts using `BENCHER_API_TOKEN`. Because the job only checks `conclusion == 'success'`, a fork PR can craft Bencher JSON artifacts and have them uploaded into the trusted Bencher project, compromising benchmark data integrity.
### Issue Context
This is a trust-boundary issue: artifacts are produced by code from the PR run, but uploaded by trusted default-branch workflow code with secrets.
### Fix Focus Areas
- .github/workflows/ci-performance-bencher-upload.yml[18-22]
### Suggested fix
Add a trust gate to the job-level `if:` to restrict uploads to runs whose head repo is the same repository (or to another explicit allowlist).
Example:

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

2. jq errors drop uploads 🐞 Bug ☼ Reliability
Description
In the rust-artifact path, the uploader iterates reports via process substitution (`done < <(jq -c
'.reports[]' ...)) without checking jq’s exit status, so malformed/missing .reports` can result
in zero uploads while the job still succeeds. This can silently drop CI performance data and make
Bencher reporting incomplete.
Code

.github/workflows/ci-performance-bencher-upload.yml[R167-179]

+              rust)
+                while IFS= read -r report; do
+                  testbed=$(jq -r '.testbed // empty' <<< "$report")
+                  file=$(jq -r '.file // empty' <<< "$report")
+                  case "$file" in
+                    pacquet-tests-all.json|pnpr-tests-all.json) ;;
+                    *)
+                      echo "::error::unexpected result file '$file'"
+                      exit 1
+                      ;;
+                  esac
+                  upload "$testbed" "$dir/$file"
+                done < <(jq -c '.reports[]' "$metadata")
Evidence
The workflow’s rust case reads .reports[] via jq in process substitution and has no explicit
validation that .reports exists/has content or that the jq invocation succeeded before
continuing.

.github/workflows/ci-performance-bencher-upload.yml[167-179]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The rust upload loop uses process substitution to feed `jq` output into `while read`. If `jq` fails (bad JSON, missing `.reports`, wrong type), the script may not fail at that point and will simply perform no uploads for that artifact.
### Issue Context
The workflow already uses `set -euo pipefail`, but process substitutions are a common place where failures are not reliably caught unless you structure the pipeline to propagate the exit code.
### Fix Focus Areas
- .github/workflows/ci-performance-bencher-upload.yml[167-179]
### Suggested fix
Restructure to a pipeline so `pipefail` can catch `jq` failures, and use `jq -e` to error on missing/wrong-type fields.
Example:

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. Token usable in PRs 🐞 Bug ⛨ Security
Description
Bencher uploads run on pull_request events for same-repo PRs and read BENCHER_API_TOKEN, which
means PR workflow code has access to the token and can exfiltrate/misuse it by modifying workflow
steps. This increases the blast radius of a compromised contributor/branch (token misuse can tamper
with Bencher project data).
Code

.github/workflows/test.yml[R179-196]

+    - name: Install Bencher CLI
+      if: steps.test-scope.outputs.full_tests == 'true' && (github.event_name != 'pull_request' || github.event.pull_request.head.repo.full_name == github.repository)
+      uses: bencherdev/bencher@50fb1e138651a46d2fb704fab1adab38c181552e # v0.6.6
+    - name: Upload test durations to Bencher
+      if: steps.test-scope.outputs.full_tests == 'true' && (github.event_name != 'pull_request' || github.event.pull_request.head.repo.full_name == github.repository)
+      env:
+        BENCHER_API_TOKEN: ${{ secrets.BENCHER_API_TOKEN }}
+        EVENT_NAME: ${{ github.event_name }}
+        NODE_VERSION: ${{ inputs.node }}
+        PLATFORM: ${{ inputs.platform }}
+        PR_NUMBER: ${{ github.event.pull_request.number }}
+        REF_NAME: ${{ github.ref_name }}
+      shell: bash
+      run: |
+        if [ -z "${BENCHER_API_TOKEN:-}" ]; then
+          echo "::notice::BENCHER_API_TOKEN not set, skipping upload"
+          exit 0
+        fi
Evidence
The reusable test workflow explicitly runs Bencher upload on pull_request when the PR is from the
same repo and injects BENCHER_API_TOKEN into the environment; the top-level CI workflow forwards
that secret into the reusable workflow.

.github/workflows/test.yml[171-232]
.github/workflows/ci.yml[42-76]
.github/workflows/pacquet-ci.yml[146-203]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Bencher upload steps run in PR context (same-repo PRs) and receive `BENCHER_API_TOKEN`. Any PR that can change workflow steps can print/exfiltrate or abuse that token.
### Issue Context
You want PR duration tracking, but you also want to minimize CI secret exposure.
### Fix Focus Areas
- .github/workflows/test.yml[179-232]
- .github/workflows/ci.yml[42-76]
- .github/workflows/pacquet-ci.yml[146-203]
### Suggested fix
Pick one (or combine):
1) **Move Bencher upload into a separate workflow** triggered by `workflow_run` (or similar) that downloads artifacts from the test job and uploads using secrets, so the upload workflow code comes from the default branch rather than the PR branch.
2) **Restrict uploads to non-PR events** (push to main / release branches) if PR tracking isn’t strictly required.
3) **Use a minimally-scoped Bencher token** (project-scoped, least privilege) and place it behind an environment requiring approvals for PR runs if available in your process.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


4. Missing inputs silently skipped 🐞 Bug ☼ Reliability
Description
.github/scripts/merge-bencher-results.mjs silently ignores non-existent input files, so the
Bencher upload can succeed with only a subset of the requested benchmarks (e.g., tests.all without
tests.cli). This can hide regressions in the reporting pipeline and produce incomplete/misleading
Bencher data.
Code

.github/scripts/merge-bencher-results.mjs[R9-15]

+for (const file of files) {
+  if (!existsSync(file)) continue
+  const report = JSON.parse(await readFile(file, 'utf8'))
+  if (Array.isArray(report.results)) {
+    results.push(...report.results)
+  }
+}
Evidence
The merge script explicitly continues when an input file does not exist, and only fails when *no*
results are found at all; meanwhile the workflow always asks it to merge two specific result
artifacts before uploading.

.github/scripts/merge-bencher-results.mjs[6-20]
.github/workflows/test.yml[181-206]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`merge-bencher-results.mjs` currently treats missing input files as “skip and continue”. In the workflows, we pass multiple expected benchmark artifacts (e.g. `.bench/tests.all.json` and `.bench/tests.cli.json`). If one artifact is not produced (bug, rename, conditional change), the merge will still succeed and the upload step will push incomplete benchmark data.
### Issue Context
The workflow is explicitly expecting to merge two benchmark result files before uploading to Bencher, but the merge script will not surface if one is absent.
### Fix Focus Areas
- .github/scripts/merge-bencher-results.mjs[6-20]
- .github/workflows/test.yml[198-206]
### Suggested fix
1. Track missing files while iterating.
2. If any are missing, print an error that lists them and `process.exit(1)` (strict by default), so benchmark completeness failures are visible.
3. (Optional) If you want to preserve the current “best-effort” behavior for future callers, add a flag like `--allow-missing` (default false) to keep skipping behavior when explicitly requested, and update callers accordingly.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Informational

5. Windows shell execution pitfalls 🐞 Bug ☼ Reliability
Description
measure-command.mjs uses spawn(..., { shell: process.platform === 'win32' }), so on Windows it
runs commands through cmd.exe, which changes quoting/escaping semantics and can break or
mis-execute commands if arguments contain spaces/metacharacters. This is exercised in CI because the
workflow matrix includes windows-latest while invoking measure-command.mjs.
Code

.github/scripts/measure-command.mjs[R68-73]

+function runCommand ([command, ...args]) {
+  return new Promise((resolve) => {
+    const child = spawn(command, args, { shell: process.platform === 'win32', stdio: 'inherit' })
+    child.on('error', (err) => {
+      console.error(err)
+      resolve(1)
Evidence
The helper script enables shell execution specifically on Windows, and the CI workflows run this
script on Windows runners (both the main CI matrix and the Rust workflow include windows jobs).

.github/scripts/measure-command.mjs[68-82]
.github/workflows/ci.yml[54-71]
.github/workflows/pacquet-ci.yml[58-67]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
On Windows, `shell: true` routes execution through `cmd.exe`, which can alter argument parsing/quoting and can become brittle as commands evolve.
### Issue Context
Current call sites use simple args, but the script is a general utility and will be used on `windows-latest` runners.
### Fix Focus Areas
- .github/scripts/measure-command.mjs[68-82]
- .github/workflows/ci.yml[54-71]
- .github/workflows/pacquet-ci.yml[58-67]
### Suggested fix
Prefer one of:
- Keep `shell: false` everywhere, and on Windows resolve `command` to an actual executable (`.exe`/`.cmd`) before spawning (or explicitly invoke `cmd.exe /d /s /c` yourself with robust quoting).
- If you keep `shell: true`, add a clear contract/comment and basic escaping/validation (e.g., reject args containing `&|<>` when `shell` is enabled) to prevent accidental injection/breakage if future callers pass untrusted or complex args.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Previous review results

Review updated until commit e138d22

Results up to commit 651b5d2


🐞 Bugs (5) 📘 Rule violations (0)


Action required
1. Untrusted PR results uploaded 🐞 Bug ⛨ Security ⭐ New
Description
.github/workflows/ci-performance-bencher-upload.yml uploads Bencher results from any successful
run of workflows that themselves run on pull_request, without checking that the head repository is
this repo. This allows fork PRs to submit arbitrary benchmark JSON that gets uploaded with
BENCHER_API_TOKEN, polluting the pnpm-ci-performance project’s performance history/thresholds.
Code

.github/workflows/ci-performance-bencher-upload.yml[R18-22]

+  upload:
+    name: Upload CI performance results
+    runs-on: ubuntu-latest
+    if: github.event.workflow_run.conclusion == 'success'
+    steps:
Evidence
The CI workflow runs on pull_request, and the new uploader is triggered by those runs via
workflow_run and performs authenticated Bencher uploads with BENCHER_API_TOKEN but has no
same-repo/fork restriction.

.github/workflows/ci.yml[1-4]
.github/workflows/ci-performance-bencher-upload.yml[9-22]
.github/workflows/ci-performance-bencher-upload.yml[110-116]
.github/workflows/ci-performance-bencher-upload.yml[112-123]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The `CI Performance Bencher Upload` workflow runs on `workflow_run` for workflows triggered by `pull_request`, and it uploads artifacts using `BENCHER_API_TOKEN`. Because the job only checks `conclusion == 'success'`, a fork PR can craft Bencher JSON artifacts and have them uploaded into the trusted Bencher project, compromising benchmark data integrity.

### Issue Context
This is a trust-boundary issue: artifacts are produced by code from the PR run, but uploaded by trusted default-branch workflow code with secrets.

### Fix Focus Areas
- .github/workflows/ci-performance-bencher-upload.yml[18-22]

### Suggested fix
Add a trust gate to the job-level `if:` to restrict uploads to runs whose head repo is the same repository (or to another explicit allowlist).

Example:
```yaml
if: >-
 github.event.workflow_run.conclusion == 'success' &&
 github.event.workflow_run.head_repository.full_name == github.repository
```

If you still want uploads for same-repo PRs but not forks, this is the simplest safe default. If you want a different policy (e.g., allow org members), add an explicit check (via GH API) before uploading.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended
2. jq errors drop uploads 🐞 Bug ☼ Reliability ⭐ New
Description
In the rust-artifact path, the uploader iterates reports via process substitution (`done < <(jq -c
'.reports[]' ...)) without checking jq’s exit status, so malformed/missing .reports` can result
in zero uploads while the job still succeeds. This can silently drop CI performance data and make
Bencher reporting incomplete.
Code

.github/workflows/ci-performance-bencher-upload.yml[R167-179]

+              rust)
+                while IFS= read -r report; do
+                  testbed=$(jq -r '.testbed // empty' <<< "$report")
+                  file=$(jq -r '.file // empty' <<< "$report")
+                  case "$file" in
+                    pacquet-tests-all.json|pnpr-tests-all.json) ;;
+                    *)
+                      echo "::error::unexpected result file '$file'"
+                      exit 1
+                      ;;
+                  esac
+                  upload "$testbed" "$dir/$file"
+                done < <(jq -c '.reports[]' "$metadata")
Evidence
The workflow’s rust case reads .reports[] via jq in process substitution and has no explicit
validation that .reports exists/has content or that the jq invocation succeeded before
continuing.

.github/workflows/ci-performance-bencher-upload.yml[167-179]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The rust upload loop uses process substitution to feed `jq` output into `while read`. If `jq` fails (bad JSON, missing `.reports`, wrong type), the script may not fail at that point and will simply perform no uploads for that artifact.

### Issue Context
The workflow already uses `set -euo pipefail`, but process substitutions are a common place where failures are not reliably caught unless you structure the pipeline to propagate the exit code.

### Fix Focus Areas
- .github/workflows/ci-performance-bencher-upload.yml[167-179]

### Suggested fix
Restructure to a pipeline so `pipefail` can catch `jq` failures, and use `jq -e` to error on missing/wrong-type fields.

Example:
```bash
jq -e -c '.reports[]' "$metadata" | while IFS= read -r report; do
 testbed=$(jq -r '.testbed // empty' <<< "$report")
 file=$(jq -r '.file // empty' <<< "$report")
 ...
done
```

Optionally pre-validate non-empty reports:
```bash
jq -e '.reports | type == "array" and length > 0' "$metadata" >/dev/null
```

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. Token usable in PRs 🐞 Bug ⛨ Security
Description
Bencher uploads run on pull_request events for same-repo PRs and read BENCHER_API_TOKEN, which
means PR workflow code has access to the token and can exfiltrate/misuse it by modifying workflow
steps. This increases the blast radius of a compromised contributor/branch (token misuse can tamper
with Bencher project data).
Code

.github/workflows/test.yml[R179-196]

+    - name: Install Bencher CLI
+      if: steps.test-scope.outputs.full_tests == 'true' && (github.event_name != 'pull_request' || github.event.pull_request.head.repo.full_name == github.repository)
+      uses: bencherdev/bencher@50fb1e138651a46d2fb704fab1adab38c181552e # v0.6.6
+    - name: Upload test durations to Bencher
+      if: steps.test-scope.outputs.full_tests == 'true' && (github.event_name != 'pull_request' || github.event.pull_request.head.repo.full_name == github.repository)
+      env:
+        BENCHER_API_TOKEN: ${{ secrets.BENCHER_API_TOKEN }}
+        EVENT_NAME: ${{ github.event_name }}
+        NODE_VERSION: ${{ inputs.node }}
+        PLATFORM: ${{ inputs.platform }}
+        PR_NUMBER: ${{ github.event.pull_request.number }}
+        REF_NAME: ${{ github.ref_name }}
+      shell: bash
+      run: |
+        if [ -z "${BENCHER_API_TOKEN:-}" ]; then
+          echo "::notice::BENCHER_API_TOKEN not set, skipping upload"
+          exit 0
+        fi
Evidence
The reusable test workflow explicitly runs Bencher upload on pull_request when the PR is from the
same repo and injects BENCHER_API_TOKEN into the environment; the top-level CI workflow forwards
that secret into the reusable workflow.

.github/workflows/test.yml[171-232]
.github/workflows/ci.yml[42-76]
.github/workflows/pacquet-ci.yml[146-203]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Bencher upload steps run in PR context (same-repo PRs) and receive `BENCHER_API_TOKEN`. Any PR that can change workflow steps can print/exfiltrate or abuse that token.
### Issue Context
You want PR duration tracking, but you also want to minimize CI secret exposure.
### Fix Focus Areas
- .github/workflows/test.yml[179-232]
- .github/workflows/ci.yml[42-76]
- .github/workflows/pacquet-ci.yml[146-203]
### Suggested fix
Pick one (or combine):
1) **Move Bencher upload into a separate workflow** triggered by `workflow_run` (or similar) that downloads artifacts from the test job and uploads using secrets, so the upload workflow code comes from the default branch rather than the PR branch.
2) **Restrict uploads to non-PR events** (push to main / release branches) if PR tracking isn’t strictly required.
3) **Use a minimally-scoped Bencher token** (project-scoped, least privilege) and place it behind an environment requiring approvals for PR runs if available in your process.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


4. Missing inputs silently skipped 🐞 Bug ☼ Reliability
Description
.github/scripts/merge-bencher-results.mjs silently ignores non-existent input files, so the
Bencher upload can succeed with only a subset of the requested benchmarks (e.g., tests.all without
tests.cli). This can hide regressions in the reporting pipeline and produce incomplete/misleading
Bencher data.
Code

.github/scripts/merge-bencher-results.mjs[R9-15]

+for (const file of files) {
+  if (!existsSync(file)) continue
+  const report = JSON.parse(await readFile(file, 'utf8'))
+  if (Array.isArray(report.results)) {
+    results.push(...report.results)
+  }
+}
Evidence
The merge script explicitly continues when an input file does not exist, and only fails when *no*
results are found at all; meanwhile the workflow always asks it to merge two specific result
artifacts before uploading.

.github/scripts/merge-bencher-results.mjs[6-20]
.github/workflows/test.yml[181-206]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`merge-bencher-results.mjs` currently treats missing input files as “skip and continue”. In the workflows, we pass multiple expected benchmark artifacts (e.g. `.bench/tests.all.json` and `.bench/tests.cli.json`). If one artifact is not produced (bug, rename, conditional change), the merge will still succeed and the upload step will push incomplete benchmark data.
### Issue Context
The workflow is explicitly expecting to merge two benchmark result files before uploading to Bencher, but the merge script will not surface if one is absent.
### Fix Focus Areas
- .github/scripts/merge-bencher-results.mjs[6-20]
- .github/workflows/test.yml[198-206]
### Suggested fix
1. Track missing files while iterating.
2. If any are missing, print an error that lists them and `process.exit(1)` (strict by default), so benchmark completeness failures are visible.
3. (Optional) If you want to preserve the current “best-effort” behavior for future callers, add a flag like `--allow-missing` (default false) to keep skipping behavior when explicitly requested, and update callers accordingly.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Informational
5. Windows shell execution pitfalls 🐞 Bug ☼ Reliability
Description
measure-command.mjs uses spawn(..., { shell: process.platform === 'win32' }), so on Windows it
runs commands through cmd.exe, which changes quoting/escaping semantics and can break or
mis-execute commands if arguments contain spaces/metacharacters. This is exercised in CI because the
workflow matrix includes windows-latest while invoking measure-command.mjs.
Code

.github/scripts/measure-command.mjs[R68-73]

+function runCommand ([command, ...args]) {
+  return new Promise((resolve) => {
+    const child = spawn(command, args, { shell: process.platform === 'win32', stdio: 'inherit' })
+    child.on('error', (err) => {
+      console.error(err)
+      resolve(1)
Evidence
The helper script enables shell execution specifically on Windows, and the CI workflows run this
script on Windows runners (both the main CI matrix and the Rust workflow include windows jobs).

.github/scripts/measure-command.mjs[68-82]
.github/workflows/ci.yml[54-71]
.github/workflows/pacquet-ci.yml[58-67]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
On Windows, `shell: true` routes execution through `cmd.exe`, which can alter argument parsing/quoting and can become brittle as commands evolve.
### Issue Context
Current call sites use simple args, but the script is a general utility and will be used on `windows-latest` runners.
### Fix Focus Areas
- .github/scripts/measure-command.mjs[68-82]
- .github/workflows/ci.yml[54-71]
- .github/workflows/pacquet-ci.yml[58-67]
### Suggested fix
Prefer one of:
- Keep `shell: false` everywhere, and on Windows resolve `command` to an actual executable (`.exe`/`.cmd`) before spawning (or explicitly invoke `cmd.exe /d /s /c` yourself with robust quoting).
- If you keep `shell: true`, add a clear contract/comment and basic escaping/validation (e.g., reject args containing `&|<>` when `shell` is enabled) to prevent accidental injection/breakage if future callers pass untrusted or complex args.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Results up to commit 635910c


🐞 Bugs (3) 📘 Rule violations (0)


Remediation recommended
1. Token usable in PRs 🐞 Bug ⛨ Security ⭐ New
Description
Bencher uploads run on pull_request events for same-repo PRs and read BENCHER_API_TOKEN, which
means PR workflow code has access to the token and can exfiltrate/misuse it by modifying workflow
steps. This increases the blast radius of a compromised contributor/branch (token misuse can tamper
with Bencher project data).
Code

.github/workflows/test.yml[R179-196]

+    - name: Install Bencher CLI
+      if: steps.test-scope.outputs.full_tests == 'true' && (github.event_name != 'pull_request' || github.event.pull_request.head.repo.full_name == github.repository)
+      uses: bencherdev/bencher@50fb1e138651a46d2fb704fab1adab38c181552e # v0.6.6
+    - name: Upload test durations to Bencher
+      if: steps.test-scope.outputs.full_tests == 'true' && (github.event_name != 'pull_request' || github.event.pull_request.head.repo.full_name == github.repository)
+      env:
+        BENCHER_API_TOKEN: ${{ secrets.BENCHER_API_TOKEN }}
+        EVENT_NAME: ${{ github.event_name }}
+        NODE_VERSION: ${{ inputs.node }}
+        PLATFORM: ${{ inputs.platform }}
+        PR_NUMBER: ${{ github.event.pull_request.number }}
+        REF_NAME: ${{ github.ref_name }}
+      shell: bash
+      run: |
+        if [ -z "${BENCHER_API_TOKEN:-}" ]; then
+          echo "::notice::BENCHER_API_TOKEN not set, skipping upload"
+          exit 0
+        fi
Evidence
The reusable test workflow explicitly runs Bencher upload on pull_request when the PR is from the
same repo and injects BENCHER_API_TOKEN into the environment; the top-level CI workflow forwards
that secret into the reusable workflow.

.github/workflows/test.yml[171-232]
.github/workflows/ci.yml[42-76]
.github/workflows/pacquet-ci.yml[146-203]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
Bencher upload steps run in PR context (same-repo PRs) and receive `BENCHER_API_TOKEN`. Any PR that can change workflow steps can print/exfiltrate or abuse that token.

### Issue Context
You want PR duration tracking, but you also want to minimize CI secret exposure.

### Fix Focus Areas
- .github/workflows/test.yml[179-232]
- .github/workflows/ci.yml[42-76]
- .github/workflows/pacquet-ci.yml[146-203]

### Suggested fix
Pick one (or combine):
1) **Move Bencher upload into a separate workflow** triggered by `workflow_run` (or similar) that downloads artifacts from the test job and uploads using secrets, so the upload workflow code comes from the default branch rather than the PR branch.
2) **Restrict uploads to non-PR events** (push to main / release branches) if PR tracking isn’t strictly required.
3) **Use a minimally-scoped Bencher token** (project-scoped, least privilege) and place it behind an environment requiring approvals for PR runs if available in your process.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Missing inputs silently skipped 🐞 Bug ☼ Reliability
Description
.github/scripts/merge-bencher-results.mjs silently ignores non-existent input files, so the
Bencher upload can succeed with only a subset of the requested benchmarks (e.g., tests.all without
tests.cli). This can hide regressions in the reporting pipeline and produce incomplete/misleading
Bencher data.
Code

.github/scripts/merge-bencher-results.mjs[R9-15]

+for (const file of files) {
+  if (!existsSync(file)) continue
+  const report = JSON.parse(await readFile(file, 'utf8'))
+  if (Array.isArray(report.results)) {
+    results.push(...report.results)
+  }
+}
Evidence
The merge script explicitly continues when an input file does not exist, and only fails when *no*
results are found at all; meanwhile the workflow always asks it to merge two specific result
artifacts before uploading.

.github/scripts/merge-bencher-results.mjs[6-20]
.github/workflows/test.yml[181-206]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`merge-bencher-results.mjs` currently treats missing input files as “skip and continue”. In the workflows, we pass multiple expected benchmark artifacts (e.g. `.bench/tests.all.json` and `.bench/tests.cli.json`). If one artifact is not produced (bug, rename, conditional change), the merge will still succeed and the upload step will push incomplete benchmark data.
### Issue Context
The workflow is explicitly expecting to merge two benchmark result files before uploading to Bencher, but the merge script will not surface if one is absent.
### Fix Focus Areas
- .github/scripts/merge-bencher-results.mjs[6-20]
- .github/workflows/test.yml[198-206]
### Suggested fix
1. Track missing files while iterating.
2. If any are missing, print an error that lists them and `process.exit(1)` (strict by default), so benchmark completeness failures are visible.
3. (Optional) If you want to preserve the current “best-effort” behavior for future callers, add a flag like `--allow-missing` (default false) to keep skipping behavior when explicitly requested, and update callers accordingly.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Informational
3. Windows shell execution pitfalls 🐞 Bug ☼ Reliability ⭐ New
Description
measure-command.mjs uses spawn(..., { shell: process.platform === 'win32' }), so on Windows it
runs commands through cmd.exe, which changes quoting/escaping semantics and can break or
mis-execute commands if arguments contain spaces/metacharacters. This is exercised in CI because the
workflow matrix includes windows-latest while invoking measure-command.mjs.
Code

.github/scripts/measure-command.mjs[R68-73]

+function runCommand ([command, ...args]) {
+  return new Promise((resolve) => {
+    const child = spawn(command, args, { shell: process.platform === 'win32', stdio: 'inherit' })
+    child.on('error', (err) => {
+      console.error(err)
+      resolve(1)
Evidence
The helper script enables shell execution specifically on Windows, and the CI workflows run this
script on Windows runners (both the main CI matrix and the Rust workflow include windows jobs).

.github/scripts/measure-command.mjs[68-82]
.github/workflows/ci.yml[54-71]
.github/workflows/pacquet-ci.yml[58-67]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
On Windows, `shell: true` routes execution through `cmd.exe`, which can alter argument parsing/quoting and can become brittle as commands evolve.

### Issue Context
Current call sites use simple args, but the script is a general utility and will be used on `windows-latest` runners.

### Fix Focus Areas
- .github/scripts/measure-command.mjs[68-82]
- .github/workflows/ci.yml[54-71]
- .github/workflows/pacquet-ci.yml[58-67]

### Suggested fix
Prefer one of:
- Keep `shell: false` everywhere, and on Windows resolve `command` to an actual executable (`.exe`/`.cmd`) before spawning (or explicitly invoke `cmd.exe /d /s /c` yourself with robust quoting).
- If you keep `shell: true`, add a clear contract/comment and basic escaping/validation (e.g., reject args containing `&|<>` when `shell` is enabled) to prevent accidental injection/breakage if future callers pass untrusted or complex args.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Qodo Logo

@qodo-free-for-open-source-projects

Copy link
Copy Markdown

PR Summary by Qodo

CI: Upload pnpm/pacquet test suite durations to Bencher
✨ Enhancement ⚙️ Configuration changes 🕐 20-40 Minutes

Grey Divider

Walkthroughs

Description
• Measure CI test suite durations and emit Bencher-compatible hyperfine JSON results.
• Upload pnpm/pacquet/pnpr test durations to Bencher when BENCHER_API_TOKEN is configured.
• Extract pnpm CLI e2e duration from --report-summary output without rerunning tests.
Diagram
graph TD
  CI[".github/workflows/ci.yml"] --> TEST[".github/workflows/test.yml"]
  TEST --> MEASURE[".github/scripts/measure-command.mjs"] --> ALL[(".bench/tests.*.json")] --> MERGE[".github/scripts/merge-bencher-results.mjs"] --> BENCHERCLI["Bencher CLI"] --> BENCHER{{"Bencher Cloud"}}
  TEST --> SUMMARY[("pnpm-exec-summary.json")] --> EXTRACT[".github/scripts/bencher-result-from-pnpm-summary.mjs"] --> CLI[(".bench/tests.cli.json")] --> MERGE
  PACQ[".github/workflows/pacquet-ci.yml"] --> MEASURE --> RUST[(".bench/*-tests-all.json")] --> BENCHERCLI

  subgraph Legend
    direction LR
    _wf[Workflow] ~~~ _scr["Script"] ~~~ _file[(Result file)] ~~~ _ext{{External}}
  end
Loading
High-Level Assessment

The following are alternative approaches to this PR:

1. Use hyperfine directly in CI
  • ➕ Avoids maintaining a custom timer script
  • ➕ Produces native hyperfine output compatible with Bencher adapters
  • ➖ Adds a new tool dependency and installation time on all runners
  • ➖ Harder to integrate with pnpm --report-summary-derived durations
2. Derive durations from GitHub Actions timings (API/job metadata)
  • ➕ Zero runtime overhead inside the job steps
  • ➕ Captures end-to-end job duration, including setup overhead
  • ➖ Less granular/portable (depends on GitHub APIs and job structure)
  • ➖ Harder to attribute durations to specific suites (tests.all vs tests.cli)

Recommendation: The PR’s approach (lightweight Node scripts producing shell_hyperfine-compatible JSON, plus guarded Bencher uploads) is a good fit: it keeps CI dependencies minimal, works cross-platform, and reuses pnpm --report-summary to avoid rerunning CLI e2e tests. If future needs expand to richer statistics (multiple samples, variance), consider switching to hyperfine for the measured steps.

Grey Divider

File Changes

Other (8)
bencher-result-from-pnpm-summary.mjs Convert pnpm --report-summary duration into Bencher result JSON +76/-0

Convert pnpm --report-summary duration into Bencher result JSON

• Adds a Node script that reads pnpm's execution summary JSON, validates the target package entry, and emits a hyperfine-style Bencher result with duration seconds. This enables extracting the pnpm CLI e2e suite duration without rerunning tests.

.github/scripts/bencher-result-from-pnpm-summary.mjs


measure-command.mjs Measure wall-clock duration of an arbitrary command for Bencher +84/-0

Measure wall-clock duration of an arbitrary command for Bencher

• Introduces a Node script that spawns a command, measures elapsed time with performance.now(), and writes a single-sample hyperfine-compatible JSON report. Propagates the underlying command exit code for correct CI behavior.

.github/scripts/measure-command.mjs


merge-bencher-results.mjs Merge multiple Bencher JSON reports into one file +53/-0

Merge multiple Bencher JSON reports into one file

• Adds a small utility to concatenate results arrays across multiple Bencher JSON files, skipping missing inputs. Fails fast when nothing is found to prevent silent uploads of empty data.

.github/scripts/merge-bencher-results.mjs


ci.yml Forward BENCHER_API_TOKEN to test workflow and harden artifact tar step +8/-1

Forward BENCHER_API_TOKEN to test workflow and harden artifact tar step

• Updates the compiled artifact packaging step to use bash arrays for safer path handling. Forwards BENCHER_API_TOKEN into the reusable test workflow (including smoke) so uploads can be enabled when configured.

.github/workflows/ci.yml


pacquet-ci.yml Measure pacquet/pnpr Rust test durations and upload to Bencher +76/-2

Measure pacquet/pnpr Rust test durations and upload to Bencher

• Splits Rust tests into separate pacquet and pnpr invocations via just targets and wraps each with the duration measurement script. Installs Bencher CLI and uploads results to the pnpm-ci-performance project when running in a trusted context and token is present.

.github/workflows/pacquet-ci.yml


test.yml Measure pnpm test suite durations and upload merged results to Bencher +88/-7

Measure pnpm test suite durations and upload merged results to Bencher

• Wraps the test run step with duration measurement and assigns benchmark names based on full vs affected test scope. For full test runs, extracts the pnpm CLI e2e duration from pnpm's report summary, merges results, and conditionally uploads to Bencher using BENCHER_API_TOKEN.

.github/workflows/test.yml


justfile Add just targets to run pacquet-only and pnpr-only tests +8/-0

Add just targets to run pacquet-only and pnpr-only tests

• Introduces dedicated just recipes for running pacquet and pnpr test subsets, enabling separate duration tracking and reporting in CI.

justfile


package.json Run full CI tests with pnpm --report-summary enabled +2/-1

Run full CI tests with pnpm --report-summary enabled

• Adjusts ci:test-all to use a new test-pkgs-all-report-summary script that runs pnpm recursive tests with --report-summary. This produces the execution summary file needed to extract CLI e2e duration.

package.json


Grey Divider

Qodo Logo

@qodo-free-for-open-source-projects

qodo-free-for-open-source-projects Bot commented Jun 14, 2026

Copy link
Copy Markdown

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (0)

Grey Divider


Remediation recommended

1. Missing inputs silently skipped 🐞 Bug ☼ Reliability
Description
.github/scripts/merge-bencher-results.mjs silently ignores non-existent input files, so the
Bencher upload can succeed with only a subset of the requested benchmarks (e.g., tests.all without
tests.cli). This can hide regressions in the reporting pipeline and produce incomplete/misleading
Bencher data.
Code

.github/scripts/merge-bencher-results.mjs[R9-15]

+for (const file of files) {
+  if (!existsSync(file)) continue
+  const report = JSON.parse(await readFile(file, 'utf8'))
+  if (Array.isArray(report.results)) {
+    results.push(...report.results)
+  }
+}
Evidence
The merge script explicitly continues when an input file does not exist, and only fails when *no*
results are found at all; meanwhile the workflow always asks it to merge two specific result
artifacts before uploading.

.github/scripts/merge-bencher-results.mjs[6-20]
.github/workflows/test.yml[181-206]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`merge-bencher-results.mjs` currently treats missing input files as “skip and continue”. In the workflows, we pass multiple expected benchmark artifacts (e.g. `.bench/tests.all.json` and `.bench/tests.cli.json`). If one artifact is not produced (bug, rename, conditional change), the merge will still succeed and the upload step will push incomplete benchmark data.

### Issue Context
The workflow is explicitly expecting to merge two benchmark result files before uploading to Bencher, but the merge script will not surface if one is absent.

### Fix Focus Areas
- .github/scripts/merge-bencher-results.mjs[6-20]
- .github/workflows/test.yml[198-206]

### Suggested fix
1. Track missing files while iterating.
2. If any are missing, print an error that lists them and `process.exit(1)` (strict by default), so benchmark completeness failures are visible.
3. (Optional) If you want to preserve the current “best-effort” behavior for future callers, add a flag like `--allow-missing` (default false) to keep skipping behavior when explicitly requested, and update callers accordingly.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

Comment thread .github/scripts/merge-bencher-results.mjs
Comment thread .github/workflows/test.yml Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/scripts/measure-command.mjs:
- Around line 9-10: The scripts have a path traversal vulnerability where the
unsanitized `output` argument can escape the `.bench/` directory (e.g., via
`--output ../../file` or absolute paths). At each of the three locations, you
must: (1) resolve the `output` argument to an absolute path, (2) validate that
the resolved path is contained within the workspace's `.bench/` directory by
checking that the resolved path starts with the .bench directory path, (3) throw
an error if the path escapes .bench, and (4) use the validated absolute path
when calling mkdir in `.github/scripts/measure-command.mjs` lines 9-10,
`.github/scripts/bencher-result-from-pnpm-summary.mjs` lines 26-27, and
`.github/scripts/merge-bencher-results.mjs` lines 22-23. Apply the identical
validation and boundary-check pattern at all three sites before creating or
writing files.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: bc87be62-25a2-443e-bbc0-4e0da2e31441

📥 Commits

Reviewing files that changed from the base of the PR and between 9874556 and 8bc9956.

📒 Files selected for processing (8)
  • .github/scripts/bencher-result-from-pnpm-summary.mjs
  • .github/scripts/measure-command.mjs
  • .github/scripts/merge-bencher-results.mjs
  • .github/workflows/ci.yml
  • .github/workflows/pacquet-ci.yml
  • .github/workflows/test.yml
  • justfile
  • package.json

Comment thread .github/scripts/measure-command.mjs Outdated
@qodo-free-for-open-source-projects

qodo-free-for-open-source-projects Bot commented Jun 14, 2026

Copy link
Copy Markdown

Code Review by Qodo

Grey Divider

New Review Started

This review has been superseded by a new analysis

Grey Divider

Qodo Logo

@qodo-free-for-open-source-projects

qodo-free-for-open-source-projects Bot commented Jun 14, 2026

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit 651b5d2

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
.github/workflows/test.yml (1)

184-198: ⚡ Quick win

Consider validating platform and node version inputs against supported testbed patterns.

The testbed name pnpm.${platform_slug}.node${node_major} is constructed from workflow inputs without validation. The downstream Bencher upload workflow only accepts pnpm.(ubuntu|windows).node(22|24|26) patterns. If the calling workflow passes macos-latest or an unsupported Node version, the artifact will upload successfully here but fail during the Bencher upload step.

Consider adding early validation to fail fast with a clear error message:

🛡️ Suggested validation check
     - name: Stage Bencher test durations
       if: steps.test-scope.outputs.full_tests == 'true'
       env:
         NODE_VERSION: ${{ inputs.node }}
         PLATFORM: ${{ inputs.platform }}
       shell: bash
       run: |
         platform_slug=${PLATFORM%-latest}
         node_major=${NODE_VERSION%%.*}
+        
+        # Validate against supported testbed patterns
+        if [[ ! "$platform_slug" =~ ^(ubuntu|windows)$ ]]; then
+          echo "::error::Unsupported platform '$platform_slug' for pnpm benchmarks (expected ubuntu or windows)"
+          exit 1
+        fi
+        if [[ ! "$node_major" =~ ^(22|24|26)$ ]]; then
+          echo "::error::Unsupported Node major version '$node_major' for pnpm benchmarks (expected 22, 24, or 26)"
+          exit 1
+        fi
+        
         artifact_dir=.bench-artifact/pnpm
         mkdir -p "$artifact_dir"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test.yml around lines 184 - 198, The testbed name
construction using platform_slug and node_major lacks validation against the
supported patterns required by the downstream Bencher upload workflow. Add
validation checks after extracting platform_slug from PLATFORM and node_major
from NODE_VERSION to ensure platform_slug matches either "ubuntu" or "windows"
and node_major matches one of "22", "24", or "26". If either validation fails,
exit with a clear error message that specifies which value is unsupported and
what the allowed values are, so failures happen immediately rather than during
the later Bencher upload step.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/ci-performance-bencher-upload.yml:
- Around line 167-179: The Rust case in the workflow currently allows the job to
succeed even when the metadata is malformed or incomplete (missing reports
field, empty reports, or wrong number of reports). Before iterating through the
reports in the `rust)` case statement, add validation to fail closed by checking
that the reports field exists and contains exactly two entries with the exact
filenames pacquet-tests-all.json and pnpr-tests-all.json. Extract the reports
array from the metadata, verify its length is exactly 2, verify each report's
filename matches one of the expected names, and exit with status 1 if any
validation fails. This ensures the job fails rather than silently succeeding
with incomplete data when the producer contract is violated.

In @.github/workflows/ci.yml:
- Around line 32-34: The current implementation uses newline-delimited path
handling with find and mapfile -t, which creates an argument injection
vulnerability if any filename contains a newline character. This allows crafted
paths to be split into multiple argv tokens, with a fragment starting with --
being misinterpreted by tar as an option. Replace the newline-delimited approach
on lines 32 and 33 with null-delimited path handling by using find with the
-print0 flag and mapfile with the -d '' (null delimiter) option. This ensures
that paths containing newlines are treated as single arguments and prevents
argument injection attacks in the tar command on line 34.

---

Nitpick comments:
In @.github/workflows/test.yml:
- Around line 184-198: The testbed name construction using platform_slug and
node_major lacks validation against the supported patterns required by the
downstream Bencher upload workflow. Add validation checks after extracting
platform_slug from PLATFORM and node_major from NODE_VERSION to ensure
platform_slug matches either "ubuntu" or "windows" and node_major matches one of
"22", "24", or "26". If either validation fails, exit with a clear error message
that specifies which value is unsupported and what the allowed values are, so
failures happen immediately rather than during the later Bencher upload step.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 0f9e8f72-8808-44c0-967f-a6350d1d8ee9

📥 Commits

Reviewing files that changed from the base of the PR and between 8bc9956 and 19c8910.

📒 Files selected for processing (9)
  • .github/scripts/bencher-result-from-pnpm-summary.mjs
  • .github/scripts/measure-command.mjs
  • .github/scripts/merge-bencher-results.mjs
  • .github/workflows/ci-performance-bencher-upload.yml
  • .github/workflows/ci.yml
  • .github/workflows/pacquet-ci.yml
  • .github/workflows/test.yml
  • justfile
  • package.json
🚧 Files skipped from review as they are similar to previous changes (4)
  • justfile
  • .github/scripts/merge-bencher-results.mjs
  • .github/scripts/measure-command.mjs
  • .github/scripts/bencher-result-from-pnpm-summary.mjs

Comment thread .github/workflows/ci-performance-bencher-upload.yml Outdated
Comment thread .github/workflows/ci.yml Outdated
Comment thread .github/workflows/ci-performance-bencher-upload.yml
Comment thread .github/workflows/ci-performance-bencher-upload.yml Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/scripts/bench-output-path.mjs:
- Line 7: The boundary check in the condition on line 7 only validates lexical
path traversal and does not account for symlinks, which can be used to escape
the intended directory. After the existing lexical validations for empty paths,
parent directory references, and absolute paths, you must resolve the actual
filesystem path (following any symlinks) and verify that the resolved path is
within the intended .bench directory. Use a function like realpath() to resolve
symlinks and then check that the resolved path starts with the intended
directory's resolved path to ensure it cannot escape outside the boundary.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 3794480f-6870-45b3-b1d8-a1e7696799bf

📥 Commits

Reviewing files that changed from the base of the PR and between 19c8910 and 651b5d2.

📒 Files selected for processing (10)
  • .github/scripts/bench-output-path.mjs
  • .github/scripts/bencher-result-from-pnpm-summary.mjs
  • .github/scripts/measure-command.mjs
  • .github/scripts/merge-bencher-results.mjs
  • .github/workflows/ci-performance-bencher-upload.yml
  • .github/workflows/ci.yml
  • .github/workflows/pacquet-ci.yml
  • .github/workflows/test.yml
  • justfile
  • package.json
🚧 Files skipped from review as they are similar to previous changes (8)
  • justfile
  • .github/scripts/merge-bencher-results.mjs
  • .github/workflows/test.yml
  • .github/scripts/bencher-result-from-pnpm-summary.mjs
  • package.json
  • .github/workflows/pacquet-ci.yml
  • .github/scripts/measure-command.mjs
  • .github/workflows/ci-performance-bencher-upload.yml

Comment thread .github/scripts/bench-output-path.mjs Outdated
@qodo-free-for-open-source-projects

qodo-free-for-open-source-projects Bot commented Jun 14, 2026

Copy link
Copy Markdown

Code Review by Qodo

Grey Divider

New Review Started

This review has been superseded by a new analysis

Grey Divider

Qodo Logo

@qodo-free-for-open-source-projects

qodo-free-for-open-source-projects Bot commented Jun 14, 2026

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit e138d22

@zkochan zkochan merged commit 9ddc86b into main Jun 14, 2026
24 of 25 checks passed
@zkochan zkochan deleted the bench-tests branch June 14, 2026 16:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant