Skip to content
This repository was archived by the owner on May 14, 2026. It is now read-only.

ci(integrated-benchmark): extract compilation step#390

Merged
KSXGitHub merged 4 commits into
mainfrom
claude/optimize-benchmark-compilation-kMsZg
May 5, 2026
Merged

ci(integrated-benchmark): extract compilation step#390
KSXGitHub merged 4 commits into
mainfrom
claude/optimize-benchmark-compilation-kMsZg

Conversation

@KSXGitHub

@KSXGitHub KSXGitHub commented May 5, 2026

Copy link
Copy Markdown
Contributor

Motivation

The "Benchmark: Frozen Lockfile" CI step has been responsible for compiling both revisions (HEAD and main) for both timed benchmark steps. With the Rust build cache hit it takes ~5 min — half its 10-min timeout. On a cache miss it climbs toward 10 min and edges close to the hard timeout.

Change

CI

Add a new "Precompile benchmark revisions" step that runs before both
timed benchmark steps, with its own 15-min timeout:

- name: Precompile benchmark revisions
  shell: bash
  timeout-minutes: 15
  run: just integrated-benchmark --build-only HEAD main

The two timed steps now fall through to a near-no-op incremental build, so
their timeouts only need to cover hyperfine itself.

integrated-benchmark binary

Add a --build-only flag that runs WorkEnv::build() (clone → fetch →
checkout → cargo build --release --bin=pacquet for each revision) and
exits — no init(), no proxy-cache priming, no hyperfine.

In this mode the binary also skips spawning verdaccio / connecting to a
running registry — the precompile step doesn't have one up yet, and
cargo build doesn't need a registry. The ensure_program checks for
bash / hyperfine / pnpm stay unconditional: CI installs them all
before the precompile step runs, so the checks are sub-second no-ops in
practice and keep fail-fast behavior if the environment is misconfigured.

WorkEnv::build is pub and called directly from main.rs.

CLI ergonomics

--scenario and --build-only are mutually exclusive: exactly one must
be given. Expressed purely through clap field-level attributes — no
runtime validation:

#[clap(long, short, required_unless_present = "build_only", conflicts_with = "build_only")]
pub scenario: Option<BenchmarkScenario>,
Invocation Result
neither flag clap error: --scenario required
both flags clap error: --scenario cannot be used with --build-only
--scenario=… only OK (full benchmark run)
--build-only only OK (precompile only)

WorkEnv.scenario becomes Option<BenchmarkScenario>; init() and
benchmark() unwrap it (only reached on the non---build-only path,
where clap guarantees --scenario is set).

Files touched

  • tasks/integrated-benchmark/src/cli_args.rs — new --build-only flag; --scenario becomes Option<…> with required_unless_present + conflicts_with.
  • tasks/integrated-benchmark/src/main.rs — gate verdaccio / virtual-registry setup on !build_only; dispatch to env.build() or env.run().
  • tasks/integrated-benchmark/src/work_env.rsbuild() made public; scenario field becomes optional; init() and benchmark() unwrap it at the boundary.
  • .github/workflows/integrated-benchmark.yml — new Precompile benchmark revisions step before the two timed steps.

Out of scope / not done

  • The total CI wall-clock for the job is roughly unchanged on a warm cache; this PR isolates the compile cost into its own step rather than shortening it.
  • Timeouts on the two timed steps stay at 10 min — they could be tightened later now that they no longer include compile time, but that's a follow-up.

The "Benchmark: Frozen Lockfile" step has been responsible for compiling
both revisions for both timed steps, which on a cold Rust build cache
edges close to its 10-minute timeout. Add a `--build-only` flag to the
integrated-benchmark binary that runs `WorkEnv::build()` (clone, fetch,
checkout, `cargo build --release --bin=pacquet`) without `init()`,
proxy-cache priming, or hyperfine, and call it from a new
"Precompile benchmark revisions" CI step before the two timed steps.

Both timed steps now hit a near-no-op incremental build, and the cold-
cache compile cost is isolated in its own step with its own timeout.
@KSXGitHub KSXGitHub requested a review from zkochan May 5, 2026 05:16
@coderabbitai

coderabbitai Bot commented May 5, 2026

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

A "build-only" mode is added to the integrated benchmark CLI tool, allowing pre-compilation of benchmark revisions without running scenarios. The CLI now accepts --build-only with conflicting scenario argument logic, the WorkEnv exposes a public build() method, and control flow in main.rs routes to build-only or run mode. A CI workflow step is added to precompile revisions before running benchmark scenarios.

Changes

Benchmark Build-Only Mode

Layer / File(s) Summary
CLI Arguments
tasks/integrated-benchmark/src/cli_args.rs
--build-only flag added; scenario changes from required BenchmarkScenario to Option<BenchmarkScenario> with required_unless_present = "build_only" and conflicts_with = "build_only".
Data Shape
tasks/integrated-benchmark/src/work_env.rs
WorkEnv.scenario field changes from BenchmarkScenario to Option<BenchmarkScenario>; both init() and benchmark() now unwrap scenario with expect() before use.
Method Exposure & Core Logic
tasks/integrated-benchmark/src/main.rs, tasks/integrated-benchmark/src/work_env.rs
WorkEnv::build() visibility changed from private to public; main.rs skips Verdaccio when build_only is true and branches to call env.build() or env.run() accordingly.
Workflow Integration
.github/workflows/integrated-benchmark.yml
New step "Precompile benchmark revisions" added to run just integrated-benchmark --build-only HEAD main after benchmark executor build, with 15-minute timeout.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested reviewers

  • zkochan

Poem

A rabbit builds before the race,
Precompiling benchmarks in place.
With --build-only flags so keen,
No running scenarios in between! 🐰⚡

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately reflects the main objective of the PR: extracting the compilation step into a separate CI workflow step with a new --build-only flag, which is the primary change across all modified files.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/optimize-benchmark-compilation-kMsZg

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov

codecov Bot commented May 5, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.03%. Comparing base (ac5a7fb) to head (25beffc).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #390   +/-   ##
=======================================
  Coverage   81.03%   81.03%           
=======================================
  Files          64       64           
  Lines        3517     3517           
=======================================
  Hits         2850     2850           
  Misses        667      667           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions

github-actions Bot commented May 5, 2026

Copy link
Copy Markdown

Micro-Benchmark Results

Linux

group                          main                                   pr
-----                          ----                                   --
tarball/download_dependency    1.00     16.3±0.72ms   266.5 KB/sec    1.01     16.4±0.27ms   264.8 KB/sec

…d-only

Use clap's `required_unless_present` + `conflicts_with` to express
"exactly one of `--scenario` or `--build-only` must be given" purely
through field-level attributes. This drops the redundant
`--scenario=frozen-lockfile` from the precompile CI step and keeps clap
as the sole authority on the flag's required-ness — no runtime checks.

`WorkEnv.scenario` becomes `Option<BenchmarkScenario>`; `init()` and
`benchmark()` unwrap it (they're only reached on the non-`--build-only`
path, where clap guarantees `--scenario` is set).
@github-actions

github-actions Bot commented May 5, 2026

Copy link
Copy Markdown

Integrated-Benchmark Report (Linux)

Scenario: Frozen Lockfile

Command Mean [s] Min [s] Max [s] Relative
pacquet@HEAD 2.378 ± 0.079 2.268 2.486 1.01 ± 0.04
pacquet@main 2.361 ± 0.052 2.313 2.443 1.00
pnpm 6.097 ± 0.121 5.943 6.288 2.58 ± 0.08
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 2.3784095015599993,
      "stddev": 0.07922768489384564,
      "median": 2.39201534466,
      "user": 2.48489064,
      "system": 3.2769306,
      "min": 2.26758791866,
      "max": 2.48647895466,
      "times": [
        2.36649080266,
        2.42723610266,
        2.35455439266,
        2.42781061566,
        2.28610906666,
        2.46713626466,
        2.48647895466,
        2.41753988666,
        2.26758791866,
        2.28315101066
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 2.3612352952599998,
      "stddev": 0.051794648534639054,
      "median": 2.33063181216,
      "user": 2.47458294,
      "system": 3.2692129999999997,
      "min": 2.31275448866,
      "max": 2.44337903866,
      "times": [
        2.32474612766,
        2.36554323266,
        2.32995694166,
        2.42117714266,
        2.32640457266,
        2.44337903866,
        2.33130668266,
        2.31275448866,
        2.43518527466,
        2.32189945066
      ]
    },
    {
      "command": "pnpm",
      "mean": 6.09743259646,
      "stddev": 0.12108845002341861,
      "median": 6.05779383316,
      "user": 8.82574624,
      "system": 4.360614099999999,
      "min": 5.94341595866,
      "max": 6.28845834166,
      "times": [
        6.28498093966,
        6.05904600566,
        6.28845834166,
        6.15813594066,
        6.02157339166,
        5.94341595866,
        6.16230849866,
        5.98032542666,
        6.05654166066,
        6.01953980066
      ]
    }
  ]
}

Scenario: Frozen Lockfile (Hot Cache)

Command Mean [ms] Min [ms] Max [ms] Relative
pacquet@HEAD 738.3 ± 32.2 711.6 820.6 1.00
pacquet@main 775.1 ± 71.2 720.8 937.0 1.05 ± 0.11
pnpm 2560.4 ± 93.5 2395.9 2687.1 3.47 ± 0.20
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 0.73834407698,
      "stddev": 0.03224447243529431,
      "median": 0.73289576658,
      "user": 0.25313732,
      "system": 1.3883405,
      "min": 0.71163747058,
      "max": 0.82057827458,
      "times": [
        0.82057827458,
        0.74180064658,
        0.74840542158,
        0.72741987558,
        0.71868768158,
        0.74805743358,
        0.73837165758,
        0.71243761358,
        0.71163747058,
        0.71604469458
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 0.7750859015800001,
      "stddev": 0.07118281722743486,
      "median": 0.74089749708,
      "user": 0.26308222,
      "system": 1.4112679,
      "min": 0.7207696985800001,
      "max": 0.93698042258,
      "times": [
        0.82280857958,
        0.7207696985800001,
        0.7322223235800001,
        0.84703883158,
        0.73709444358,
        0.7447005505800001,
        0.73086192258,
        0.93698042258,
        0.75040806658,
        0.72797417658
      ]
    },
    {
      "command": "pnpm",
      "mean": 2.56035258838,
      "stddev": 0.09350488261088342,
      "median": 2.52866510608,
      "user": 2.9988653199999997,
      "system": 2.2234581999999996,
      "min": 2.39593579158,
      "max": 2.68714509358,
      "times": [
        2.6270820715800003,
        2.51221509158,
        2.39593579158,
        2.66236993258,
        2.50895860158,
        2.5364396085800003,
        2.52089060358,
        2.65725730558,
        2.49523178358,
        2.68714509358
      ]
    }
  ]
}

Comment thread tasks/integrated-benchmark/src/cli_args.rs Outdated
Comment thread tasks/integrated-benchmark/src/main.rs Outdated
Comment thread tasks/integrated-benchmark/src/work_env.rs Outdated
Comment thread tasks/integrated-benchmark/src/main.rs Outdated
Comment thread tasks/integrated-benchmark/src/work_env.rs Outdated
Comment thread .github/workflows/integrated-benchmark.yml Outdated
Comment thread .github/workflows/integrated-benchmark.yml Outdated
Comment thread tasks/integrated-benchmark/src/cli_args.rs Outdated
claude added 2 commits May 5, 2026 05:46
…y wrapper

Address review on #390:
  - Make `WorkEnv::build` public and call it from `main.rs` directly,
    instead of routing through a `build_only()` wrapper that did nothing
    but rename the call.
  - Strip the redundant doc/code comments that just restated what the
    surrounding code or clap's `--help` output already says.
  - Shorten the `--build-only` help text to one line, matching the
    other flag descriptions.
Address review on #390. The earlier change weakened the program-presence
guard for `--build-only` mode on the assumption it's an unrelated entry
point. It isn't: CI runs `--build-only` immediately before the benchmark
steps in the same job, after every required tool has already been
installed, so the check is always a no-op cost-wise. Gating it only
delays a missing-tool failure from the start of the precompile step to
15 min later in the benchmark step, which is strictly worse.

The verdaccio gating stays — that one is required, since the precompile
step doesn't have a registry running yet.
@KSXGitHub KSXGitHub marked this pull request as ready for review May 5, 2026 06:08
Copilot AI review requested due to automatic review settings May 5, 2026 06:08

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the integrated-benchmark CI workflow to separate “compilation of benchmark revisions” from the timed benchmark runs, reducing the risk that the timed steps hit their 10-minute timeout due to cold Rust caches.

Changes:

  • Add a --build-only mode to the integrated-benchmark binary to clone/fetch/checkout and cargo build --release --bin=pacquet for each revision, then exit.
  • Make --scenario optional at the type level and enforce “exactly one of --scenario or --build-only” via clap attributes (no runtime validation).
  • Update the GitHub Actions workflow to precompile HEAD and main in a dedicated step with its own timeout before running the timed hyperfine scenarios.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
tasks/integrated-benchmark/src/cli_args.rs Adds --build-only and enforces mutual exclusivity / requiredness with --scenario using clap attributes.
tasks/integrated-benchmark/src/main.rs Skips registry/verdaccio setup in build-only mode and dispatches to env.build() vs env.run().
tasks/integrated-benchmark/src/work_env.rs Makes build() public and updates scenario handling to Option, unwrapping only when entering init/benchmark paths.
.github/workflows/integrated-benchmark.yml Adds a precompile step (just integrated-benchmark --build-only HEAD main) ahead of the two timed benchmark steps.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@KSXGitHub KSXGitHub enabled auto-merge (squash) May 5, 2026 06:11
@KSXGitHub KSXGitHub merged commit 1630ce7 into main May 5, 2026
19 checks passed
@KSXGitHub KSXGitHub deleted the claude/optimize-benchmark-compilation-kMsZg branch May 5, 2026 10:38
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants