Skip to content

Conversation

@AlexWaygood
Copy link
Member

Summary

An early version of #19811 dramatically increased the time it took ty to type-check the static-frame codebase: in mypy_primer runs, the execution time went from around 1 minute to around 10 minutes. However, on that early version of #19811, most benchmarks showed big improvements, with only a few benchmarks showing regressions of 1-2%.

The latest version of #19811 no longer exhibits a large increase in execution time when type-checking static-frame, but I find it somewhat concerning that none of our benchmarks flagged the early version; I only accidentally noticed the regression due to the fact that mypy_primer runs were taking much longer to complete! This PR therefore adds static-frame as a walltime benchmark, since it obviously contains Python code patterns that don't currently have good coverage in our ty benchmark suite. The benchmark setup is basically copied from the project's mypy_primer setup.

Test Plan

cargo bench -p ruff_benchmark --bench=ty_walltime

@AlexWaygood AlexWaygood added ci Related to internal CI tooling ty Multi-file analysis & type inference labels Aug 10, 2025
@github-actions
Copy link
Contributor

github-actions bot commented Aug 10, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

@AlexWaygood AlexWaygood marked this pull request as draft August 10, 2025 13:26
@AlexWaygood AlexWaygood force-pushed the alex/static-frame-bench branch from 15ac01a to e8f27f1 Compare August 10, 2025 14:05
@AlexWaygood
Copy link
Member Author

AlexWaygood commented Aug 10, 2025

Hmm, I'm not sure why (as it's not the case for me locally), but it looks like the arraykit dependency of static-frame is being built from source in CI when preparing the environment for static-frame. numpy needs to be present in the build environment for building from source to work (which adds a fair amount of complexity to the benchmark setup). Building arraykit from source also means that setting up the environment for static-frame appears to take a fair bit longer than any other walltime benchmark (23.18 seconds). So this might not be worth it.

Having said all that, the walltime-benchmarks CI job is still completing in roughly the same amount of time as the instrumented-benchmarks CI job.

@AlexWaygood AlexWaygood marked this pull request as ready for review August 10, 2025 14:35
@MichaReiser
Copy link
Member

Looking at the install and wall time, it suggests that the wall time benchmark now takes 1 minute longer to run, reaching 10 minutes (for both wall and instrumented benchmarks). This is definitely reaching the point where it feels very long if you need the benchmark numbers. Unfortunately, codspeed also doesn't support splitting the benchmarks.

Having said all that, the walltime-benchmarks CI job is still completing in roughly the same amount of time as the instrumented-benchmarks CI job.

Hehe, and the next instrumentation benchmark addition will not be much longer than the wall time run ;)

I'll reach out to codspeed to see if there's anything we can do to improve our benchmark runtime. Are there any benchmark that we could remove in favor of static-frame?

@MichaReiser
Copy link
Member

Could we install some of the dependencies as part of the setup-uv build step?

@AlexWaygood
Copy link
Member Author

It actually looks like the big slowdown on the "bad version" of #19811 is reproducible even without arraykit installed, so the simplest solution here is probably to avoid that dependency (though I'm still not sure why the C extension is being built from source in the context of the Codspeed run, since it just downloads a built wheel from PyPI when I install arraykit locally, and mypy_primer also manages to use a built wheel).

I'll push that change and see how long the walltime benchmark takes then.

@AlexWaygood AlexWaygood force-pushed the alex/static-frame-bench branch from 27ea90a to 205e69a Compare August 11, 2025 09:45
@AlexWaygood
Copy link
Member Author

AlexWaygood commented Aug 11, 2025

Okay, on the latest version of this PR, the walltime job is completing in 10m7s overall, which appears ~indistinguishable to the time it takes to complete on main

@AlexWaygood AlexWaygood merged commit f3f4db7 into main Aug 11, 2025
35 checks passed
@AlexWaygood AlexWaygood deleted the alex/static-frame-bench branch August 11, 2025 14:38
dcreager added a commit that referenced this pull request Aug 11, 2025
* main:
  [ty] Add `static-frame` as a walltime benchmark (#19844)
  [ty] Update goto range for attribute access to only target the attribute (#19848)
dcreager added a commit that referenced this pull request Aug 11, 2025
* dcreager/bound-typevar: (41 commits)
  [ty] Use separate Rust types for bound and unbound type variables (#19796)
  fix ide tests
  better unbound typevar rendering
  Apply suggestions from code review
  [ty] Add `static-frame` as a walltime benchmark (#19844)
  add explanatory comment
  [ty] Update goto range for attribute access to only target the attribute (#19848)
  remove unneeded ord
  add TODO for broken hover test
  better PEP 695 binding context
  Add AIR301 rule (#17707)
  Avoid underflow in default ranges before a BOM (#19839)
  Update actions/download-artifact digest to de96f46 (#19852)
  Update docker/login-action action to v3.5.0 (#19860)
  Update rui314/setup-mold digest to 7344740 (#19853)
  Update cargo-bins/cargo-binstall action to v1.14.4 (#19855)
  Update actions/cache action to v4.2.4 (#19854)
  Update Rust crate hashbrown to v0.15.5 (#19858)
  Update Rust crate camino to v1.1.11 (#19857)
  Update Rust crate proc-macro2 to v1.0.96 (#19859)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci Related to internal CI tooling ty Multi-file analysis & type inference

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants