Closed
Labels: performance (Potential performance improvement)
Description
I've implemented a series of optimizations for the apache-airflow resolver case, focused on the resolver thread. Each step targets something that showed up as slow during profiling. The benchmarks are path dependent: the speedup from commit B looks different depending on whether commit A was applied before it, because you need to speed up the slowest item first to really see the effect of optimizing the next smaller one. I'm therefore sharing the benchmark numbers here and keeping the PRs focused on explaining the code changes.
Commits:
- Remove `[u64; 4]` from small version to move `Arc` to full version
- Speed up file pins
- Simplify `requirements_for_extra` (Note: Refactoring)
- Optimize `requirements_for_extra`
- Split determining and executing batch prefetch (Note: Refactoring)
- Split out BatchPrefetcherRunner (Note: Refactoring)
- Deactivate tracing for choose version
- Avoid overcounting versions in batch prefetcher
- Add Send bounds to cache deserialization (Note: Refactoring)
The commits are split into refactorings and speedups.
Benchmark 1: ./uv-main pip compile scripts/requirements/airflow.in
Time (mean ± σ): 444.1 ms ± 10.5 ms [User: 626.4 ms, System: 190.7 ms]
Range (min … max): 434.5 ms … 468.0 ms 10 runs
Benchmark 2: ./uv-1 pip compile scripts/requirements/airflow.in
Time (mean ± σ): 396.1 ms ± 7.0 ms [User: 554.1 ms, System: 171.3 ms]
Range (min … max): 386.8 ms … 408.7 ms 10 runs
Benchmark 3: ./uv-2 pip compile scripts/requirements/airflow.in
Time (mean ± σ): 344.2 ms ± 3.7 ms [User: 481.9 ms, System: 167.7 ms]
Range (min … max): 339.7 ms … 349.8 ms 10 runs
Benchmark 4: ./uv-3 pip compile scripts/requirements/airflow.in
Time (mean ± σ): 344.4 ms ± 4.0 ms [User: 488.3 ms, System: 154.4 ms]
Range (min … max): 340.0 ms … 354.0 ms 10 runs
Benchmark 5: ./uv-4 pip compile scripts/requirements/airflow.in
Time (mean ± σ): 325.2 ms ± 3.0 ms [User: 467.4 ms, System: 159.7 ms]
Range (min … max): 321.5 ms … 329.4 ms 10 runs
Benchmark 6: ./uv-5 pip compile scripts/requirements/airflow.in
Time (mean ± σ): 329.4 ms ± 6.1 ms [User: 474.2 ms, System: 157.8 ms]
Range (min … max): 324.0 ms … 344.2 ms 10 runs
Benchmark 7: ./uv-6 pip compile scripts/requirements/airflow.in
Time (mean ± σ): 328.5 ms ± 3.7 ms [User: 466.4 ms, System: 162.5 ms]
Range (min … max): 324.4 ms … 335.0 ms 10 runs
Benchmark 8: ./uv-7 pip compile scripts/requirements/airflow.in
Time (mean ± σ): 321.1 ms ± 3.2 ms [User: 453.4 ms, System: 163.0 ms]
Range (min … max): 316.8 ms … 327.5 ms 10 runs
Benchmark 9: ./uv-8 pip compile scripts/requirements/airflow.in
Time (mean ± σ): 213.2 ms ± 3.2 ms [User: 300.7 ms, System: 115.9 ms]
Range (min … max): 207.5 ms … 220.2 ms 14 runs
Benchmark 10: ./uv-9 pip compile scripts/requirements/airflow.in
Time (mean ± σ): 210.9 ms ± 1.7 ms [User: 295.8 ms, System: 118.4 ms]
Range (min … max): 207.7 ms … 214.6 ms 14 runs
Summary
./uv-9 pip compile scripts/requirements/airflow.in ran
1.01 ± 0.02 times faster than ./uv-8 pip compile scripts/requirements/airflow.in
1.52 ± 0.02 times faster than ./uv-7 pip compile scripts/requirements/airflow.in
1.54 ± 0.02 times faster than ./uv-4 pip compile scripts/requirements/airflow.in
1.56 ± 0.02 times faster than ./uv-6 pip compile scripts/requirements/airflow.in
1.56 ± 0.03 times faster than ./uv-5 pip compile scripts/requirements/airflow.in
1.63 ± 0.02 times faster than ./uv-2 pip compile scripts/requirements/airflow.in
1.63 ± 0.02 times faster than ./uv-3 pip compile scripts/requirements/airflow.in
1.88 ± 0.04 times faster than ./uv-1 pip compile scripts/requirements/airflow.in
2.11 ± 0.05 times faster than ./uv-main pip compile scripts/requirements/airflow.in
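Because the numbers are path dependent, it can help to look at each binary relative to the one built just before it, rather than only relative to uv-9 as in the hyperfine summary. The following is a minimal sketch of that arithmetic; the function name `step_speedups` is made up for illustration, and the mean times are copied by hand from the output above. It ignores the reported standard deviations, so steps close to 1.00x should be read as "within noise".

```shell
#!/bin/sh
# Mean wall-clock times in ms, copied from the hyperfine output above
# (uv-main, then uv-1 through uv-9). `step_speedups` is a hypothetical helper.
step_speedups() {
  printf '%s\n' \
    "uv-main 444.1" "uv-1 396.1" "uv-2 344.2" "uv-3 344.4" "uv-4 325.2" \
    "uv-5 329.4" "uv-6 328.5" "uv-7 321.1" "uv-8 213.2" "uv-9 210.9" |
  # For every line after the first, divide the previous mean by the current one.
  awk 'NR > 1 { printf "%s: %.2fx vs %s\n", $1, prev / $2, name }
       { prev = $2; name = $1 }'
}

step_speedups
```

Assuming the binaries map one-to-one onto the commits in the order listed, the largest single step is uv-8 over uv-7, and the refactoring-only steps stay close to 1.00x.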
Methodology:
git switch main && cargo build --profile profiling && cp target/profiling/uv uv-main && git switch -
# [Rebase with stop at every commit]
cargo build --profile profiling && cp target/profiling/uv uv-1 && git rebase --continue
cargo build --profile profiling && cp target/profiling/uv uv-2 && git rebase --continue
cargo build --profile profiling && cp target/profiling/uv uv-3 && git rebase --continue
cargo build --profile profiling && cp target/profiling/uv uv-4 && git rebase --continue
cargo build --profile profiling && cp target/profiling/uv uv-5 && git rebase --continue
cargo build --profile profiling && cp target/profiling/uv uv-6 && git rebase --continue
cargo build --profile profiling && cp target/profiling/uv uv-7 && git rebase --continue
cargo build --profile profiling && cp target/profiling/uv uv-8 && git rebase --continue
cargo build --profile profiling && cp target/profiling/uv uv-9 && git rebase --continue
uv venv -p 3.12 && hyperfine --warmup 2 \
"./uv-main pip compile scripts/requirements/airflow.in" \
"./uv-1 pip compile scripts/requirements/airflow.in" \
"./uv-2 pip compile scripts/requirements/airflow.in" \
"./uv-3 pip compile scripts/requirements/airflow.in" \
"./uv-4 pip compile scripts/requirements/airflow.in" \
"./uv-5 pip compile scripts/requirements/airflow.in" \
"./uv-6 pip compile scripts/requirements/airflow.in" \
"./uv-7 pip compile scripts/requirements/airflow.in" \
"./uv-8 pip compile scripts/requirements/airflow.in" \
"./uv-9 pip compile scripts/requirements/airflow.in"
Overall benchmarks:
# Scheduler: Performance
$ hyperfine --warmup 2 --runs 50 "./uv-main pip compile scripts/requirements/airflow.in" "./uv-8 pip compile scripts/requirements/airflow.in"
Benchmark 1: ./uv-main pip compile scripts/requirements/airflow.in
Time (mean ± σ): 439.7 ms ± 6.4 ms [User: 616.1 ms, System: 200.6 ms]
Range (min … max): 430.4 ms … 454.7 ms 50 runs
Benchmark 2: ./uv-8 pip compile scripts/requirements/airflow.in
Time (mean ± σ): 214.1 ms ± 5.1 ms [User: 297.9 ms, System: 123.9 ms]
Range (min … max): 206.8 ms … 228.0 ms 50 runs
Summary
./uv-8 pip compile scripts/requirements/airflow.in ran
2.05 ± 0.06 times faster than ./uv-main pip compile scripts/requirements/airflow.in
# Scheduler: Performance
$ taskset -c 0-1 hyperfine --warmup 2 "./uv-main pip compile scripts/requirements/airflow.in" "./uv-8 pip compile scripts/requirements/airflow.in"
Benchmark 1: ./uv-main pip compile scripts/requirements/airflow.in
Time (mean ± σ): 595.8 ms ± 4.5 ms [User: 774.2 ms, System: 201.2 ms]
Range (min … max): 590.8 ms … 603.8 ms 10 runs
Benchmark 2: ./uv-8 pip compile scripts/requirements/airflow.in
Time (mean ± σ): 301.9 ms ± 2.5 ms [User: 389.8 ms, System: 121.6 ms]
Range (min … max): 299.4 ms … 306.9 ms 10 runs
Summary
./uv-8 pip compile scripts/requirements/airflow.in ran
1.97 ± 0.02 times faster than ./uv-main pip compile scripts/requirements/airflow.in
# Scheduler: Powersave
$ hyperfine --warmup 1 "./uv-main pip compile scripts/requirements/airflow.in" "./uv-8 pip compile scripts/requirements/airflow.in"
Benchmark 1: ./uv-main pip compile scripts/requirements/airflow.in
Time (mean ± σ): 1.222 s ± 0.222 s [User: 1.718 s, System: 0.575 s]
Range (min … max): 0.652 s … 1.424 s 10 runs
Benchmark 2: ./uv-8 pip compile scripts/requirements/airflow.in
Time (mean ± σ): 729.3 ms ± 129.2 ms [User: 1030.7 ms, System: 370.5 ms]
Range (min … max): 549.4 ms … 961.6 ms 10 runs
Summary
./uv-8 pip compile scripts/requirements/airflow.in ran
1.68 ± 0.42 times faster than ./uv-main pip compile scripts/requirements/airflow.in
# Scheduler: Default
$ hyperfine --warmup 1 "./uv-main pip compile scripts/requirements/airflow.in --refresh" "./uv-8 pip compile scripts/requirements/airflow.in --refresh"
Benchmark 1: ./uv-main pip compile scripts/requirements/airflow.in --refresh
Time (mean ± σ): 819.6 ms ± 112.8 ms [User: 737.7 ms, System: 387.5 ms]
Range (min … max): 723.3 ms … 1116.1 ms 10 runs
Benchmark 2: ./uv-8 pip compile scripts/requirements/airflow.in --refresh
Time (mean ± σ): 597.7 ms ± 106.0 ms [User: 440.2 ms, System: 305.8 ms]
Range (min … max): 494.6 ms … 835.9 ms 10 runs
Summary
./uv-8 pip compile scripts/requirements/airflow.in --refresh ran
1.37 ± 0.31 times faster than ./uv-main pip compile scripts/requirements/airflow.in --refresh
Profile of the resolver thread for the apache-airflow warm-cache pip compile, before:
cargo build --profile profiling
samply record --rate 20000 target/profiling/uv pip compile scripts/requirements/airflow.in > /dev/null
Profile after:

