perf: use mimalloc by default#7710
Conversation
|
!bench |
|
Here are the benchmark results for commit 2353e65. |
|
!bench |
|
Here are the benchmark results for commit 5588014. |
|
!bench |
|
Here are the benchmark results for commit a8c167a. Benchmark Metric Change
====================================================================================
+ Init.Data.List.Sublist async maxrss -30.8% (-68.2 σ)
- Init.Prelude async branches 2.3% (76.8 σ)
- Init.Prelude async instructions 1.7% (51.5 σ)
+ Init.Prelude async maxrss -23.8% (-51.2 σ)
+ Std.Data.DHashMap.Internal.RawLemmas branch-misses -4.9% (-35.3 σ)
- Std.Data.DHashMap.Internal.RawLemmas branches 3.0% (436.6 σ)
- Std.Data.DHashMap.Internal.RawLemmas instructions 2.5% (279.6 σ)
+ Std.Data.DHashMap.Internal.RawLemmas maxrss -18.2% (-123.7 σ)
- Std.Data.DHashMap.Internal.RawLemmas task-clock 6.1% (17.3 σ)
- Std.Data.Internal.List.Associative branches 3.4% (410.7 σ)
- Std.Data.Internal.List.Associative instructions 2.9% (437.0 σ)
+ Std.Data.Internal.List.Associative maxrss -12.3% (-13.6 σ)
- Std.Data.Internal.List.Associative wall-clock 9.6% (11.6 σ)
+ big_do branches -6.4% (-902.2 σ)
+ big_do instructions -7.7% (-982.2 σ)
- big_omega.lean branches 2.7% (72.5 σ)
- big_omega.lean instructions 2.6% (92.5 σ)
- big_omega.lean maxrss 21.8% (343.2 σ)
- big_omega.lean task-clock 20.3% (19.0 σ)
- big_omega.lean wall-clock 19.1% (17.4 σ)
- big_omega.lean MT branches 2.6% (123.7 σ)
- big_omega.lean MT instructions 2.5% (165.0 σ)
- big_omega.lean MT maxrss 21.9% (748.5 σ)
- big_omega.lean MT task-clock 18.9% (10.6 σ)
- binarytrees instructions 25.2% (5751.1 σ)
- binarytrees task-clock 14.7% (14.1 σ)
- binarytrees.st instructions 25.5% (10153.2 σ)
- binarytrees.st maxrss 21.8% (169.7 σ)
- binarytrees.st task-clock 25.5% (12.4 σ)
- binarytrees.st wall-clock 25.6% (12.4 σ)
- bv_decide_inequality.lean branches 4.2% (90.8 σ)
- bv_decide_inequality.lean instructions 3.4% (62.9 σ)
+ bv_decide_inequality.lean maxrss -15.3% (-12.3 σ)
- bv_decide_mod branches 5.3% (43.2 σ)
- bv_decide_mod instructions 4.7% (36.6 σ)
- bv_decide_mod task-clock 6.5% (12.1 σ)
- bv_decide_mod wall-clock 6.7% (12.4 σ)
- bv_decide_mul branches 6.7% (121.0 σ)
- bv_decide_mul instructions 6.1% (132.8 σ)
- bv_decide_mul task-clock 11.8% (10.7 σ)
- bv_decide_mul wall-clock 12.4% (12.9 σ)
- bv_decide_realworld branches 4.3% (61.8 σ)
- bv_decide_realworld instructions 3.5% (48.2 σ)
- bv_decide_realworld task-clock 7.4% (16.6 σ)
- bv_decide_realworld wall-clock 7.3% (11.1 σ)
- const_fold instructions 12.3% (95.3 σ)
- const_fold maxrss 22.2% (784.9 σ)
- deriv instructions 13.9% (334.8 σ)
- deriv maxrss 34.7% (524.8 σ)
- deriv task-clock 17.5% (11.3 σ)
- deriv wall-clock 17.6% (11.4 σ)
- identifier auto-completion branches 9.9% (80.2 σ)
- identifier auto-completion completion 18.5% (11.8 σ)
- identifier auto-completion instructions 9.6% (69.6 σ)
- identifier auto-completion task-clock 22.3% (15.9 σ)
- identifier auto-completion wall-clock 17.2% (19.0 σ)
- ilean roundtrip branches 9.3% (51.1 σ)
- ilean roundtrip compress 35.3% (12.1 σ)
- ilean roundtrip instructions 10.3% (49.0 σ)
- ilean roundtrip maxrss 1.2% (20.9 σ)
- ilean roundtrip task-clock 21.8% (17.5 σ)
- ilean roundtrip wall-clock 21.8% (17.5 σ)
- import Lean branches 4.9% (47.9 σ)
- import Lean instructions 4.5% (50.3 σ)
- import Lean task-clock 3.5% (11.7 σ)
- import Lean wall-clock 4.3% (12.3 σ)
+ lake build clean instructions -2.4% (-148.7 σ)
+ lake build clean maxrss -31.1% (-543.0 σ)
- lake build clean task-clock 10.4% (18.8 σ)
+ lake build clean wall-clock -16.5% (-51.6 σ)
+ lake build no-op maxrss -19.2% (-703.9 σ)
- lake config elab instructions 1.4% (52.4 σ)
- lake config import instructions 5.7% (88.8 σ)
- lake config import maxrss 2.2% (38.6 σ)
- lake config tree instructions 5.2% (252.0 σ)
- lake config tree maxrss 2.2% (33.6 σ)
- lake env instructions 5.6% (227.4 σ)
- lake env maxrss 2.3% (23.4 σ)
+ lake startup maxrss -2.2% (-10.2 σ)
+ language server startup instructions -1.6% (-31.0 σ)
+ language server startup maxrss -5.0% (-130.0 σ)
+ liasolver instructions -13.7% (-4562.6 σ)
- libleanshared.so binary size 3.7%
- nat_repr branches 1.8% (73.5 σ)
+ omega_stress.lean async branch-misses -11.4% (-14.2 σ)
+ omega_stress.lean async branches -5.5% (-175.7 σ)
+ omega_stress.lean async instructions -6.3% (-189.9 σ)
+ omega_stress.lean async maxrss -13.0% (-27.8 σ)
- parser instructions 6.6% (6386.4 σ)
- qsort instructions 3.6% (1371.4 σ)
- rbmap instructions 1.5% (453.8 σ)
- rbmap maxrss 8.8%
- rbmap_1 instructions 16.2% (860.1 σ)
- rbmap_1 maxrss 19.2% (1545.6 σ)
- rbmap_1 task-clock 28.5% (29.3 σ)
- rbmap_1 wall-clock 28.5% (29.4 σ)
- rbmap_10 instructions 6.0% (189.5 σ)
- rbmap_10 maxrss 16.3% (362.0 σ)
- rbmap_fbip instructions 26.9% (3093.4 σ)
+ rbmap_fbip maxrss -3.7% (-10.3 σ)
- rbmap_fbip task-clock 67.3% (59.2 σ)
- rbmap_fbip wall-clock 67.8% (60.2 σ)
- rbmap_library instructions 4.7% (2120.1 σ)
- rbmap_library maxrss 17.1% (185.9 σ)
+ reduceMatch instructions -5.3% (-31.8 σ)
+ simp_arith1 branches -2.4% (-70.6 σ)
+ simp_arith1 instructions -3.3% (-104.5 σ)
+ stdlib instructions -3.1% (-23761.6 σ)
+ stdlib maxrss -6.4% (-16.5 σ)
- stdlib process pre-definitions 7.2% (15.2 σ)
+ stdlib share common exprs -3.1% (-15.5 σ)
- stdlib tactic execution 7.6% (14.5 σ)
- stdlib task-clock 2.2% (30.8 σ)
+ tests/bench/ interpreted maxrss -3.0% (-28.7 σ)
- tests/compiler sum binary sizes 4.3%
- unionfind instructions 14.7% (1857.9 σ)
- unionfind maxrss 29.1% (436.4 σ)
- workspaceSymbols instructions 4.9% (22635.1 σ) |
|
!bench |
|
Here are the benchmark results for commit f50035c. Benchmark Metric Change
===================================================================================
- Init.Data.List.Sublist async branches 1.1% (72.7 σ)
+ Init.Data.List.Sublist async maxrss -30.8% (-96.1 σ)
- Init.Prelude async branches 3.2% (142.7 σ)
- Init.Prelude async instructions 2.1% (84.7 σ)
+ Init.Prelude async maxrss -23.6% (-101.5 σ)
+ Std.Data.DHashMap.Internal.RawLemmas branch-misses -6.3% (-32.4 σ)
- Std.Data.DHashMap.Internal.RawLemmas branches 4.1% (810.5 σ)
- Std.Data.DHashMap.Internal.RawLemmas instructions 3.0% (614.6 σ)
+ Std.Data.DHashMap.Internal.RawLemmas maxrss -17.9% (-32.3 σ)
- Std.Data.DHashMap.Internal.RawLemmas task-clock 6.9% (18.3 σ)
- Std.Data.DHashMap.Internal.RawLemmas wall-clock 6.1% (46.7 σ)
+ Std.Data.Internal.List.Associative branch-misses -4.6% (-21.7 σ)
- Std.Data.Internal.List.Associative branches 4.5% (2349.4 σ)
- Std.Data.Internal.List.Associative instructions 3.4% (867.1 σ)
+ Std.Data.Internal.List.Associative maxrss -12.9% (-27.9 σ)
+ big_do branches -5.9% (-2106.3 σ)
+ big_do instructions -7.5% (-3012.9 σ)
- big_omega.lean branches 3.3% (156.7 σ)
- big_omega.lean instructions 2.9% (173.0 σ)
- big_omega.lean maxrss 21.9% (398.0 σ)
- big_omega.lean MT branches 3.3% (681.7 σ)
- big_omega.lean MT instructions 2.8% (603.2 σ)
- big_omega.lean MT maxrss 21.9% (531.6 σ)
- big_omega.lean MT task-clock 18.5% (15.5 σ)
- big_omega.lean MT wall-clock 17.2% (14.3 σ)
- binarytrees instructions 26.3% (11342.3 σ)
- binarytrees task-clock 16.3% (12.9 σ)
- binarytrees.st instructions 26.7% (28444.0 σ)
- binarytrees.st maxrss 21.6% (97.3 σ)
- binarytrees.st task-clock 34.0% (191.1 σ)
- binarytrees.st wall-clock 34.0% (190.8 σ)
- bv_decide_inequality.lean branches 4.8% (121.0 σ)
- bv_decide_inequality.lean instructions 3.7% (90.0 σ)
+ bv_decide_inequality.lean maxrss -16.5% (-25.0 σ)
- bv_decide_mod branches 5.9% (28.2 σ)
- bv_decide_mod instructions 4.9% (22.0 σ)
- bv_decide_mul branches 7.4% (104.9 σ)
- bv_decide_mul instructions 6.5% (93.2 σ)
- bv_decide_mul task-clock 13.5% (19.1 σ)
- bv_decide_mul wall-clock 13.7% (22.0 σ)
- bv_decide_realworld branches 5.2% (47.0 σ)
- bv_decide_realworld instructions 3.9% (29.4 σ)
+ bv_decide_realworld maxrss -11.1% (-149.9 σ)
- bv_decide_realworld task-clock 6.1% (10.8 σ)
- bv_decide_realworld wall-clock 6.9% (18.4 σ)
- const_fold instructions 12.7% (75.7 σ)
- const_fold maxrss 22.2% (907.0 σ)
- deriv instructions 14.6% (205.2 σ)
- deriv maxrss 34.7% (524.8 σ)
- deriv task-clock 18.0% (41.9 σ)
- deriv wall-clock 18.2% (42.1 σ)
- identifier auto-completion branches 10.6% (396.3 σ)
- identifier auto-completion completion 19.9% (13.7 σ)
- identifier auto-completion instructions 9.9% (402.0 σ)
- identifier auto-completion task-clock 23.4% (28.3 σ)
- identifier auto-completion wall-clock 18.6% (24.9 σ)
- ilean roundtrip branches 10.6% (65.1 σ)
- ilean roundtrip compress 38.1% (25.6 σ)
- ilean roundtrip instructions 11.1% (180.5 σ)
- ilean roundtrip maxrss 1.2% (62.4 σ)
- ilean roundtrip parse 19.6% (42.6 σ)
- ilean roundtrip task-clock 25.2% (33.7 σ)
- ilean roundtrip wall-clock 25.3% (33.8 σ)
- import Lean branches 5.3% (48.1 σ)
- import Lean instructions 4.7% (50.2 σ)
+ lake build clean instructions -2.2% (-137.8 σ)
+ lake build clean maxrss -31.0% (-1712.1 σ)
- lake build clean task-clock 10.9% (47.7 σ)
+ lake build clean wall-clock -16.2% (-56.3 σ)
+ lake build no-op maxrss -19.2% (-271.9 σ)
- lake config elab instructions 1.7% (68.6 σ)
- lake config elab task-clock 6.2% (11.0 σ)
- lake config elab wall-clock 7.2% (12.7 σ)
- lake config import instructions 5.8% (102.4 σ)
- lake config import maxrss 2.2% (39.6 σ)
- lake config tree instructions 5.3% (208.6 σ)
- lake config tree maxrss 2.0% (15.6 σ)
- lake env instructions 5.8% (377.8 σ)
- lake env maxrss 2.2%
+ lake startup maxrss -2.6% (-218.7 σ)
+ language server startup instructions -1.4% (-63.2 σ)
+ language server startup maxrss -4.9% (-126.5 σ)
+ liasolver instructions -13.4% (-1029.9 σ)
- libleanshared.so binary size 3.7%
- nat_repr branches 1.7% (424.5 σ)
+ omega_stress.lean async branch-misses -11.8% (-115.0 σ)
+ omega_stress.lean async branches -4.7% (-133.5 σ)
+ omega_stress.lean async instructions -5.9% (-146.8 σ)
+ omega_stress.lean async maxrss -11.9% (-25.1 σ)
- omega_stress.lean async wall-clock 6.4% (14.4 σ)
- parser instructions 7.0% (1619.9 σ)
- qsort instructions 3.9% (3025.7 σ)
- rbmap instructions 1.6% (481.0 σ)
- rbmap maxrss 8.7% (56.0 σ)
- rbmap_1 instructions 16.8% (1280.1 σ)
- rbmap_1 maxrss 19.2% (1545.3 σ)
- rbmap_1 task-clock 26.7% (38.8 σ)
- rbmap_1 wall-clock 26.8% (38.7 σ)
- rbmap_10 instructions 6.2% (156.8 σ)
- rbmap_10 maxrss 16.3% (362.0 σ)
- rbmap_10 task-clock 8.4% (10.2 σ)
- rbmap_10 wall-clock 8.6% (10.4 σ)
- rbmap_fbip instructions 28.8% (7102.5 σ)
+ rbmap_fbip maxrss -3.7%
- rbmap_fbip task-clock 71.2% (20.2 σ)
- rbmap_fbip wall-clock 71.7% (20.4 σ)
- rbmap_library instructions 4.8% (1078.6 σ)
- rbmap_library maxrss 17.1% (70.3 σ)
+ reduceMatch instructions -4.9% (-38.0 σ)
+ simp_arith1 branches -1.9% (-25.8 σ)
+ simp_arith1 instructions -3.1% (-42.2 σ)
+ stdlib instantiate metavars -5.3% (-16.2 σ)
+ stdlib instructions -2.7% (-608.6 σ)
- stdlib process pre-definitions 8.3% (414.6 σ)
+ stdlib share common exprs -3.0% (-12.2 σ)
- stdlib wall-clock 4.6% (40.6 σ)
+ tests/bench/ interpreted maxrss -3.0% (-29.0 σ)
- tests/compiler sum binary sizes 4.3%
- unionfind instructions 15.9% (11615.9 σ)
- unionfind maxrss 29.0% (223.9 σ)
- unionfind task-clock 31.5% (10.7 σ)
- unionfind wall-clock 31.7% (10.8 σ)
- workspaceSymbols instructions 5.4% (1513.6 σ)
- workspaceSymbols task-clock 18.5% (17.3 σ)
- workspaceSymbols wall-clock 18.7% (17.5 σ) |
|
!bench |
|
Here are the benchmark results for commit 84efbd3. |
|
!bench |
|
Here are the benchmark results for commit 81b189d. |
|
!bench |
|
Here are the benchmark results for commit 301d638. |
0ec1c29 to
ffd987d
Compare
|
!bench |
|
Here are the benchmark results for commit ffd987d. |
|
!bench |
|
Here are the benchmark results for commit 47d8c05. |
|
!bench |
|
Here are the benchmark results for commit c71e026. |
* chore: bump to nightly-2025-03-22 * Merge master into nightly-testing * chore: adaptations for nightly-2025-03-22 * fix * Merge master into nightly-testing * protected * chore: bump to nightly-2025-03-24 * update batteries and aesop * fixes for count_cons_of_ne * fix fix to Sym.inhabitedSym' (need `default`, not the `a` that happens to be in context) * bump heartbeats in MathlibTest/observe * fix, deprecated * fix merge * Update lean-toolchain for testing leanprover/lean4#7672 * Trigger CI for leanprover/lean4#7672 * update batteries * fixes for leanprover/lean4#7672 * Trigger CI for leanprover/lean4#7672 * Trigger CI for leanprover/lean4#7672 * Trigger CI for leanprover/lean4#7672 * Trigger CI for leanprover/lean4#7672 * Trigger CI for leanprover/lean4#7672 * update batteries * fixes * chore: bump to nightly-2025-03-25 * Trigger CI for leanprover/lean4#7672 * Trigger CI for leanprover/lean4#7672 * update batteries * Trigger CI for leanprover/lean4#7672 * one fix * fixes * maxheartbeats * fixes for leanprover/lean4#7672 * fixes for leanprover/lean4#7672 * fixes for leanprover/lean4#7672 * fixes for leanprover/lean4#7672 * cleanups * . * chore: bump to nightly-2025-03-26 * update aesop * Update lean-toolchain for testing leanprover/lean4#7690 * Trigger CI for leanprover/lean4#7690 * Trigger CI for leanprover/lean4#7690 * maxHeartbeats * max heartbeats * invalidate cache * another heartbeats * bump batteries * deprecation * Update lean-toolchain for testing leanprover/lean4#7692 * Delete * chore: bump to nightly-2025-03-27 * update batteries * bump batteries * many more maxHeartbeats * chore: bump leantar v0.1.15 * invalidate cache * cache flush, take 2 * feat(Cache): root hash generation counter * 1-line fix * chore: bump to nightly-2025-03-28 * update deps * remove upstreamed * remove all adaptation notes, hooray * merge lean-pr-testing-7692 * chore: bump to nightly-2025-03-29 * Update lean-toolchain for testing leanprover/lean4#7717 * fix * Trigger CI for leanprover/lean4#7717 * Trigger CI for leanprover/lean4#7717 * fixes * fixes * Trigger CI for leanprover/lean4#7717 * fixes * fixes * fixes * fixes * got through all of mathlib * missing space * fix tests * Trigger CI for leanprover/lean4#7717 * Trigger CI for leanprover/lean4#7717 * Merge master into nightly-testing * chore: bump to nightly-2025-03-30 * Update lean-toolchain for testing leanprover/lean4#7710 * Trigger CI for leanprover/lean4#7710 * Trigger CI for leanprover/lean4#7710 * Merge master into nightly-testing * Trigger CI for leanprover/lean4#7717 * chore: remove fragile tests * chore: remove fragile tests * Trigger CI for leanprover/lean4#7710 * Update lean-toolchain for testing leanprover/lean4#7736 * how did this get dropped? * Trigger CI for leanprover/lean4#7710 * Trigger CI for leanprover/lean4#7736 * deprecations in Cache * fix * chore: bump to nightly-2025-03-31 * update batteries * merge fixes * fix * fixes * Update lean-toolchain for testing leanprover/lean4#7756 * fix * Update lean-toolchain for testing leanprover/lean4#7764 * eliminate some 7717 adaptation notes * Start fixing, upstream changes needed * Trigger CI for leanprover/lean4#7756 * Fix * Fix * Fix * Fix * Trigger CI for leanprover/lean4#7756 * Trigger CI for leanprover/lean4#7756 * chore: bump to nightly-2025-04-01 * lake update * Trigger CI for leanprover/lean4#7756 * Fix * Fix * Update batteries * Fix * Fix * Trigger CI for leanprover/lean4#7756 * lake update * Fix * Fix * fix for auxlemma detection * remove mention of aux proof in Mathlib.Control.Fix * fix whatsnew * better fix * argh * unfold aux lemmas before transforming aux decls * fixes * optimization: abstract nested proofs in toadditive * docstring * revert change * fix merge * deprecations in shake * fixes * fixes * chore: bump to nightly-2025-04-02 * lake update * fixes * fix test * fix * fix merge --------- Co-authored-by: leanprover-community-mathlib4-bot <leanprover-community-mathlib4-bot@users.noreply.github.com> Co-authored-by: Kyle Miller <kmill31415@gmail.com> Co-authored-by: Johan Commelin <johan@commelin.net> Co-authored-by: Markus Himmel <markus@lean-fro.org> Co-authored-by: Mario Carneiro <di.gama@gmail.com> Co-authored-by: Sebastian Ullrich <sebasti@nullri.ch>
This PR improves memory use of Lean, especially for longer-running server processes, by up to 60%