Issue44340
Created on 2021-06-07 22:38 by holmanb, last changed 2021-09-08 17:29 by lukasz.langa. This issue is now closed.
| Pull Requests | | |
|---|---|---|
| URL | Status | Linked |
| PR 26585 | closed | holmanb, 2021-06-07 22:47 |
| PR 27231 | merged | corona10, 2021-07-18 16:11 |
| PR 28229 | merged | corona10, 2021-09-08 05:14 |
| Messages (9) | |||
|---|---|---|---|
| msg395293 | Author: Brett Holman (holmanb) * | Date: 2021-06-07 22:38 | |
The existing --with-lto argument could be extended to accept a value that selects non-default LTO compiler options:

CC=clang ./configure --with-lto=thin

This would leave the default behavior unchanged while letting those who want ThinLTO opt in. For what it's worth, the tests (make test) pass using clang 11.1.0 and ThinLTO. |
|||
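One way to confirm which LTO mode a given interpreter was configured with is to inspect the configure arguments recorded at build time. The sketch below assumes a POSIX build, where `sysconfig.get_config_var("CONFIG_ARGS")` returns the configure command line as a string (it may be `None` elsewhere). `lto_mode` is a hypothetical helper name, and reporting a bare `--with-lto` as `"default"` is a choice made for this sketch, not documented behavior.

```python
import shlex
import sysconfig


def lto_mode(config_args):
    """Return the value given to --with-lto, "default" for a bare flag, else None."""
    if not config_args:
        return None
    mode = None
    # CONFIG_ARGS is shell-quoted, e.g. "'--with-lto=thin' 'CC=clang'".
    for arg in shlex.split(config_args):
        if arg == "--with-lto":
            mode = "default"
        elif arg.startswith("--with-lto="):
            mode = arg.split("=", 1)[1]
    return mode


if __name__ == "__main__":
    # Inspect the interpreter that is running this script.
    print(lto_mode(sysconfig.get_config_var("CONFIG_ARGS")))
```

The helper only reads what was passed to configure; it does not verify which flags the compiler and linker actually received.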
| msg397755 | Author: Dong-hee Na (corona10) * | Date: 2021-07-18 16:40 | |
I am now setting up an experiment environment to compare ThinLTO and full LTO. |
|||
| msg397765 | Author: Dong-hee Na (corona10) * | Date: 2021-07-18 23:18 | |
FYI, ThinLTO shortens the wall-clock build time, although it uses more total CPU time.

Full LTO (./configure --with-lto=full CC=clang)
real 2m33.740s
user 8m25.695s
sys 0m13.124s

Thin LTO (./configure --with-lto=thin CC=clang)
real 1m51.867s
user 12m53.694s
sys 0m12.786s |
|||
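To put the two build timings on a common scale, the `time` figures can be converted to seconds and compared as ratios. This is a minimal sketch: `to_seconds` is a hypothetical helper, and the accepted format (an optional `Nm` prefix followed by seconds and an `s` suffix) is assumed from the values quoted in this issue only.

```python
import re

# Matches values such as "2m33.740s" or "13.124s".
_TIME_RE = re.compile(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s")


def to_seconds(text):
    """Convert a shell `time` value such as '2m33.740s' to seconds."""
    match = _TIME_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized time value: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


# Figures copied from the build timings reported in this issue.
full = {"real": "2m33.740s", "user": "8m25.695s", "sys": "0m13.124s"}
thin = {"real": "1m51.867s", "user": "12m53.694s", "sys": "0m12.786s"}

# Ratio > 1 means the thin build finished sooner in wall-clock terms.
real_speedup = to_seconds(full["real"]) / to_seconds(thin["real"])
# Ratio > 1 means the thin build consumed more user CPU time.
user_ratio = to_seconds(thin["user"]) / to_seconds(full["user"])

print(f"wall-clock: thin build is {real_speedup:.2f}x faster")
print(f"user CPU:   thin build uses {user_ratio:.2f}x as much")
```

These are single runs on one machine, so the ratios are indicative rather than a rigorous measurement.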
| msg397766 | Author: Dong-hee Na (corona10) * | Date: 2021-07-18 23:47 | |
The test was executed in the following environment. There is no significant performance change.

MS Azure: D8s v3
CentOS Linux release 8.2.2004 (Core)

[corona10@PythonLinux cpython]$ ./python -m pyperformance compare full.json thin.json

full.json
Performance version: 1.0.2
Report on Linux-4.18.0-193.28.1.el8_2.x86_64-x86_64-with-glibc2.28
Number of logical CPUs: 8
Start date: 2021-07-18 23:18:04.644067
End date: 2021-07-18 23:44:20.951457

thin.json
Performance version: 1.0.2
Report on Linux-4.18.0-193.28.1.el8_2.x86_64-x86_64-with-glibc2.28
Number of logical CPUs: 8
Start date: 2021-07-18 22:46:00.717563
End date: 2021-07-18 23:12:19.376766

| Benchmark | full.json (mean +- std dev) | thin.json (mean +- std dev) | Change | Significance |
|---|---|---|---|---|
| 2to3 | 570 ms +- 17 ms | 568 ms +- 20 ms | 1.00x faster | Not significant |
| chameleon | 16.9 ms +- 0.6 ms | 16.9 ms +- 0.9 ms | 1.00x slower | Not significant |
| chaos | 182 ms +- 7 ms | 179 ms +- 7 ms | 1.02x faster | Not significant |
| crypto_pyaes | 198 ms +- 6 ms | 192 ms +- 6 ms | 1.03x faster | Significant (t=5.26) |
| deltablue | 13.4 ms +- 0.5 ms | 13.5 ms +- 0.5 ms | 1.01x slower | Not significant |
| django_template | 94.0 ms +- 3.2 ms | 91.8 ms +- 3.7 ms | 1.02x faster | Significant (t=3.53) |
| dulwich_log | 178 ms +- 6 ms | 176 ms +- 8 ms | 1.02x faster | Not significant |
| fannkuch | 764 ms +- 17 ms | 755 ms +- 15 ms | 1.01x faster | Not significant |
| float | 194 ms +- 8 ms | 187 ms +- 6 ms | 1.03x faster | Significant (t=4.95) |
| go | 388 ms +- 14 ms | 387 ms +- 14 ms | 1.00x faster | Not significant |
| hexiom | 17.0 ms +- 0.7 ms | 17.5 ms +- 0.8 ms | 1.03x slower | Significant (t=-3.40) |
| json_dumps | 22.5 ms +- 0.9 ms | 22.3 ms +- 0.7 ms | 1.01x faster | Not significant |
| json_loads | 45.8 us +- 2.3 us | 46.5 us +- 1.8 us | 1.02x slower | Not significant |
| logging_format | 19.1 us +- 0.9 us | 18.7 us +- 0.7 us | 1.02x faster | Not significant |
| logging_silent | 336 ns +- 17 ns | 334 ns +- 18 ns | 1.00x faster | Not significant |
| logging_simple | 17.1 us +- 0.8 us | 16.7 us +- 0.8 us | 1.03x faster | Significant (t=3.12) |
| mako | 27.6 ms +- 1.6 ms | 26.6 ms +- 0.9 ms | 1.04x faster | Significant (t=4.11) |
| meteor_contest | 172 ms +- 5 ms | 169 ms +- 5 ms | 1.01x faster | Not significant |
| nbody | 232 ms +- 8 ms | 224 ms +- 8 ms | 1.04x faster | Significant (t=6.03) |
| nqueens | 167 ms +- 7 ms | 166 ms +- 7 ms | 1.00x faster | Not significant |
| pathlib | 38.2 ms +- 1.7 ms | 37.4 ms +- 1.9 ms | 1.02x faster | Significant (t=2.41) |
| pickle | 19.4 us +- 0.8 us | 19.5 us +- 0.8 us | 1.00x slower | Not significant |
| pickle_dict | 43.5 us +- 1.9 us | 43.0 us +- 1.9 us | 1.01x faster | Not significant |
| pickle_list | 6.81 us +- 0.26 us | 6.81 us +- 0.27 us | 1.00x slower | Not significant |
| pickle_pure_python | 840 us +- 28 us | 825 us +- 28 us | 1.02x faster | Not significant |
| pidigits | 294 ms +- 9 ms | 294 ms +- 9 ms | 1.00x slower | Not significant |
| pyflate | 1.17 sec +- 0.02 sec | 1.16 sec +- 0.03 sec | 1.01x faster | Not significant |
| python_startup | 15.3 ms +- 0.6 ms | 15.3 ms +- 0.6 ms | 1.00x faster | Not significant |
| python_startup_no_site | 10.3 ms +- 0.3 ms | 10.2 ms +- 0.4 ms | 1.01x faster | Not significant |
| raytrace | 911 ms +- 19 ms | 911 ms +- 21 ms | 1.00x faster | Not significant |
| regex_compile | 314 ms +- 12 ms | 310 ms +- 10 ms | 1.01x faster | Not significant |
| regex_dna | 317 ms +- 10 ms | 299 ms +- 9 ms | 1.06x faster | Significant (t=9.99) |
| regex_effbot | 6.20 ms +- 0.27 ms | 5.80 ms +- 0.25 ms | 1.07x faster | Significant (t=8.49) |
| regex_v8 | 43.0 ms +- 1.3 ms | 39.9 ms +- 1.9 ms | 1.08x faster | Significant (t=10.22) |
| richards | 158 ms +- 8 ms | 157 ms +- 8 ms | 1.01x faster | Not significant |
| scimark_fft | 727 ms +- 18 ms | 716 ms +- 18 ms | 1.02x faster | Not significant |
| scimark_lu | 309 ms +- 11 ms | 304 ms +- 10 ms | 1.01x faster | Not significant |
| scimark_monte_carlo | 180 ms +- 6 ms | 181 ms +- 8 ms | 1.00x slower | Not significant |
| scimark_sor | 355 ms +- 9 ms | 352 ms +- 11 ms | 1.01x faster | Not significant |
| scimark_sparse_mat_mult | 9.51 ms +- 0.32 ms | 9.19 ms +- 0.34 ms | 1.03x faster | Significant (t=5.27) |
| spectral_norm | 277 ms +- 10 ms | 272 ms +- 9 ms | 1.02x faster | Not significant |
| sqlalchemy_declarative | 273 ms +- 10 ms | 273 ms +- 10 ms | 1.00x faster | Not significant |
| sqlalchemy_imperative | 46.2 ms +- 2.3 ms | 45.6 ms +- 2.1 ms | 1.01x faster | Not significant |
| sqlite_synth | 4.39 us +- 0.29 us | 4.37 us +- 0.23 us | 1.00x faster | Not significant |
| sympy_expand | 996 ms +- 22 ms | 993 ms +- 31 ms | 1.00x faster | Not significant |
| sympy_integrate | 42.6 ms +- 1.9 ms | 43.5 ms +- 2.0 ms | 1.02x slower | Significant (t=-2.47) |
| sympy_str | 607 ms +- 18 ms | 599 ms +- 15 ms | 1.01x faster | Not significant |
| sympy_sum | 350 ms +- 12 ms | 344 ms +- 11 ms | 1.02x faster | Not significant |
| telco | 11.3 ms +- 0.6 ms | 11.2 ms +- 0.6 ms | 1.01x faster | Not significant |
| tornado_http | 294 ms +- 11 ms | 296 ms +- 12 ms | 1.01x slower | Not significant |
| unpack_sequence | 78.4 ns +- 7.1 ns | 75.7 ns +- 2.5 ns | 1.04x faster | Significant (t=2.86) |
| unpickle | 26.1 us +- 1.1 us | 27.3 us +- 1.3 us | 1.05x slower | Significant (t=-5.57) |
| unpickle_list | 6.65 us +- 0.21 us | 6.68 us +- 0.31 us | 1.00x slower | Not significant |
| unpickle_pure_python | 567 us +- 21 us | 572 us +- 21 us | 1.01x slower | Not significant |
| xml_etree_generate | 165 ms +- 8 ms | 166 ms +- 8 ms | 1.00x slower | Not significant |
| xml_etree_iterparse | 187 ms +- 6 ms | 187 ms +- 8 ms | 1.00x faster | Not significant |
| xml_etree_parse | 274 ms +- 10 ms | 274 ms +- 11 ms | 1.00x slower | Not significant |
| xml_etree_process | 142 ms +- 6 ms | 139 ms +- 6 ms | 1.02x faster | Not significant |
|||
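A quick way to summarize a comparison like the one above is to tally how many benchmarks were flagged as significant in each direction. The sketch below is a minimal, unofficial parser for the plain-text output of `pyperformance compare`; the regular expression is an assumption inferred from the output quoted in this issue rather than a documented format, and `summarize` is a hypothetical helper. Only three entries are embedded as sample input.

```python
import re
from collections import Counter

# Three entries copied from this issue, in pyperformance's text layout.
SAMPLE = """\
### 2to3 ###
Mean +- std dev: 570 ms +- 17 ms -> 568 ms +- 20 ms: 1.00x faster
Not significant

### regex_v8 ###
Mean +- std dev: 43.0 ms +- 1.3 ms -> 39.9 ms +- 1.9 ms: 1.08x faster
Significant (t=10.22)

### unpickle ###
Mean +- std dev: 26.1 us +- 1.1 us -> 27.3 us +- 1.3 us: 1.05x slower
Significant (t=-5.57)
"""

# One match per benchmark entry: name, ratio, direction, verdict.
ENTRY_RE = re.compile(
    r"### (?P<name>\S+) ###\s+"
    r"Mean \+- std dev: .*?: (?P<ratio>[\d.]+)x (?P<direction>faster|slower)\s+"
    r"(?P<verdict>Not significant|Significant)"
)


def summarize(report):
    """Count benchmark entries by (verdict, direction)."""
    counts = Counter()
    for match in ENTRY_RE.finditer(report):
        counts[(match["verdict"], match["direction"])] += 1
    return counts


if __name__ == "__main__":
    for (verdict, direction), n in sorted(summarize(SAMPLE).items()):
        print(f"{verdict:16} {direction:7} {n}")
```

Because the pattern separates fields with `\s+`, it also accepts output whose line breaks have been collapsed into spaces.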
| msg397767 | Author: Dong-hee Na (corona10) * | Date: 2021-07-18 23:53 | |
clang version 11.0.0 |
|||
| msg397785 | Author: Dong-hee Na (corona10) * | Date: 2021-07-19 10:53 | |
New changeset b2cf2513f9184c850a69fab718532b4f7c6a003d by Dong-hee Na in branch 'main': bpo-44340: Add support for building with clang full/thin lto (GH-27231) https://github.com/python/cpython/commit/b2cf2513f9184c850a69fab718532b4f7c6a003d |
|||
| msg397786 | Author: Dong-hee Na (corona10) * | Date: 2021-07-19 10:55 | |
CPython 3.11 now supports ThinLTO. Thank you for the report and contribution, Brett! Thank you also to Pablo and Gregory for the reviews! |
|||
| msg397855 | Author: Dong-hee Na (corona10) * | Date: 2021-07-20 06:51 | |
@ned.deily Can we use the thin LTO option for the next macOS Python distribution? In my local environment, it passes all tests :)
https://github.com/python/cpython/blob/366fcbac18e3adc41e3901580dbedb6a91e41a10/Mac/BuildScript/build-installer.py#L1199

FYI, Gentoo already recommends using ThinLTO instead of full LTO.
https://wiki.gentoo.org/wiki/Clang#Link-time_optimizations_with_Clang |
|||
| msg401415 | Author: Łukasz Langa (lukasz.langa) * | Date: 2021-09-08 17:29 | |
New changeset 84ca5fcd31541929f0031e974a434b95d8e78aab by Dong-hee Na in branch 'main': bpo-44340: Update whatsnews for ThinLTO (GH-28229) https://github.com/python/cpython/commit/84ca5fcd31541929f0031e974a434b95d8e78aab |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2021-09-08 17:29:40 | lukasz.langa | set | nosy: + lukasz.langa; messages: + msg401415 |
| 2021-09-08 05:14:08 | corona10 | set | pull_requests: + pull_request26649 |
| 2021-07-20 06:51:58 | corona10 | set | nosy: + ned.deily; messages: + msg397855 |
| 2021-07-19 10:55:14 | corona10 | set | status: open -> closed; resolution: fixed; stage: patch review -> resolved |
| 2021-07-19 10:55:07 | corona10 | set | messages: + msg397786 |
| 2021-07-19 10:53:01 | corona10 | set | messages: + msg397785 |
| 2021-07-18 23:53:02 | corona10 | set | messages: + msg397767 |
| 2021-07-18 23:47:12 | corona10 | set | messages: + msg397766 |
| 2021-07-18 23:18:53 | corona10 | set | messages: + msg397765 |
| 2021-07-18 17:10:50 | corona10 | set | versions: + Python 3.11 |
| 2021-07-18 16:40:36 | corona10 | set | messages: + msg397755 |
| 2021-07-18 16:11:03 | corona10 | set | pull_requests: + pull_request25779 |
| 2021-06-11 19:01:34 | FFY00 | set | nosy: + FFY00 |
| 2021-06-10 00:00:14 | corona10 | set | nosy: + corona10 |
| 2021-06-08 23:20:28 | ned.deily | set | nosy: + gregory.p.smith; components: + Build, - Interpreter Core |
| 2021-06-07 22:47:48 | holmanb | set | keywords: + patch; stage: patch review; pull_requests: + pull_request25169 |
| 2021-06-07 22:38:11 | holmanb | create | |
