Hi guys,
One of my students just pointed me at your repository, and I wanted to briefly touch base and highlight some of my own prior work on optimizing the CPython interpreter. About ten years ago, I did extensive research on purely interpretative optimizations, mostly through inline caching with quickening. Subsequent research on combining multiple techniques led to maximum speedups of 5.5x, without requiring a JIT compiler. AFAICT, my research would be ideally suited for stages 1 and 2 of your implementation plan. After some interest in this research was expressed on Twitter about three months ago, I put the paper online, together with an account of the many rejections it received in academia. If you're interested, please take a look: https://arxiv.org/abs/2109.02958
Based on my experience, it should be possible to obtain much of the proposed speedup by focusing on interpreter-based optimizations alone. A simple JIT is, IMHO, not going to provide much speedup beyond such an interpreter. (It is similar to comparing a template JIT with an optimizing interpreter: the template JIT mostly eliminates instruction-dispatch costs, which are usually not the dominant costs in Python.) A nice benefit is that this strategy maintains portability, and it could even be used to provide optimizations across extensions written in C (numpy, etc.).
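To make the idea concrete, here is a minimal toy sketch of inline caching via quickening: a generic instruction that, after observing its operand types once, rewrites itself in place into a specialized fast-path instruction. The opcode names and interpreter structure here are invented for illustration and are not CPython's actual implementation (a real interpreter would also guard the fast path and de-optimize on a type miss):

```python
def run(code, stack):
    """A toy stack-machine interpreter that quickens ADD into ADD_INT."""
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == "LOAD_CONST":
            pc += 1
            stack.append(code[pc])
        elif op == "ADD":              # generic, type-dispatching add
            b, a = stack.pop(), stack.pop()
            if type(a) is int and type(b) is int:
                code[pc] = "ADD_INT"   # quicken: rewrite the instruction in place
            stack.append(a + b)
        elif op == "ADD_INT":          # specialized fast path for ints
            b = stack.pop()
            stack[-1] += b             # no type dispatch; a real impl would guard here
        elif op == "RETURN":
            return stack.pop()
        pc += 1

program = ["LOAD_CONST", 2, "LOAD_CONST", 3, "ADD", "RETURN"]
print(run(program, []))   # first run observes int operands and quickens ADD
print(program[4])         # the bytecode now contains ADD_INT
print(run(program, []))   # subsequent runs take the specialized path
```

The key point is that all of this happens inside the interpreter loop, with no machine-code generation, which is why the approach stays fully portable.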
If I can provide any further explanations beyond the paper, please do let me know!
Other than that: Have a nice day and weekend, respectively & all the best from Munich,
--stefan