Hi guys,
One of my students just pointed me at your repository, and I wanted to briefly touch base and highlight some of my own prior work on optimizing the CPython interpreter. About ten years ago, I did extensive research on purely interpretative optimizations, mostly through inline caching with quickening. Subsequent research on combining multiple techniques led to maximum speedups of 5.5x, without requiring a JIT compiler. AFAICT, my research would be ideally suited for stages 1 and 2 of your implementation plan. After some interest in this research was expressed on Twitter about three months ago, I put the paper online, together with an account of the many rejections it received in academia. If you're interested, please take a look: https://arxiv.org/abs/2109.02958
Based on my experience, it should be possible to obtain much of the proposed speedup by focusing on interpreter-based optimizations alone. A simple JIT is, IMHO, not going to provide much speedup beyond such an interpreter. (It is similar to comparing a template JIT with an optimizing interpreter: the template JIT mostly eliminates instruction-dispatch costs, which are usually not the dominant costs in Python.) A nice benefit is that this strategy maintains portability, and it could even be used to provide optimizations across extensions written in C (numpy, etc.).
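To make the idea concrete, here is a minimal toy sketch of inline caching via quickening: a generic instruction that, after observing its operand types once, rewrites itself in place into a specialized fast-path instruction. The opcode names and interpreter structure here are invented for illustration and are not CPython's actual implementation (a real interpreter would also guard the fast path and de-optimize on a type miss):

```python
def run(code, stack):
    """A toy stack-machine interpreter that quickens ADD into ADD_INT."""
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == "LOAD_CONST":
            pc += 1
            stack.append(code[pc])
        elif op == "ADD":              # generic, type-dispatching add
            b, a = stack.pop(), stack.pop()
            if type(a) is int and type(b) is int:
                code[pc] = "ADD_INT"   # quicken: rewrite the instruction in place
            stack.append(a + b)
        elif op == "ADD_INT":          # specialized fast path for ints
            b = stack.pop()
            stack[-1] += b             # no type dispatch; a real impl would guard here
        elif op == "RETURN":
            return stack.pop()
        pc += 1

program = ["LOAD_CONST", 2, "LOAD_CONST", 3, "ADD", "RETURN"]
print(run(program, []))   # first run observes int operands and quickens ADD
print(program[4])         # the bytecode now contains ADD_INT
print(run(program, []))   # subsequent runs take the specialized path
```

The key point is that all of this happens inside the interpreter loop, with no machine-code generation, which is why the approach stays fully portable.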
If I can provide any further explanations beyond the paper, please do let me know!
Other than that: Have a nice day and weekend, respectively & all the best from Munich,
--stefan