bpo-45214: add the LOAD_NONE opcode (#28376)
Conversation
e672cbd to d31b1fb
gvanrossum left a comment:
That’s an impeccable PR. Do you have hard perf numbers for PyPerformance? Pablo may be able to help you (he has a benchmark machine set up at work).
OOI, why did this end up as the chosen design, as opposed to
I don't think it's been definitively decided yet (this is why I did the refactor PRs first, so we can more easily experiment). The thinking is that None is so common that it may be faster for it to have a dedicated opcode (and another STORE_FAST_NONE). There will probably be another LOAD_COMMON_CONST opcode for True, False and a few others (zeros, 0.5, 2.0, ...). Then there is the question of whether to fold MAKE_INT into that, as I did in another PR, or have a separate MAKE_INT opcode which takes items from the small-int array (I think you suggested splitting out MAKE_INT before your vacation, or maybe it was Guido).
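For context, on a stock CPython (without this PR) there is no dedicated opcode for None: it goes through the generic LOAD_CONST like any other constant, which is what the proposed LOAD_NONE would replace. A quick way to see that:

```python
import dis

# On stock CPython (without this PR), loading None goes through the
# generic LOAD_CONST opcode rather than a dedicated LOAD_NONE.
code = compile("x = None", "<demo>", "exec")
ops = [ins.opname for ins in dis.get_instructions(code)]
print(ops)
```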
👍 As an aside, do we know how common the use of
To count that, one could do this experiment: take this PR, add a dedicated RETURN_NONE opcode, and count how many LOAD_NONE opcodes are generated. Easy to add to Tools/scripts/count_opcodes.py.
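A rough stand-in for that experiment, runnable on stock CPython: walk a module's code objects and count how many instructions load None, and how many of those are in return position (i.e. would be absorbed by a RETURN_NONE). The helper names here (`iter_code`, `count_none_loads`) are mine, not from count_opcodes.py:

```python
import dis
import types

def iter_code(code):
    """Yield a code object and, recursively, all code objects it contains."""
    yield code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from iter_code(const)

def count_none_loads(code):
    """Count None loads, and how many are in return position."""
    total = in_return = 0
    for co in iter_code(code):
        instrs = list(dis.get_instructions(co))
        for i, ins in enumerate(instrs):
            # Filter by opname: no-arg opcodes also report argval=None.
            if ins.opname in ("LOAD_CONST", "RETURN_CONST") and ins.argval is None:
                total += 1
                if ins.opname == "RETURN_CONST" or (
                    i + 1 < len(instrs) and instrs[i + 1].opname == "RETURN_VALUE"
                ):
                    in_return += 1
    return total, in_return

src = "def f():\n    x = None\n"
total, in_return = count_none_loads(compile(src, "<demo>", "exec"))
print(f"{total} None loads, {in_return} in return position")
```

Both the implicit return of `f` and the module-level return are counted, which is the point: a large share of None loads sit directly in front of a RETURN_VALUE.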
I disabled a bunch of background stuff and meditated while the benchmarks ran; hopefully these results are real:
We can probably get 1% from quite a few different ideas. Ten different
things each giving us 1% is 10%, so I am excited about this improvement!
I looked at the dis output for the worst benchmark, pidigits (that result seems repeatable). I don't see any difference except what we expect (I thought maybe some optimisation was missed). Maybe the PREDICTs I put in ceval are wrong for generators; not sure. Will continue looking.
It's probably not repeatable; there's some noise, and how I launch the benchmark seems to make a difference (three separate shell commands are noisy; one shell command that begins with "repeat 3" is more stable). I just repeated the whole thing; the overall picture is
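One way to get a feel for run-to-run noise without a quiet benchmark machine is to look at the spread of repeated timings relative to their mean. This is purely illustrative (it is not the pyperformance methodology):

```python
import statistics
import timeit

# Illustrative noise check: time the same statement several times and
# compare the spread to the mean. A spread of a few percent of the mean
# can easily swamp a ~1% opcode-level improvement.
timings = timeit.repeat("x = None", repeat=5, number=1_000_000)
mean = statistics.mean(timings)
noise_pct = 100 * statistics.stdev(timings) / mean
print(f"mean {mean:.4f}s, spread ~{noise_pct:.1f}% of mean")
```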
Don't worry too much about pidigits. It mostly exercises very long integer arithmetic, so the difference is most likely due to some other random CPU cache effect.
@pablogsal Is your benchmark machine less noisy than what I posted above?
I will try to benchmark this tomorrow on my benchmark machine. If it turns out to be less noisy, I can try to give you access so you can benchmark there, if you find that useful.
I tried to benchmark, but I keep getting errors related to the freeze module. I tried to apply the diff cleanly on main and regenerate it, but I keep getting them :(
@ericsnowcurrently do you know why this may be happening?
Oh, actually, since the 15th of September all benchmarks on the benchmark server have been failing with the same error :(
I'm pretty sure Victor fixed this earlier.
Here are the results of the benchmark:
@pablogsal Are your results consistent, as in, if you compare two runs of the same Python version do you see no diff? I don't see a reason why LOAD_NONE would make unpickle 3% slower. Anyway, the picture emerging is that while timeit clearly shows "x = None" being about 8% faster with LOAD_NONE, overall this doesn't seem to make a measurable difference on the benchmarks. I think something like a RETURN_NONE opcode, which replaces a pair of opcodes, would be more promising. I will try that (but I'm not sure how much time I will have this week - got some distractions).
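The pair in question is the implicit "return None" at the end of every function without an explicit return: on most 3.x versions it compiles to LOAD_CONST None followed by RETURN_VALUE (3.12 later fused exactly this pattern into RETURN_CONST), so a RETURN_NONE would collapse two instructions into one. A quick look at what it currently compiles to:

```python
import dis

def f():
    pass

# The implicit "return None": a LOAD_CONST None / RETURN_VALUE pair on
# most versions; 3.12 fused the pair into a single RETURN_CONST.
names = [ins.opname for ins in dis.get_instructions(f)]
print(names)
```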
I ran it again and it is quite consistent. Here is the comparison of
I wouldn't give up on this completely yet, but it would be nice if we could measure e.g. the size decrease in pyc files. @pablogsal There is some weird shit in the table, e.g. Is that just something that was copy-pasted wrong? I could imagine that what you actually ran was and some tool expanded the bpo-45214 reference in the middle of the filename to Markdown, and that got picked up when you copy-pasted it. I also noticed that you specified
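For the pyc-size question, one crude proxy (my own sketch, not something the thread ran) is the marshalled size of the compiled code object, since that accounts for almost all of a .pyc file; comparing this number between two interpreter builds would show the bytecode-size effect of the new opcodes:

```python
import marshal

# Crude proxy for .pyc size: the marshalled code object is essentially
# the whole .pyc file minus a small fixed header (16 bytes on 3.7+).
src = "def f():\n    return None\n"
size = len(marshal.dumps(compile(src, "<demo>", "exec")))
print(f"{size} bytes of marshalled bytecode")
```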
Yeah, I have no idea where that is coming from. It may be GitHub recognizing the bpo prefix and doing some markdown shenanigans.
Yeah, I normally also use
Here is the full table:
Confirmed, it's GitHub. If you make a comment with something like "bpo-45214", it will automatically add the link even if you write it as plain text.
All the analysis here is out of date now, closing. |
https://bugs.python.org/issue45214