1st1 commented Jan 24, 2018

SSE 4.2 is fairly recent, and there's plenty of hardware out there that doesn't support it. To minimize the risk of a CPython build not running on older CPUs, it's easier to just stop using the popcnt instruction.

I used the following micro-benchmark to make a decision to drop native popcount and always use the portable fallback code:

import time
from _testcapi import hamt


h = hamt()
for i in range(10000):
    h = h.set(str(i), i)

print(len(h), h.get('123'))

st = time.monotonic()
for _ in range(10**6):
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')
    h.get('123')

print(f'{time.monotonic() - st:.3f}s')

The results were the same for builds with and without popcount.
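The portable fallback mentioned above counts bits with plain integer arithmetic, so it runs on any CPU. A minimal sketch of one well-known way to do this for a 32-bit value (the parallel bit-summing trick); the name `portable_popcount32` is hypothetical, and this is not necessarily the exact code in hamt.c:

```c
#include <stdint.h>

/* Hypothetical helper: count the set bits of a 32-bit value without
   relying on any popcount instruction. */
static uint32_t
portable_popcount32(uint32_t i)
{
    /* Sum adjacent bits into 2-bit fields. */
    i = i - ((i >> 1) & 0x55555555u);
    /* Sum adjacent 2-bit fields into 4-bit fields. */
    i = (i & 0x33333333u) + ((i >> 2) & 0x33333333u);
    /* Sum 4-bit fields into bytes, then add the four bytes together
       with a multiply; the total ends up in the top byte. */
    return (((i + (i >> 4)) & 0x0F0F0F0Fu) * 0x01010101u) >> 24;
}
```

The whole function is a handful of shifts, masks, and one multiply, which is consistent with the benchmark showing no measurable difference.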

To test the popcount instruction, I compiled CPython with `CFLAGS="-march=native"`. The lldb session:

{pydev} ~/d/p/cpython (master %) » lldb -- ./python.exe t.py
(lldb) target create "./python.exe"
Current executable set to './python.exe' (x86_64).
(lldb) settings set -- target.run-args  "t.py"
(lldb) breakpoint set --name hamt_bitcount
Breakpoint 1: 5 locations.
(lldb) run
Process 59304 launched: './python.exe' (x86_64)
python.exe was compiled with optimization - stepping may behave oddly; variables may not be available.
Process 59304 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.3
    frame #0: 0x00000001001171f9 python.exe`hamt_node_bitmap_assoc [inlined] hamt_bitcount(i=0) at hamt.c:446 [opt]
   443 	#if defined(__GNUC__) && (__GNUC__ > 4)
   444 	    return (uint32_t)__builtin_popcountl(i);
   445 	#elif defined(__clang__) && (__clang_major__ > 3)
-> 446 	    return (uint32_t)__builtin_popcountl(i);
   447 	#elif defined(_MSC_VER)
   448 	    return (uint32_t)__popcnt(i);
   449 	#else
Target 0: (python.exe) stopped.
(lldb) disassemble --pc
python.exe`hamt_node_bitmap_assoc:
->  0x1001171f9 <+41>: popcntq %rdi, %r12
    0x1001171fe <+46>: btl    %r13d, %eax
    0x100117202 <+50>: jae    0x1001173e6               ; <+534> at hamt.c
    0x100117208 <+56>: leal   (%r12,%r12), %eax
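For context on why the breakpoint lands inside `hamt_node_bitmap_assoc`: a HAMT bitmap node typically stores only its occupied slots in a packed array, and a bit count over the bitmap converts a slot's bit into a position in that array. A rough sketch of the idea, assuming a 32-bit bitmap; `sketch_bitcount` and `sketch_bitindex` are hypothetical names, not the functions in hamt.c:

```c
#include <stdint.h>

/* Hypothetical helper: portable population count of a 32-bit value. */
static uint32_t
sketch_bitcount(uint32_t i)
{
    i = i - ((i >> 1) & 0x55555555u);
    i = (i & 0x33333333u) + ((i >> 2) & 0x33333333u);
    return (((i + (i >> 4)) & 0x0F0F0F0Fu) * 0x01010101u) >> 24;
}

/* Hypothetical helper: position of a slot in the packed array.
   `bit` must have exactly one bit set. The position equals the
   number of occupied slots below that bit in `bitmap`. */
static uint32_t
sketch_bitindex(uint32_t bitmap, uint32_t bit)
{
    return sketch_bitcount(bitmap & (bit - 1));
}
```

With bitmap `0x2D` (slots 0, 2, 3 and 5 occupied), slot 3 maps to packed position 2 because two occupied slots sit below it. If keys and values are stored as adjacent pairs, the key would sit at twice that position, which would match the `leal (%r12,%r12)` doubling right after `popcntq` in the disassembly above.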

https://bugs.python.org/issue32436

1st1 self-assigned this Jan 24, 2018
1st1 added the skip news label Jan 24, 2018
1st1 merged commit b7a80d5 into python:master Jan 24, 2018
1st1 deleted the win branch January 24, 2018 03:17