ENH: speed up 32-bit and 64-bit np.argsort by 5x with AVX-512 #23707
charris merged 7 commits into numpy:main
|
PR numpy/x86-simd-sort#38 should resolve some of the test failures. |
|
Needs rebase. |
|
What is the reasoning behind using |
|
Segfault on 32 bit windows. I think it is legitimate, but may be because of |
I should stick to using |
Yeah, I should have expected that. My AVX512 argsort implementations use i64gather instructions which means I can only use it when |
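For context on the 32-bit failure: the index array returned by np.argsort has dtype np.intp, which is 8 bytes on 64-bit builds and 4 bytes on 32-bit builds. The snippet below is only a quick way to check that width on a given machine. It does not reproduce the dispatch condition used in the PR, and the variable names are illustrative.

```python
import numpy as np

# Small example array; argsort returns the indices that would sort it.
arr = np.array([3.0, 1.0, 2.0], dtype=np.float64)
idx = np.argsort(arr)

# The index dtype is np.intp, whose width follows the platform pointer size.
index_bytes = idx.dtype.itemsize
has_64bit_index = index_bytes == 8

print("index dtype:", idx.dtype)
print("index width in bytes:", index_bytes)
print("64-bit indices:", has_64bit_index)
```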
|
Benchmark numbers: |
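For anyone who wants a rough local comparison between a build with and without this change, here is a minimal timing sketch. It is not the benchmark used for the numbers in this PR. The helper name time_argsort, the array size, and the random input distribution are all assumptions chosen for illustration, and any speedup depends on whether the CPU supports AVX-512.

```python
import timeit

import numpy as np


def time_argsort(dtype, size=200_000, repeat=3):
    """Return the best wall-clock time (seconds) of np.argsort on random data."""
    rng = np.random.default_rng(0)
    if np.issubdtype(dtype, np.integer):
        data = rng.integers(0, size, size=size).astype(dtype)
    else:
        data = rng.random(size).astype(dtype)
    timer = timeit.Timer(lambda: np.argsort(data))
    # Take the minimum over several runs to reduce noise from other processes.
    return min(timer.repeat(repeat=repeat, number=1))


# The 32-bit and 64-bit dtypes named in the PR title.
results = {}
for dt in (np.int32, np.int64, np.float32, np.float64):
    results[np.dtype(dt).name] = time_argsort(dt)
    print(f"{np.dtype(dt).name:>8}: {results[np.dtype(dt).name] * 1e3:.3f} ms")
```

Run the same script against both builds and compare the per-dtype times.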
|
Let's give this a shot. Thanks @r-devulap . |
|
Do you think this needs a release note? I would like to combine release notes for #22315 and this, if that is okay. |
|
I was thinking a release note would be nice, thanks for offering :) The release notes are linked to the PR, so it is probably simpler to just make two. |
Leverages latest optimizations to x86-simd-sort which provides AVX-512 routines for argsort. Algorithm details: