-
Notifications
You must be signed in to change notification settings - Fork 38.7k
For AVX2 code, also check for AVX, XSAVE, and OS support #13471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Note to reviewers: This pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
| bool AVXEnabled() | ||
| { | ||
| uint32_t a, d; | ||
| __asm__("xgetbv" : "=a"(a), "=d"(d) : "c"(0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possible to use _xgetbv intrinsic here? https://software.intel.com/en-us/node/694243
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried that, but it doesn't seem GCC has that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about something like this?:
#ifndef _xgetbv
static inline unsigned long long _xgetbv(unsigned int index){
unsigned int eax, edx;
__asm__ __volatile__("xgetbv" : "=a"(eax), "=d"(edx) : "c"(index));
return ((unsigned long long)edx << 32) | eax;
}
#endif
https://lists.macports.org/pipermail/macports-dev/2013-January/021782.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the same thing, just specialized for AVX (which AFAIK is the only thing it's useful for entirely).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From gcc code I don't think _xgetbv is a mcro
gcc-mirror/gcc@154452f#diff-547c091edc6345b286585818a4ae7528R60
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Access to _xgetbv requires compiling with -mxsave, and you can't run any -mxsave code without verifying you're running on a CPU which supports XSAVE. There are a few options around this:
- Add a separate libbitcoin_crypto_xsave.a library, with just the
AVXEnabled()function (seems a lot of effort) - Put
AVXEnabled()in libbitcoin_crypto_avx2.a (which would then be compiled with-mavx2 -mxsave). Downside is that we'd need to put it in all AVX libraries (there may be a-mavxand/or-mbmi2library too later). - Compile everything with
-mxsave(when supported by the compiler) but just don't call_xgetbvwithout first checking (which in theory might be unsafe as the compiler could emit xgetbv instructions elsewhere).
I think all 3 are worse than 1 line of inline asm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed
| /** Check whether the OS has enabled AVX registers. */ | ||
| bool AVXEnabled() | ||
| { | ||
| uint32_t a, d; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need #ifdef USE_ASM ? I'm really not even sure what the option is intended to mean anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's within a #if defined(USE_ASM) && (defined(__x86_64__) || defined(__amd64__) || defined(__i386__)) block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, nevermind.
|
utACK 14a2ccc |
src/crypto/sha256.cpp
Outdated
| { | ||
| std::string ret = "standard"; | ||
| #if defined(USE_ASM) && (defined(__x86_64__) || defined(__amd64__) || defined(__i386__)) | ||
| (void)AVXEnabled; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Presuming this line is to avoid an unused function warning, how about instead either:
- Commenting as to its purpose
- Defining it in an
#elseblock below - Only defining
AVXEnabledwhendefined(ENABLE_AVX2) && !defined(BUILD_BITCOIN_INTERNAL)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a comment.
I'd rather not add logic around the compilation of the function itself, as it may grow increasingly complex in the future (if we also have avx1 code, for example). It's in an anonymous namespace anyway, so if it's unused, the compiler will drop it.
| } | ||
|
|
||
| /** Check whether the OS has enabled AVX registers. */ | ||
| bool AVXEnabled() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be static
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's inside an anonymous namespace.
14a2ccc to
32d153f
Compare
|
utACK 32d153f - note I did not separately research and validate the bit checking |
… support 32d153f For AVX2 code, also check for AVX, XSAVE, and OS support (Pieter Wuille) Pull request description: Fixes bitcoin#12903. Tree-SHA512: 01e71efb5d3a43c49a145a5b1dc4fe7d0a491e1e78479e7df830a2aaac57c3dcfc316e28984c695206c76f93b68e4350fc037ca36756ca579b7070e39c835da2
Fixes #12903.