In the context of #19816, we found that
[...] clang will also populate __GNUC__, and while the default value is not documented (from what I saw), it just so happens to be 4, see:
In other words, the 4 here is load-bearing for clang support, and we cannot increment it to our lowest supported GCC version, for example. We definitely need to document this.
While looking at bumping the GCC minimum version (#20478), I ended up stumbling over at least one place where we require GCC >=4.4 through macros, which will disable the respective feature on clang (which defaults to GCC 4.1).
|
#if defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && (__GNUC_MINOR__ >= 4))) |
There's also many places where __GNUC__ occurs in the boost submodule, though I haven't audited them all for correctness (and many of them do correctly add || defined(__clang__) or || __clang_major__ >= <some_major_ver>).
I propose that we add explicit guards like boost does (and simplify our logic to avoid __GNUC_MINOR__, since our minimum supported version is far beyond 4.x anyway). Other places that only check whether __GNUC__ is defined should be OK, as clang will set it by default, though we could be more explicit there as well.
Finally, a separate mention for places that intentionally avoid clang. I'm quite sure that in these cases, similar functionality exists on clang as well, and we should enable the same optimizations there as well:
|
#if !defined(__clang__) && defined(__GNUC__) && defined(__GNUC_MINOR__) |
|
#if __GNUC__ >= 5 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4) |
|
#pragma GCC optimize("unroll-loops") |
|
#define cephes_isnan(x) __builtin_isnan(x) |
|
#define cephes_isinf(x) __builtin_isinf(x) |
|
#define cephes_isfinite(x) __builtin_isfinite(x) |
|
#endif |
|
#endif |
|
#if !defined(__clang__) && defined(__GNUC__) && defined(__GNUC_MINOR__) |
|
#if __GNUC__ >= 5 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4) |
|
/* enable auto-vectorizer */ |
|
#pragma GCC optimize("tree-vectorize") |
|
/* float associativity required to vectorize reductions */ |
|
#pragma GCC optimize("unsafe-math-optimizations") |
|
/* maybe 5% gain, manual unrolling with more accumulators would be better */ |
|
#pragma GCC optimize("unroll-loops") |
|
#endif |
|
#endif |
In the context of #19816, we found that
While looking at bumping the GCC minimum version (#20478), I ended up stumbling over at least one place where we require
GCC >=4.4through macros, which will disable the respective feature on clang (which defaults to GCC 4.1).scipy/scipy/_lib/src/ccallback.h
Line 69 in b76ca45
There's also many places where
__GNUC__occurs in the boost submodule, though I haven't audited them all for correctness (and many of them do correctly add|| defined(__clang__)or|| __clang_major__ >= <some_major_ver>).I propose that we add explicit guards like boost does (and simplify our logic to avoid
__GNUC_MINOR__, since our minimum supported version is far beyond 4.x anyway). Other places that only check whether__GNUC__is defined should be OK, as clang will set it by default, though we could be more explicit there as well.Finally, a separate mention for places that intentionally avoid clang. I'm quite sure that in these cases, similar functionality exists on clang as well, and we should enable the same optimizations there as well:
scipy/scipy/special/cephes/mconf.h
Lines 91 to 98 in b76ca45
scipy/scipy/spatial/src/distance_wrap.c
Lines 35 to 44 in b76ca45