BLD: Revamp BLAS/LAPACK G77 ABI wrappers and fix PROPACK segfaults#19855
BLD: Revamp BLAS/LAPACK G77 ABI wrappers and fix PROPACK segfaults#19855rgommers merged 17 commits intoscipy:mainfrom
Conversation
|
I've been noticing sporadic failures on the Windows Per comments on #19816, I think it would be worth taking some time to document and refactor It would also be nice to have a CI test for MKL (or Accelerate once that's ready) to ensure the wrappers are working. FWIW, building this PR with MKL passes all tests for me locally. |
That would indeed be amazing! 🤩 Just for clarity, the segfaults not only occurred on 32-bit machines, but also 64-bit ones, see conda-forge/scipy-feedstock#200. It's mainly on windows (where your wrappers already change/fix things), and MKL, which by default has a G77 ABI AFAIU.
Please try to rebase the commit from #18354 and include it here (or directly include a revert of c73e088). If you want I can do it. |
7c15659 to
087d998
Compare
In particular, it's conceivable that we need to decide on the complex type based on the convention of the fortran compiler being used. Compare for example this logic in blis. |
An empty commit is not really helpful (won't show up in git blame etc.). Please rebase it (or a fresh revert of c73e088) to the start of this PR and layer your changes on top. Edit: if you're not comfortable with |
087d998 to
cb01f18
Compare
3060d1b to
c65abb1
Compare
|
The one failure in the Linux aarch64 job isn't too concerning: The Windows build failure is unrelated. I've pushed a rebase, will see if we can get CI happy here. |
8889d83 to
8650776
Compare
|
One more, for macOS arm64: |
Due to a failure on Linux aarch64, visible in the Cirrus CI job. Also a very small bump in an ARPACK test to avoid a test failure on macOS arm64 in CI. [skip actions] [skip circle]
8650776 to
abd8228
Compare
rgommers
left a comment
There was a problem hiding this comment.
CI is happy now. The changes in _build_utils/src all look good to me, as does re-enabling PROPACK.
I am still looking at _generate_pyx.py a bit more carefully. I tried removing suppression of incompatible pointer warnings (see rgommers@8647d65, important to do soon because it'll break with GCC 14), but this PR didn't make a change there yet - still seeing warnings like these for all six wrapped functions:
[38/41] Compiling C object scipy/linalg/cython_l...linux-gnu.so.p/meson-generated_cython_lapack.c.o
scipy/linalg/cython_lapack.cpython-310-x86_64-linux-gnu.so.p/cython_lapack.c: In function '__pyx_f_5scipy_6linalg_13cython_lapack_cladiv':
scipy/linalg/cython_lapack.cpython-310-x86_64-linux-gnu.so.p/cython_lapack.c:4344:33: warning: passing argument 1 of 'cladivwrp_' from incompatible pointer type [-Wincompatible-pointer-types]
4344 | F_FUNC(cladivwrp, CLADIVWRP)((&__pyx_v_out), ((npy_complex64 *)__pyx_v_x), ((npy_complex64 *)__pyx_v_y));
| ~^~~~~~~~~~~~~
| |
| __pyx_t_float_complex *
In file included from scipy/linalg/cython_lapack.cpython-310-x86_64-linux-gnu.so.p/cython_lapack.c:1112:
scipy/linalg/_lapack_subroutines.h:23:50: note: expected '_Complex float *' but argument is of type '__pyx_t_float_complex *'
23 | void F_FUNC(cladivwrp, CLADIVWRP)(npy_complex64 *ret, npy_complex64 *x, npy_complex64 *y);
| ~~~~~~~~~~~~~~~^~~
scipy
In the generated _blas_subroutines.h the signatures use numpy types:
void F_FUNC(zdotcwrp, ZDOTCWRP)(npy_complex128 *ret, int *n, npy_complex128 *zx, int *incx, npy_complex128 *zy, int *incy);while the generated cython_lapack.c tries to use native types only for the first argument and numpy types for the third argument:
F_FUNC(zladivwrp, ZLADIVWRP)((&__pyx_v_out), ((npy_complex128 *)__pyx_v_x), ((npy_complex128 *)__pyx_v_y));with __pyx_v_out being declared as:
#if CYTHON_CCOMPLEX && (1) && (!0 || __cplusplus)
#ifdef __cplusplus
typedef ::std::complex< double > __pyx_t_double_complex;
#else
typedef double _Complex __pyx_t_double_complex;
#endif
#else
typedef struct { double real, imag; } __pyx_t_double_complex;
#endifcoming from cython_lapack.pyx:
cdef extern from "_lapack_subroutines.h":
void _fortran_zladiv "F_FUNC(zladivwrp, ZLADIVWRP)"(z *out, npy_complex128 *x, npy_complex128 *y) nogilIt's a bit of a messy puzzle, I'm not quite sure what the right solution is. What do you think @thalassemia?
h-vetinari
left a comment
There was a problem hiding this comment.
Overall this is excellent and amazing to see! :)
Just some high-level comments about naming & documentation, to keep this hopefully more easily maintainable.
Thanks! Works for me too, and the generated code looks better. Would you mind including my commit to disable |
This will turn into a hard error in GCC 14, so we should no longer suppress this warning for convenience.
610160a to
4a57261
Compare
Thanks for pointing this out @mattip! In #19816, I added a script to generate C wrappers that route all BLAS/LAPACK function calls to their properly prefixed and/or suffixed names. I plan on iterating on that PR once this one is merged. |
rgommers
left a comment
There was a problem hiding this comment.
Looks ready and CI is happy, so in it goes. Thanks a lot @thalassemia. And thanks @h-vetinari for the review.
Reference issue
Closes gh-15108. Includes all the changes in gh-18354.
What does this implement/fix?
Taking the lessons I learned from implementing Accelerate support, I rewrote the G77 BLAS/LAPACK ABI wrappers entirely in C. These wrappers should be compatible with any Fortran or C code that expects the standard C99
_Complextype. To makecython_blasandcython_lapackwork despite Cython C using struct complex types, I letscipy/linalg/_generate_pyx.pycontinue to generate an intermediate Fortran wrapper between the Cython C and the G77 wrapper C. This PR includes a temporary commit (c69d03b) to make use of changes to the PROPACK repo that would ideally be merged concomitantly with this one.With these changes, I believe I have fixed the segfaults and other issues previously seen with PROPACK, allowing me to re-enable all the relevant tests.
The motivation for rewriting the wrappers in C is to use macros like
BLAS_FUNCandCBLAS_FUNCfromnpy_cblas.hto link BLAS/LAPACK symbols in a case-sensitive manner. This is required for Accelerate, which has symbols of the formcblas_cdotc_sub$NEWLAPACK$ILP64, and should make adding ILP64 support as simple as adding the compile argument-DHAVE_BLAS_ILP64.Additional information
I had to tweak one of the tolerances in the (currently skipped)
test_svds_parameter_toltest intest_svds.pyfor it to pass on my machine. I am not sure why there is a lower bound on the output error, but that was also causing some test failures on my 32-bit Linux VM.