Skip to content

BLD: Revamp BLAS/LAPACK G77 ABI wrappers and fix PROPACK segfaults#19855

Merged
rgommers merged 17 commits intoscipy:mainfrom
thalassemia:g77_wrappers
Feb 28, 2024
Merged

BLD: Revamp BLAS/LAPACK G77 ABI wrappers and fix PROPACK segfaults#19855
rgommers merged 17 commits intoscipy:mainfrom
thalassemia:g77_wrappers

Conversation

@thalassemia
Copy link
Copy Markdown
Contributor

@thalassemia thalassemia commented Jan 10, 2024

Reference issue

Closes gh-15108. Includes all the changes in gh-18354.

What does this implement/fix?

Taking the lessons I learned from implementing Accelerate support, I rewrote the G77 BLAS/LAPACK ABI wrappers entirely in C. These wrappers should be compatible with any Fortran or C code that expects the standard C99 _Complex type. To make cython_blas and cython_lapack work despite Cython C using struct complex types, I let scipy/linalg/_generate_pyx.py continue to generate an intermediate Fortran wrapper between the Cython C and the G77 wrapper C. This PR includes a temporary commit (c69d03b) to make use of changes to the PROPACK repo that would ideally be merged concomitantly with this one.

With these changes, I believe I have fixed the segfaults and other issues previously seen with PROPACK, allowing me to re-enable all the relevant tests.

The motivation for rewriting the wrappers in C is to use macros like BLAS_FUNC and CBLAS_FUNC from npy_cblas.h to link BLAS/LAPACK symbols in a case-sensitive manner. This is required for Accelerate, which has symbols of the form cblas_cdotc_sub$NEWLAPACK$ILP64, and should make adding ILP64 support as simple as adding the compile argument -DHAVE_BLAS_ILP64.

Additional information

I had to tweak one of the tolerances in the (currently skipped) test_svds_parameter_tol test in test_svds.py for it to pass on my machine. I am not sure why there is a lower bound on the output error, but that was also causing some test failures on my 32-bit Linux VM.

@github-actions github-actions bot added scipy.sparse.linalg scipy.linalg scipy.sparse CI Items related to the CI tools such as CircleCI, GitHub Actions or Azure C/C++ Items related to the internal C/C++ code base Fortran Items related to the internal Fortran code base Cython Issues with the internal Cython code base Meson Items related to the introduction of Meson as the new build system for SciPy BLD Build issues Issues with building from source, including different choices of architecture, compilers and OS labels Jan 10, 2024
@lucascolley lucascolley removed the BLD label Jan 10, 2024
@lucascolley lucascolley removed Fortran Items related to the internal Fortran code base Cython Issues with the internal Cython code base CI Items related to the CI tools such as CircleCI, GitHub Actions or Azure labels Jan 10, 2024
@lucascolley lucascolley added this to the 1.13.0 milestone Jan 10, 2024
@thalassemia
Copy link
Copy Markdown
Contributor Author

thalassemia commented Jan 10, 2024

I've been noticing sporadic failures on the Windows cp311 (build sdist + wheel), full, no pythran test for a while now, and I do not know what is causing it.

Per comments on #19816, I think it would be worth taking some time to document and refactor scipy/linalg/_generate_pyx.py, from which I copied a lot of code when writing scipy/linalg/_generate_blas_wrapper.py in that PR. Should I create a separate PR for that or do it here?

It would also be nice to have a CI test for MKL (or Accelerate once that's ready) to ensure the wrappers are working. FWIW, building this PR with MKL passes all tests for me locally.

@h-vetinari
Copy link
Copy Markdown
Member

h-vetinari commented Jan 11, 2024

With these changes, I believe I have fixed the segfaults and other issues previously seen with PROPACK on 32-bit machines

That would indeed be amazing! 🤩

Just for clarity, the segfaults not only occurred on 32-bit machines, but also 64-bit ones, see conda-forge/scipy-feedstock#200. It's mainly on windows (where your wrappers already change/fix things), and MKL, which by default has a G77 ABI AFAIU.

Includes all the changes in gh-18354.

Please try to rebase the commit from #18354 and include it here (or directly include a revert of c73e088). If you want I can do it.

@thalassemia thalassemia changed the title BLD: Revamp BLAS/LAPACK G77 ABI wrappers and fix PROPACK on x86 BLD: Revamp BLAS/LAPACK G77 ABI wrappers and fix PROPACK segfaults Jan 11, 2024
@h-vetinari
Copy link
Copy Markdown
Member

Just for clarity, the segfaults not only occurred on 32-bit machines, but also 64-bit ones, [...]

In particular, it's conceivable that we need to decide on the complex type based on the convention of the fortran compiler being used. Compare for example this logic in blis.

@h-vetinari
Copy link
Copy Markdown
Member

h-vetinari commented Jan 11, 2024

Please try to rebase the commit from #18354 and include it here (or directly include a revert of c73e088).

An empty commit is not really helpful (won't show up in git blame etc.). Please rebase it (or a fresh revert of c73e088) to the start of this PR and layer your changes on top.

Edit: if you're not comfortable with git rebase -i, I can do this for you.

@rgommers
Copy link
Copy Markdown
Member

The one failure in the Linux aarch64 job isn't too concerning:

FAILED scipy/sparse/linalg/_eigen/tests/test_svds.py::Test_SVDS_PROPACK::test_svd_simple[csc_matrix-True-True-4-A1] - AssertionError: 
Not equal to tolerance rtol=1e-07, atol=3e-10

Mismatched elements: 2 / 16 (12.5%)
Max absolute difference: 6.35983951e-10
Max relative difference: 3.99680289e-15
 x: array([[ 1.000000e+00,  2.814834e-15,  2.269493e-12, -6.359840e-10],
       [ 2.814834e-15,  1.000000e+00,  5.029722e-14, -1.138435e-11],
       [ 2.269493e-12,  5.029722e-14,  1.000000e+00, -2.409678e-13],
       [-6.359840e-10, -1.138435e-11, -2.409678e-13,  1.000000e+00]])
 y: array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]]

The Windows build failure is unrelated. I've pushed a rebase, will see if we can get CI happy here.

@rgommers
Copy link
Copy Markdown
Member

One more, for macOS arm64:

scipy/sparse/linalg/_eigen/tests/test_svds.py:311: in test_svds_parameter_tol
    assert error < accuracy
E   assert 2.087038755703438e-15 < 2e-15

Due to a failure on Linux aarch64, visible in the Cirrus CI job.

Also a very small bump in an ARPACK test to avoid a test failure
on macOS arm64 in CI.

[skip actions] [skip circle]
Copy link
Copy Markdown
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI is happy now. The changes in _build_utils/src all look good to me, as does re-enabling PROPACK.

I am still looking at _generate_pyx.py a bit more carefully. I tried removing suppression of incompatible pointer warnings (see rgommers@8647d65, important to do soon because it'll break with GCC 14), but this PR didn't make a change there yet - still seeing warnings like these for all six wrapped functions:

[38/41] Compiling C object scipy/linalg/cython_l...linux-gnu.so.p/meson-generated_cython_lapack.c.o
scipy/linalg/cython_lapack.cpython-310-x86_64-linux-gnu.so.p/cython_lapack.c: In function '__pyx_f_5scipy_6linalg_13cython_lapack_cladiv':
scipy/linalg/cython_lapack.cpython-310-x86_64-linux-gnu.so.p/cython_lapack.c:4344:33: warning: passing argument 1 of 'cladivwrp_' from incompatible pointer type [-Wincompatible-pointer-types]
 4344 |   F_FUNC(cladivwrp, CLADIVWRP)((&__pyx_v_out), ((npy_complex64 *)__pyx_v_x), ((npy_complex64 *)__pyx_v_y));
      |                                ~^~~~~~~~~~~~~
      |                                 |
      |                                 __pyx_t_float_complex *
In file included from scipy/linalg/cython_lapack.cpython-310-x86_64-linux-gnu.so.p/cython_lapack.c:1112:
scipy/linalg/_lapack_subroutines.h:23:50: note: expected '_Complex float *' but argument is of type '__pyx_t_float_complex *'
   23 | void F_FUNC(cladivwrp, CLADIVWRP)(npy_complex64 *ret, npy_complex64 *x, npy_complex64 *y);
      |                                   ~~~~~~~~~~~~~~~^~~
scipy

In the generated _blas_subroutines.h the signatures use numpy types:

void F_FUNC(zdotcwrp, ZDOTCWRP)(npy_complex128 *ret, int *n, npy_complex128 *zx, int *incx, npy_complex128 *zy, int *incy);

while the generated cython_lapack.c tries to use native types only for the first argument and numpy types for the third argument:

  F_FUNC(zladivwrp, ZLADIVWRP)((&__pyx_v_out), ((npy_complex128 *)__pyx_v_x), ((npy_complex128 *)__pyx_v_y));

with __pyx_v_out being declared as:

#if CYTHON_CCOMPLEX && (1) && (!0 || __cplusplus)
  #ifdef __cplusplus
    typedef ::std::complex< double > __pyx_t_double_complex;
  #else
    typedef double _Complex __pyx_t_double_complex;
  #endif
#else
    typedef struct { double real, imag; } __pyx_t_double_complex;
#endif

coming from cython_lapack.pyx:

cdef extern from "_lapack_subroutines.h":
    void _fortran_zladiv "F_FUNC(zladivwrp, ZLADIVWRP)"(z *out, npy_complex128 *x, npy_complex128 *y) nogil

It's a bit of a messy puzzle, I'm not quite sure what the right solution is. What do you think @thalassemia?

Copy link
Copy Markdown
Member

@h-vetinari h-vetinari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this is excellent and amazing to see! :)

Just some high-level comments about naming & documentation, to keep this hopefully more easily maintainable.

@thalassemia
Copy link
Copy Markdown
Contributor Author

Thanks for pointing out those warnings and investigating @rgommers! On my machine, casting the return type variable to the expected Numpy type gets rid of those warnings (3ccf67c).

@rgommers
Copy link
Copy Markdown
Member

Thanks for pointing out those warnings and investigating @rgommers! On my machine, casting the return type variable to the expected Numpy type gets rid of those warnings (3ccf67c).

Thanks! Works for me too, and the generated code looks better. Would you mind including my commit to disable -Wno-incompatible-pointer-types in this PR? With that and the update to the PROPACK module, this looks ready to merge.

@thalassemia
Copy link
Copy Markdown
Contributor Author

Note the use of scipy-openblas wheels requires a scipy_ prefix, which in turn requires something like defining BLAS_SYMBOL_PREFIX=scipy_ (done by the pkgconfig file in the wheel) and using the BLAS_FUNC and CBLAS_FUNC macros.

Thanks for pointing this out @mattip! In #19816, I added a script to generate C wrappers that route all BLAS/LAPACK function calls to their properly prefixed and/or suffixed names. I plan on iterating on that PR once this one is merged.

Copy link
Copy Markdown
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ready and CI is happy, so in it goes. Thanks a lot @thalassemia. And thanks @h-vetinari for the review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Build issues Issues with building from source, including different choices of architecture, compilers and OS C/C++ Items related to the internal C/C++ code base Meson Items related to the introduction of Meson as the new build system for SciPy scipy.linalg scipy.sparse.linalg scipy.sparse

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BUG: Seg. fault in scipy.sparse.linalg tests in PROPACK

5 participants