CI/BLD: use scipy-openblas wheel when building#20585
Conversation
4737ad9 to
8214296
Compare
|
The |
rgommers
left a comment
There was a problem hiding this comment.
This need for LD_LIBRARY_PATH is quite annoying. @ev-br pinged me about an issue with this as well yesterday. I haven't been able to reproduce it myself, but my conclusion was that Libs: -L{libdir} -lscipy_openblas in the generated .pc file is not reliable since it depends on the loader's library search path configuration as well as whether site-packages is under sysroot or not, and it should be replaced with /abspath/to/scipy_openblas.so to ensure the built extension modules will always find scipy_openblas.so at runtime.
All of the boilerplate of copying shared libraries around should be reduceable to:
$ python -c "import scipy_openblas32; print(scipy_openblas32.get_pkg_config())" > scipy-openblas.pc
$ export PKG_CONFIG_PATH=$PWD
Does that sound right to you @mattip?
|
For LD_LIBRARY_PATH, my band-aid fix was to copy the .so file from build/ to build-install: https://github.com/ev-br/openblas-bench/blob/main/.github/workflows/codspeed.yml#L42 Still puzzling why the two libs have different linkage. |
|
LD_LIBRARY_PATH is needed at runtime, not build time. The linker arguments in the pkg_config file are for build time, not runtime. In a proper wheel, delocate/auditwheel are used so LD_LIBRARY_PATH is no longer needed. But in the CI workflows, we need to either copy the dependencies into somwhere like /usr/local/lib, which is what the current scripts do, or use LD_LIBRARY_PATH |
|
I know it's a runtime thing, I am saying that it should not be needed. It is bad for a build to succeed and then fail at runtime.
We have a goal of wanting it to work without vendoring. Also, folks are trying to use the I don't think it's hard. It should only require passing an absolute path rather than |
Makes sense. I can try this next week |
That seems to work on linux. How should this work on macos? Linking with the absolute path does not preserve the path for runtime, nor does using |
This is what conda-forge compilers use on macOS, so I expect it to work: It would be surprising if it works when given via |
|
The The missing part is that I need to do this in order to "burn" the I am still trying to figure out how to set the |
|
The or with: To test that, you can fix it locally like this: After that, the install name is correct: and adding You can also try checking install names of for example the tl;dr rebuilding |
I updated the wheels so those two lines are all that is needed. The error which indicates something is not working right with the |
|
Great, much nicer! I'll try to have a look at the The macOS job failed, detection didn't work: but I assume that's because this is now a pared-down PR focused on Linux? |
|
Yes, once the speediest build job to use a scipy-openblas wheel works (which I think happens to be the musllinux one), I will clean this all up for the other build jobs and wheel builds. |
|
If it helps to debug the problem, I can clean up the macos one so it too fails to apply the prefix? |
|
The reason the symbol prefix is not being honored is because we currently only generate the required BLAS/LAPACK wrappers for Accelerate builds to handle their special symbol suffixes. Since scipy-openblas has a special symbol prefix, we can fix this by enabling wrapper generation for those builds as well (the commit I added). The wrappers should only be generated when there is a symbol prefix or suffix because they result in infinite recursion when the wrapper symbols are the same as the BLAS/LAPACK symbols (as they are when there is no prefix/suffix). Could we automatically detect whether a BLAS/LAPACK library has prefixes/suffixes instead of enabling wrapper generation on a case-by-case basis? |
|
Ah of course, good catch @thalassemia.
Not fully automatically, but we can specify a set of "these are the set of potential prefixes/suffixes to check". E.g.: https://github.com/numpy/meson/blob/4e370ca8ab73c07f7b84abe8a4b937caace050a4/mesonbuild/dependencies/blas_lapack.py#L398-L409 For now doing it manually is the way to go, but we can add a standard set to check in Meson. I think it mainly matters for OpenBLAS (and most ILP64 builds of course). |
c9b77bb to
30c9033
Compare
|
The |
|
testing locally, setting OPENBLAS_NUM_THREADS=1 speeds up linalg tests by 50%, even when running tests without xdist. |
|
That solved the slow tests. Now a test is failing on numpy/numpy#26439: |
|
Should this PR remove the openblas preference from the meson defaults? It seems to be working when using scipy-openblas regardless of this setting Lines 10 to 18 in b421cd6 |
I don't think so. A PR for auto-detection in the same way as NumPy does it is still to be opened. That will overhaul all this, for now I don't see a reason to touch it. |
|
Ci (except for "owner approval") is passing. Should I trigger the wheel building, or leave it for a future PR? |
|
Great! Please do trigger the wheel builds, since the changes here affect wheel builds. |
|
CI is passing, I think this might be ready for a review. |
|
There was a merge conflict, so I rebased, cleaned this up a bit, and squashed all the commits. |
| # This does not work because pkg-config does not like backslashes, | ||
| PKG_CONFIG_PATH="{project}" | ||
| # do this instead (which will override this setting) | ||
| # set CIBW_ENVIRONMENT_WINDOWS=PKG_CONFIG_PATH=PWD.replace('\\', '/') |
There was a problem hiding this comment.
what is happening with this comment?
There was a problem hiding this comment.
You cannot set PKG_CONFIG_PATH to "{project}" on windows, the path native separators confuse pkg-config. So you have to set it via CIBW_ENVIRONMENT_WINDOWS instead. This is a comment for people wishing to build wheels that they must use that alternative. Should I remove it?
There was a problem hiding this comment.
Ah okay. I didn't really understand that from the comment - let's just clarify it instead.
To be honest I also don't understand the "this does not work" comment (which I now think is also aimed at folks wanting to build locally?), nor what "{project}" expands to and which tool does the expanding.
There was a problem hiding this comment.
Okay, let's deal with this loose end post-merge. Time to get this in, so we have it in the 1.14.x branch - that'll make maintenance easier.
|
Thanks Matti! This looks quite good, and makes things significantly easier to understand. |
|
It seems some of the windows tests are still slower now, there are occasional CI failures due to test durations being over the limits. Was that happening before this PR? |
|
I think what you're seeing is gh-20806, and that's due to CI machinery changes unrelated to this PR. |
|
|
||
| echo "REPAIR_PATH=$LIB_PATH" >> "$GITHUB_ENV" | ||
| GFORTRAN_LIB="\$(dirname \$(gfortran --print-file-name libgfortran.dylib))" | ||
| CIBW="DYLD_LIBRARY_PATH=$GFORTRAN_LIB:$LIB_PATH delocate-listdeps {wheel} &&\ |
There was a problem hiding this comment.
So the changes in this code block seem to be causing the problems. We already had two libgfortran's here, but I bet setting DYLD_LIBRARY_PATH here made OpenBLAS also see the one from gfortran rather than the one in the openblas tarball, or something like that.
Replaces #20074. Uses the scipy-openblas32 wheel for building Scipy
The first commit changes linux wheel builds to use the scipy-openblas32 wheel
Reference issue
What does this implement/fix?
Add a new cibw_before_build.sh scriptChangecibw_before_build*.shscripts to use scipy-openblas wheelsAdditional information
cc @rgommers