Skip to content

[MRG] Remove BLAS infos from show_versions and build log#14205

Merged
rth merged 13 commits intoscikit-learn:masterfrom
jeremiedbb:remove-blas-infos
Jul 18, 2019
Merged

[MRG] Remove BLAS infos from show_versions and build log#14205
rth merged 13 commits intoscikit-learn:masterfrom
jeremiedbb:remove-blas-infos

Conversation

@jeremiedbb
Copy link
Copy Markdown
Member

Fixes #14096

  • remove BLAS infos from show_versions()
  • remove BLAS from the dependencies in advanced installations doc
  • remove BLAS infos displayed in the build log

Under Debian-based operating systems, which include Ubuntu::
Installing from source without conda or pip requires you to have installed the
scikit-learn runtime dependencies, Python development headers and a working
C/C++ compiler. Under Debian-based operating systems, which include Ubuntu::
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure to get the purpose of this section. It seems to give the dependencies if you want to build without pip or conda. But everywhere else we only explain how to install everything with pip.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to build from source. pip can't install python headers or compilers. If you want to build from source without conda you have to install compiler and headers.
Listing the python packages here is a bit strange, though. I think that is to avoid having to build them, these instructions predate binary wheels.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think part of this can be removed, but we should have instructions on how to build without conda.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I see. I updated a bit. Let me know what you think

@jnothman
Copy link
Copy Markdown
Member

jnothman commented Jul 2, 2019

Will we still want BLAS details from the scipy installation in order to diagnose efficiency issues??

@jeremiedbb
Copy link
Copy Markdown
Member Author

Will we still want BLAS details from the scipy installation in order to diagnose efficiency issues??

We can get some infos about BLAS from numpy and scipy installations with numpy/scipy.get_config(). I added them to show_versions.

However we need to be careful with this as it may not always display the right info. I don't know in which situation it can happen... For instance, if I pip install scipy in a conda env, the path in library_dirs points to usr/lib (where I have an openblas) instead of to the lib folder of site_packages/scipy in my conda env (ldd points to that folder).

@jnothman
Copy link
Copy Markdown
Member

jnothman commented Jul 2, 2019 via email

@rth
Copy link
Copy Markdown
Member

rth commented Jul 2, 2019

Does that imply that get_config is buggy?

I think get_config correctly solves the problem of which BLAS to link against if you were to build from sources, but it won't tell you if numpy uses the MKL BLAS with conda and scipy uses OpenBlas as it was installed with pip.

For instance, https://stackoverflow.com/a/37190672/1791279 illustrates that there is no easy solution.

Maybe this is something threadpoolctl could detect I'm not sure, but for now I think it's OK to merge this as the currently displayed information can be misleading anyway.

@jeremiedbb
Copy link
Copy Markdown
Member Author

(sorry I wrote it wrong, it's show_config)

@jeremiedbb
Copy link
Copy Markdown
Member Author

thank's for the explanation.

threadpoolctl do detect it. It acts like ldd, so it finds the exact lib<whatever blas>.so that is loaded when you import numpy or scipy

/usr/lib/atlas-base/atlas/libblas.so.3
sudo update-alternatives --set liblapack.so.3 \
/usr/lib/atlas-base/atlas/liblapack.so.3
pip3 install numpy scipy cython
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not installing python3-numpy etc? e.g. on ARM you don't have wheel I think while there are distributions, in which case using the system numpy/scipy makes the most sense.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently this is under installing build dependencies. I don't think installing system numpy scipy is the usual on ubuntu (which is the target of this command). Maybe we can add how to install system numpy scipy cython as an alternative when binary wheels aren't available ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The remark of @rth is relevant regarding ARM is relevant. The last time I played with an ARM, I needed to install numpy from python3-numpy to be able to compile scikit-learn from source.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a note to install system dependencies when wheels aren't available

numpy.show_config()

print('\nScipy BLAS/LAPACK:')
scipy.show_config()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This prints half a page of text, we don't really want it in issue reports I think? Also, wouldn't it suffer from the same limitations as the previous _get_blas_info ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think show_config is more accurate in general. It's based on infos gathered when building numpy, whereas the previous _get_blas_info takes its infos from numpy.distutils.system_info.get_info which gather infos at run time.

But I'm really not sure how both work so I might be completely wrong...

/usr/lib/openblas-base/libopenblas.so.0
sudo update-alternatives --set liblapack.so.3 \
/usr/lib/lapack/liblapack.so.3
sudo apt-get install python3-matplotlib
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why system matplotlib? Seems inconsistent with numpy/scipy above.

On Red Hat and clones (e.g. CentOS), install the dependencies using::

sudo yum -y install gcc gcc-c++ numpy python-devel scipy
sudo yum -y install gcc gcc-c++ python-devel numpy scipy
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, inconsistent with above.
I don't care that much which way we go but we should be consistent. I slightly prefer the pip way.

@jeremiedbb
Copy link
Copy Markdown
Member Author

on ARM you don't have wheel I think

@rth maybe soon: see :)

@jeremiedbb
Copy link
Copy Markdown
Member Author

On second thought, I removed numpy/scipy show_config from show_versions. As Roman pointed out, the output is very large and can display a wrong information. Besides, it's not as before when we had a bundled BLAS and we needed to show infos about that. Most issues are not BLAS related, and when we suspect it might be we can ask users to report numpy/scipy.show_config().

@jeremiedbb
Copy link
Copy Markdown
Member Author

I tried to come up to more consistent instructions to install the build dependencies, but I don't know how to update it for centOS. Does anyone can help me with that ?

@glemaitre glemaitre added this to the 0.21.3 milestone Jul 18, 2019
Copy link
Copy Markdown
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see I did not see the change regarding both system and pip are mentioned.
So LGTM then.

Copy link
Copy Markdown
Member

@rth rth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment, otherwise LGTM.

/usr/lib/lapack/liblapack.so.3
sudo apt-get install cython3 python3-numpy python3-scipy python3-matplotlib

On Red Hat and clones (e.g. CentOS), install the dependencies using::
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add

To use a high performance BLAS library (e.g. OpenBlas) see `scipy installation instructions <https://docs.scipy.org/doc/scipy/reference/building/linux.html>`_.

.. note::

To use a high performance BLAS library (e.g. OpenBlas) see
`scipy installation instructions <https://docs.scipy.org/doc/scipy/reference/building/linux.html>`_.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a line break, to keep below 80 char length.

@rth
Copy link
Copy Markdown
Member

rth commented Jul 18, 2019

Merging. Thanks @jeremiedbb !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

scikit-learn 0.21.x not linking MKL BLAS

5 participants