BLD Enable parallel cythonization #14203

Merged
adrinjalali merged 5 commits into scikit-learn:master from rth:parallel-cythonization
Jun 28, 2019
Member

@rth rth commented Jun 27, 2019

Cythonization during build takes a while (~1 min on my laptop). This enables parallel cythonization by default when joblib.effective_n_jobs is available (older versions don't have it, and we don't want to make it a mandatory build dependency).

The scaling with CPU cores is almost ideal (I imagine up to 8-10 cores).

This should improve CI build time a bit, since most CI agents have 2 CPU cores that we can use.
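
A minimal sketch of the idea described above, not the exact code in this PR: take a worker count from joblib when it is importable and exposes effective_n_jobs, and fall back to a single process otherwise. The helper name `cythonize_n_jobs` is hypothetical. Cython's `cythonize(..., nthreads=...)` is mentioned only in a comment so the snippet runs without Cython installed.

```python
def cythonize_n_jobs(default=1):
    """Return how many processes to use for cythonization (hypothetical helper)."""
    try:
        import joblib
        # -1 asks joblib for every CPU it considers usable.
        return max(1, int(joblib.effective_n_jobs(-1)))
    except (ImportError, AttributeError):
        # joblib is missing, or too old to expose effective_n_jobs.
        return default


n_jobs = cythonize_n_jobs()
print(f"cythonizing with {n_jobs} process(es)")
# The result would then be handed to Cython, for example:
#   cythonize(extensions, nthreads=n_jobs)
```

Falling back to 1 keeps joblib an optional build dependency: the build still works without it, only serially.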

try:
    import joblib
    n_jobs = getattr(joblib, "effective_n_jobs")()
except (ImportError, AttributeError):
Member

Why not just n_jobs = joblib.effective_n_jobs()?

Member Author

Joblib 0.11 doesn't have effective_n_jobs, I believe.

Member

I mean, why do you need to use getattr? An AttributeError would also be raised by trying to call joblib.effective_n_jobs.
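
To illustrate this point with a stand-in object: getattr without a default raises the same AttributeError as plain attribute access, so one except clause covers both spellings. The `old_joblib` namespace below is hypothetical and only mimics a joblib release that lacks effective_n_jobs.

```python
from types import SimpleNamespace

# Hypothetical stand-in for a joblib version lacking effective_n_jobs.
old_joblib = SimpleNamespace(cpu_count=lambda: 2)


def raises_attribute_error(func):
    """Return True if calling func() raises AttributeError."""
    try:
        func()
    except AttributeError:
        return True
    return False


# Both spellings fail at the attribute lookup, before any call happens.
via_getattr = raises_attribute_error(
    lambda: getattr(old_joblib, "effective_n_jobs")()
)
via_dot = raises_attribute_error(lambda: old_joblib.effective_n_jobs())
print(via_getattr, via_dot)
```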

Member Author

Never mind, it exists. My concern was that prior to joblib 0.13.0 (which included this fix: https://github.com/tomMoral/loky/pull/114), the number of usable CPUs was over-estimated in some cases, such as a CI machine that has 32 CPU cores of which only 2 can actually be used.

Changed this PR accordingly.
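
The kind of over-estimation described in this comment can be pictured with the standard library alone. This is a rough sketch, and not necessarily what joblib or the linked loky fix do internally. os.cpu_count() reports the machine's logical CPUs, while os.sched_getaffinity(0), where the platform provides it, reports the CPUs this process may actually run on. The helper name `usable_cpu_count` is hypothetical, and container CPU quotas are not considered here.

```python
import os


def usable_cpu_count():
    """Best-effort count of CPUs this process can use (hypothetical helper)."""
    total = os.cpu_count() or 1
    if hasattr(os, "sched_getaffinity"):
        # The affinity mask can be smaller than the machine's CPU count,
        # for example on a shared CI host.
        return min(total, len(os.sched_getaffinity(0)))
    return total


print("machine:", os.cpu_count(), "usable:", usable_cpu_count())
```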

Member

@adrinjalali adrinjalali left a comment

This made me very happy :) Thanks @rth

@adrinjalali adrinjalali merged commit 98ca716 into scikit-learn:master Jun 28, 2019
@rth rth deleted the parallel-cythonization branch June 28, 2019 10:54
koenvandevelde pushed a commit to koenvandevelde/scikit-learn that referenced this pull request Jul 12, 2019
* Parallel cythonization by default

* Address review comment

* Fix

* Better comment wording

* Use contextlib.suppress