
[MRG+1] Backend hints and shared memory constraints#602

Merged
ogrisel merged 5 commits into joblib:master from ogrisel:backend-hints
Feb 7, 2018

Conversation

@ogrisel
Contributor

@ogrisel ogrisel commented Jan 24, 2018

Sorry, I deleted my remote branch by accident and it closed #595. Here is the same PR again.

This is an alternative implementation of #537. I reimplemented it from scratch because joblib had diverged a bit and I did not agree with the semantics of constraints violations in #537.

TODO:

  • test by running RandomForestClassifier on a dask-distributed cluster and check that fit runs trees in parallel (done with this branch of sklearn: scikit-learn/scikit-learn@master...ogrisel:joblib-backend-hints);
  • decide whether we should use prefer='threads' and require='sharedmem' or keep the currently implemented boolean flags for hinting and hard constraints;
  • update Parallel docstring to document the new options.
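The string-based API mentioned in the second TODO item (which is what was ultimately merged) can be illustrated with a small sketch; the calls below assume a joblib version that includes this PR:

```python
# Sketch of the string-based hint/constraint API discussed above.
from math import sqrt
from joblib import Parallel, delayed

# A soft hint: the threading backend is preferred, but an active
# parallel_backend context manager may still override it.
hinted = Parallel(n_jobs=2, prefer="threads")(
    delayed(sqrt)(i) for i in range(10))

# A hard constraint: shared-memory semantics are required, so a
# thread-based backend is always used, even inside a
# parallel_backend("dask") context.
results = Parallel(n_jobs=2, require="sharedmem")(
    delayed(sqrt)(i) for i in range(10))
```

The hint expresses a preference that can be overridden; the requirement is honored unconditionally, which is what the "hard constraints" wording above refers to.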

@ogrisel ogrisel changed the title Backend hints and shared memory constraints [MRG] Backend hints and shared memory constraints Jan 24, 2018
@codecov

codecov bot commented Jan 24, 2018

Codecov Report

Merging #602 into master will increase coverage by 0.09%.
The diff coverage is 96.58%.


@@            Coverage Diff             @@
##           master     #602      +/-   ##
==========================================
+ Coverage   94.83%   94.93%   +0.09%     
==========================================
  Files          39       39              
  Lines        5287     5389     +102     
==========================================
+ Hits         5014     5116     +102     
  Misses        273      273
| Impacted Files | Coverage Δ |
|---|---|
| joblib/parallel.py | 98.7% <100%> (+0.11%) ⬆️ |
| joblib/_parallel_backends.py | 94.82% <100%> (+0.96%) ⬆️ |
| joblib/test/test_parallel.py | 95.9% <94.52%> (-0.21%) ⬇️ |
| joblib/backports.py | 93.75% <0%> (-2.09%) ⬇️ |
| joblib/_store_backends.py | 90.47% <0%> (-0.53%) ⬇️ |
| joblib/test/test_memory.py | 98.14% <0%> (+0.37%) ⬆️ |
| joblib/memory.py | 95.54% <0%> (+0.59%) ⬆️ |

Continue to review full report at Codecov.

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update feb1188...808b27a. Read the comment docs.

@GaelVaroquaux GaelVaroquaux changed the title [MRG] Backend hints and shared memory constraints [MRG+1] Backend hints and shared memory constraints Jan 25, 2018
@GaelVaroquaux
Member

This looks great. I am in favor of merging it.

I'd be interested in having the point of view of @jcrist, given that this is work that he started. @jcrist, would you have time to review it?

@TomAugspurger

TomAugspurger commented Jan 25, 2018

Quick results from training a RandomForestClassifier and an ExtraTreesClassifier using @jcrist's benchmark, with three backends:

  • threading (4 cores I think)
  • loky (4 cores I think)
  • dask.distributed cluster with 8 workers, 4 cores each

Classification performance:

| Classifier | Backend | Train time | Test time | Error rate |
|---|---|---|---|---|
| RandomForest | dask.distributed | 11.9515s | 0.6102s | 0.0296 |
| RandomForest | threading | 25.0732s | 0.6096s | 0.0296 |
| RandomForest | loky | 30.9979s | 0.6114s | 0.0296 |
| ExtraTreesClassifier | dask.distributed | 16.1022s | 0.8105s | 0.0325 |
| ExtraTreesClassifier | threading | 27.1128s | 0.7095s | 0.0325 |
| ExtraTreesClassifier | loky | 32.8696s | 0.8143s | 0.0325 |

https://github.com/tomaugspurger/joblib-distributed-benchmark. I need to re-run the multiply-nested CV example with the correct patches applied.

@ogrisel
Contributor Author

ogrisel commented Jan 25, 2018

Thanks. This confirms that threading is optimal for RF-dominated workloads when run on a single machine. The individual tasks in this benchmark are probably too short to be run efficiently on a distributed cluster, given the amount of computing resources used (~2x speedup for 8x more machines). The data is probably transferred redundantly many times with the current state of the joblib distributed connector. Running the same benchmark with the dask-ml implementation of cross-validation and random search should make it possible to confirm this. A joblib.shelve (#593) primitive, used automatically under the hood by Parallel for redundant input arguments, might help fix this while preserving the eager numpy-oriented API of scikit-learn. This should be explored in a separate PR, though.

@TomAugspurger

> The data is probably transferred redundantly many times with the current state of the joblib distributed connector.

Indeed, I forgot to scatter the data in the outer parallel_backend call. I'll repeat the benchmark later with this added.
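The fix being referred to can be sketched roughly as follows, using the `scatter` argument of joblib's dask backend; the local in-process cluster here is purely for illustration (a real run would connect to the remote scheduler):

```python
# Sketch: pre-scattering large arrays with joblib's dask backend so
# they are shipped to the workers once, instead of being serialized
# along with every individual task.
import numpy as np
from dask.distributed import Client
from joblib import Parallel, parallel_backend, delayed

client = Client(processes=False)  # illustrative local cluster

X = np.random.randn(5000, 20)

def chunk_norm(data, i):
    # Operates on a strided slice of the shared array.
    return float(np.linalg.norm(data[i::10]))

# X is listed in `scatter`, so tasks taking it as an argument reuse
# the already-distributed copy rather than re-sending it each time.
with parallel_backend("dask", scatter=[X]):
    norms = Parallel()(delayed(chunk_norm)(X, i) for i in range(10))

client.close()
```

Without `scatter`, each of the ten tasks would carry its own serialized copy of `X`, which is exactly the redundant transfer described above.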

Contributor

@jcrist jcrist left a comment


Apologies for the delay in review. Overall this looks fine to me. I left a few points that intersect with design issues I mentioned here: #537 (comment).

One rough edge on the backend selection code as currently implemented (and in this PR) is that n_jobs is coupled to the backend, and can be specified in a number of different ways:

  • Keyword to parallel_backend
  • Keyword to Parallel
  • Global default value

The defaults for these differ: parallel_backend defaults to -1 while Parallel defaults to 1. This makes it tricky in either case to determine whether n_jobs was explicitly set or left at its default. Both default values should perhaps be changed to None to make explicit settings easier to detect. The default semantics (either -1 or 1) should probably also be unified.

I believe the behavior should be:

  • If n_jobs is explicitly set in Parallel, it should be respected even if set in parallel_backend
  • If n_jobs is not explicitly set in Parallel, then n_jobs from parallel_backend should be used. However, if the backend from parallel_backend is ignored due to requirements, then the n_jobs from parallel_backend should also be ignored.

More discussion of this was given in #52 (the first take at allowing overrides).
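The proposed precedence rules can be sketched as a small resolution function; this is a minimal illustration with hypothetical names, not joblib's actual code, and it assumes both defaults have been changed to None as suggested:

```python
# Sketch of the n_jobs precedence rules proposed above (hypothetical
# helper, assuming both defaults are None so explicit settings are
# detectable).
def resolve_n_jobs(parallel_n_jobs, context_n_jobs, context_backend_ignored):
    """Return the effective n_jobs for a Parallel call."""
    if parallel_n_jobs is not None:
        # Explicitly set on Parallel: always respected, even inside
        # a parallel_backend context.
        return parallel_n_jobs
    if context_n_jobs is not None and not context_backend_ignored:
        # Inherit from parallel_backend, unless its backend was
        # discarded because of a requirement such as 'sharedmem'.
        return context_n_jobs
    return 1  # unified library default

assert resolve_n_jobs(4, 16, False) == 4      # Parallel wins
assert resolve_n_jobs(None, 16, False) == 16  # inherited from context
assert resolve_n_jobs(None, 16, True) == 1    # backend ignored: all-or-nothing
```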

"as the latter does not provide shared memory semantics."
% (sharedmem_backend.__class__.__name__,
backend.__class__.__name__))
return sharedmem_backend, n_jobs
Contributor


When falling back because the requested backend doesn't satisfy the constraints, I think you should also fall back to the default n_jobs instead of the one from the context manager. An all-or-nothing approach seems the easiest to reason about to me.

# fallback to the default thread-based backend.
sharedmem_backend = BACKENDS[DEFAULT_THREAD_BACKEND]()
if verbose >= 10:
print("Using %s as joblib.Parallel backend instead of %s "
Contributor


This might be a project idiom, but I'd prefer a warning in this case rather than a print gated on verbose >= 10. Warnings can be silenced as needed, but if I wrote code to use a certain backend and that backend is being ignored, I'd like to know even if verbose is set to 0.
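The silenceable-warning pattern being suggested could look like this; the warning class and helper are hypothetical names for illustration, not joblib's actual code:

```python
# Illustration of the point above: unlike a verbosity-gated print, a
# warning is visible by default but can be silenced per category.
import warnings

class BackendFallbackWarning(UserWarning):
    """Hypothetical warning category for backend substitution."""

def select_backend(requested, satisfies_constraints):
    # Hypothetical selection helper: fall back to threading and warn
    # when the requested backend cannot satisfy the constraints.
    if not satisfies_constraints:
        warnings.warn(
            "Falling back to the threading backend instead of %r" % requested,
            BackendFallbackWarning)
        return "threading"
    return requested

# The fallback is reported by default...
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    backend = select_backend("dask", satisfies_constraints=False)
assert backend == "threading" and len(caught) == 1

# ...but users who do not care can opt out for this category only:
warnings.simplefilter("ignore", BackendFallbackWarning)
```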

Contributor Author

@ogrisel ogrisel Feb 7, 2018


Actually, I don't want to issue a warning in this case because the user cannot do anything to "fix" the cause of the problem. Take the example of random forests in scikit-learn: the current implementation relies on shared memory semantics in the prediction loop but not in the main fit loop. Suppose the user calls RF in a nested cross-validation loop (which can itself benefit from parallelism) and uses a context manager to select the dask-distributed backend so as to parallelize as much as possible at all levels. In that case the prediction part should stay sequential (the default behavior) while all the other parallel calls benefit from dask (which would be more than enough to saturate all the cores in typical use cases). There is no point in issuing a noisy warning that the user will not understand without knowing the details of scikit-learn's inner code.

I have changed the verbosity of the parallel call itself to make it explicit which backend is used for each individual call however.

"""

supports_timeout = True
use_threads = True
Contributor


Slight preference for uses_threads instead. use_threads feels a bit off grammatically, especially given other notations like supports_timeout.

@ogrisel
Contributor Author

ogrisel commented Jan 30, 2018 via email

@ogrisel ogrisel merged commit cf66463 into joblib:master Feb 7, 2018
@ogrisel ogrisel deleted the backend-hints branch February 7, 2018 14:06
TomAugspurger added a commit to TomAugspurger/scikit-learn that referenced this pull request Jun 22, 2018
Uses the new prefer / require keywords from joblib/joblib#602.

This allows users to control how jobs are parallelized in more situations.
For example, training a RandomForest on a cluster of machines with the dask backend.

Closes scikit-learn#8804
tomMoral pushed a commit to ogrisel/scikit-learn that referenced this pull request Jun 22, 2018
Uses the new prefer / require keywords from joblib/joblib#602.

This allows users to control how jobs are parallelized in more situations.
For example, training a RandomForest on a cluster of machines with the dask backend.

Closes scikit-learn#8804