[MRG] Nested backend context manager #538
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master     #538      +/-   ##
==========================================
+ Coverage   94.78%   95.04%   +0.25%
==========================================
  Files          38       38
  Lines        4969     5005      +36
==========================================
+ Hits         4710     4757      +47
+ Misses        259      248      -11
        delayed(nested_call, check_pickle=False)(i)
        for i in range(5)
    )
    assert ba.i == 1
Does this look like the right number of calls to retrieve, given my Parallel calls?
Yes, there is only one call to retrieve per call to Parallel.__call__.
This implements the new API introduced in joblib/joblib#538 to ensure nested parallelism works correctly. This breaks the API, as previously `DaskDistributedBackend` had a `client` attribute holding the `Client` that was used. Nested parallelism will involve workers creating a `DaskDistributedBackend` with no 'scheduler_host'. To avoid deadlocks, worker threads will secede and rejoin.
    'retrieval_context' this could lead to deadlock, as all the workers
    managed by the backend may be "busy" waiting for the nested parallel
    calls to finish, but the backend has no free workers to execute those
    tasks.
Very nice explanation :)
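The secede-while-retrieving idea quoted above can be illustrated with a toy backend (a conceptual sketch only, not joblib's actual implementation): a `retrieval_context` that frees the caller's worker slot while it blocks on nested results, then reacquires it afterwards.

```python
import contextlib
import threading


class ToyBackend:
    """Toy stand-in for a backend with a fixed pool of worker slots."""

    def __init__(self, n_workers):
        # Each running task holds one slot of this semaphore.
        self._slots = threading.Semaphore(n_workers)

    @contextlib.contextmanager
    def retrieval_context(self):
        # "Secede": free our slot while blocking on nested results,
        # so nested tasks are not starved of workers.
        self._slots.release()
        try:
            yield
        finally:
            # "Rejoin": take a slot back before resuming work.
            self._slots.acquire()


backend = ToyBackend(n_workers=1)
backend._slots.acquire()            # the outer task occupies the only worker
with backend.retrieval_context():
    # While retrieving, a nested task can grab the freed slot
    # instead of deadlocking on a fully busy pool.
    nested_ran = backend._slots.acquire(blocking=False)
    if nested_ran:
        backend._slots.release()    # the nested task finishes
print(nested_ran)  # True
```

Without the `retrieval_context`, the nested acquire would fail: the single worker slot would stay held by the outer task while it waits, which is exactly the deadlock described above.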
    with parallel_backend("retrieval") as (ba, _):
        Parallel(n_jobs=2)(
            delayed(nested_call, check_pickle=False)(i)
We can probably remove that check_pickle=False now that loky is the default backend and cloudpickle is used to pickle function definitions.
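A hedged sketch of this suggestion: with the loky backend, cloudpickle serializes the function definition, so `delayed` no longer needs `check_pickle=False`. The `double` helper below is a hypothetical stand-in for the test's `nested_call`.

```python
from joblib import Parallel, delayed


def double(i):
    # Stand-in for the test's nested_call helper; with loky,
    # cloudpickle handles pickling this function definition.
    return 2 * i


out = Parallel(n_jobs=2, backend="loky")(delayed(double)(i) for i in range(5))
print(out)  # [0, 2, 4, 6, 8]
```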
Merged! Thanks @TomAugspurger.
While reviewing #516 (loky integration) and #537 (backend negotiation based on code hints / semantic constraints), I realized that the
`with parallel_backend(backend_name):` idiom should support nesting when possible. Here is a (currently failing) test to demonstrate what I mean. I will work on an implementation. If we ensure that backend instances are picklable, that should be quite easy to implement, although I am worried about the performance overhead it could induce.
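A minimal sketch of the nesting idiom in question; the threading backend is chosen purely for illustration, and the arithmetic is arbitrary.

```python
from joblib import Parallel, delayed, parallel_backend


def inner(i):
    # A nested Parallel call: the goal of this PR is that it reuses
    # the backend installed by the enclosing parallel_backend context
    # instead of silently falling back to the default.
    return sum(Parallel(n_jobs=2)(delayed(abs)(i - j) for j in range(3)))


with parallel_backend("threading", n_jobs=2):
    results = Parallel()(delayed(inner)(i) for i in range(2))
print(results)  # [3, 2]
```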