[MRG+1] Integrate loky as default backend#516
Conversation
Codecov Report
@@ Coverage Diff @@
## master #516 +/- ##
==========================================
+ Coverage 93.73% 94.16% +0.43%
==========================================
Files 37 39 +2
Lines 4785 4953 +168
==========================================
+ Hits 4485 4664 +179
+ Misses 300 289 -11
Continue to review full report at Codecov.
|
|
This is WIP because we need to deal with the dependencies (loky and cloudpickle). @tomMoral will work on a second PR that sits on top of this one to vendor those two dependencies into a |
804255e to
50953d0
Compare
|
I finished vendoring |
aabadie
left a comment
There was a problem hiding this comment.
Even if this PR is marked WIP, I wanted to have a look at it.
For the moment, I only made a code review and have a few comments. Since the CI passes, it should work well.
Do you have some benchmarks compared to multiprocessing/threading ?
Makefile
Outdated
|
|
||
| test: | ||
| pytest joblib | ||
|
|
There was a problem hiding this comment.
This empty line is not necessary
joblib/_memmapping_reducer.py
Outdated
| @@ -0,0 +1,357 @@ | |||
| # -*- coding: utf-8 -*- | |||
There was a problem hiding this comment.
A module docstring is missing. This is not the case in other internal modules.
joblib/_memmapping_reducer.py
Outdated
| # done processing this data. | ||
| if not os.path.exists(filename): | ||
| if self.verbose > 0: | ||
| print("Memmapping (shape=%r, dtype=%s) to new file %s" % ( |
There was a problem hiding this comment.
I prefer using .format() with string. This needs to be changed in other places as well.
joblib/_memmapping_reducer.py
Outdated
| return (loads, (dumps(a, protocol=HIGHEST_PROTOCOL),)) | ||
|
|
||
|
|
||
| def delete_folder(folder_path): |
There was a problem hiding this comment.
This is the kind if function that already exists in the disk.py module. Maybe it could be factorize here ?
joblib/_memmapping_reducer.py
Outdated
|
|
||
| if np is not None: | ||
| # Register smart numpy.ndarray reducers that detects memmap backed | ||
| # arrays and that is alse able to dump to memmap large in-memory |
joblib/test/test_parallel.py
Outdated
| # This is a non-regression test for the "Pool seems closed" class of error | ||
| params = {'n_jobs': n_jobs, 'pre_dispatch': pre_dispatch, | ||
| 'batch_size': batch_size} | ||
| 'batch_size': batch_size, 'verbose': 100} |
There was a problem hiding this comment.
is this change really required ?
joblib/test/test_parallel.py
Outdated
| try: | ||
| import cloudpickle # noqa | ||
| except ImportError: | ||
| # This test pass only when fork is the process start method or |
joblib/test/test_parallel.py
Outdated
|
|
||
| # Make sure that the shared memory is cleaned at the end when we exit | ||
| # the context | ||
| assert len(os.listdir(tmpdir)) == 0 |
There was a problem hiding this comment.
assert not os.listdir(tmpdir)
joblib/test/test_parallel.py
Outdated
| p = Parallel(n_jobs=2, max_nbytes=1, backend=backend) | ||
| p(delayed(check_memmap)(a) for a in [np.random.random(10)] * 2) | ||
|
|
||
| assert len(os.listdir(tmpdir)) == 0 |
There was a problem hiding this comment.
assert not os.listdir(tmpdir)
setup.cfg
Outdated
| --doctest-glob="doc/*.rst" | ||
| --doctest-modules | ||
| testpaths = joblib | ||
| testpaths = joblib/test |
There was a problem hiding this comment.
not sure we want to change this.
joblib/disk.py
Outdated
| err_count += 1 | ||
| if err_count > RM_SUBDIRS_N_RETRY: | ||
| warnings.warn( | ||
| "Enable to delete folder {} after {} tentatives." |
|
There's still an issue when removing a directory on Windows. It seems that there's still a file open in the directory and Windows prevents that kind of things... |
I ran various tests on a big rackspace machine (sklearn grid searches) and observed no performance regression over multiprocessing. |
3353076 to
653ffc2
Compare
Makefile
Outdated
|
|
||
| test-no-multiprocessing: | ||
| export JOBLIB_MULTIPROCESSING=0 && pytest joblib | ||
| export JOBLIB_MULTIPROCESSING=0 && pytest joblib -c setup.cfg |
There was a problem hiding this comment.
The "-c setup.cfg" is no longer necessary.
653ffc2 to
99158ef
Compare
|
Please re-run the |
8b9d967 to
ad7d4e1
Compare
|
@tomMoral Please add a new test to check that the (default) locky backend accepts nested parallelism (without warnings). |
|
Fine with me. Merge when you want :) |
- This causes crashes with the new backend because some constants gets lost. - It also makes sense to avoid mixing this 2 backends has the results might be unexepected.
8faf4c6 to
eb98b5c
Compare
when running tests from the joblib root foler and joblib is not installed
since it not used anywhere
eb98b5c to
fe3f2e3
Compare
690d9a8 to
7502017
Compare
|
OK LGTM, merging, great stuff! |
|
Just saw that. Awesome!!
|
This PR aims to integrate
lokyas a backend injoblib. This PR features:lokyasjoblib.externals.lokycloudpickleasjoblib.externals.cloudpickleloky.get_reusable executoras a parallel backend.