ENH add initializer limiting n_threads for C-libs#701
ENH add initializer limiting n_threads for C-libs#701tomMoral merged 11 commits intojoblib:masterfrom
Conversation
tomMoral
commented
Jun 20, 2018
- This functionnality supports OpenMP, OpenBLAS, MKL and Accelerated
- Set n_threads=1 by default in all workers to avoid oversubscrption
Codecov Report
@@ Coverage Diff @@
## master #701 +/- ##
==========================================
- Coverage 95.14% 95.02% -0.13%
==========================================
Files 40 40
Lines 5725 5746 +21
==========================================
+ Hits 5447 5460 +13
- Misses 278 286 +8
Continue to review full report at Codecov.
|
joblib/_parallel_backends.py
Outdated
| if self._pool is None: | ||
| self._pool = ThreadPool(self._n_jobs) | ||
| self._pool = ThreadPool( | ||
| self._n_jobs, initializer=self.limit_clib_threads) |
There was a problem hiding this comment.
I don't think threads have isolated environment variables. Dynamic / ctypes based calls would be required to limit clib threads while joblib is doing thread-based parallelism.
There was a problem hiding this comment.
I would just not register any initializer at the ThreadPool level for now.
There was a problem hiding this comment.
I confirm that there is a single environment namespace per-process: all threads of the same process share the same environment variables:
>>> from time import sleep
>>> import os
>>> from threading import get_ident
>>> from multiprocessing.pool import ThreadPool
>>> def env_tweaks(*args):
... tid = str(get_ident())
... os.environ['THREAD_ID'] = tid
... sleep(0.01)
... return (tid, os.environ['THREAD_ID'])
...
...
>>> p = ThreadPool(4)
>>> results = p.map(env_tweaks, range(10))
>>> results
[('140286315460352', '140286290282240'), ('140286307067648', '140286315460352'), ('140286298674944', '140286290282240'), ('140286290282240', '140286307067648'), ('140286315460352', '140286298674944'), ('140286307067648', '140286315460352'), ('140286290282240', '140286307067648'), ('140286298674944', '140286307067648'), ('140286315460352', '140286307067648'), ('140286307067648', '140286307067648')]
>>> There was a problem hiding this comment.
Indeed.. thanks for pointing that out!
|
Please add an entry to the changelog. Please also add a new section at the end of the |
|
Yes I am planning on adding in this PR:
|
- this functionnality supports OpenMP, OpenBLAS, MKL and Accelerated - Set n_threads=1 by default in all workers to avoid oversubscrption
e24abb5 to
e7c933b
Compare
ogrisel
left a comment
There was a problem hiding this comment.
LGTM besides the following the comments:
CHANGES.rst
Outdated
| Thomas Moreau | ||
|
|
||
| Limit the number of threads used in worker processes (for ``loky`` and | ||
| ``multiprocessing``) by C-libraries that relies on threadpools. This |
There was a problem hiding this comment.
I would not mention multiprocessing as it's not reliable when using the default fork context under posix.
CHANGES.rst
Outdated
| Implement the ``loky`` backend with @ogrisel. This backend relies on | ||
| a robust implementation of ``concurrent.futures.ProcessPoolExecutor`` | ||
| with spawned processes that can be reused accross the ``Parallel`` | ||
| calls. |
There was a problem hiding this comment.
Please move this to the top and explain that it fixes bad interactions (crash) with the internal thread-pool of third-party libraries with a link to.
doc/parallel.rst
Outdated
| computational power available for each process. Moreover, when many processes | ||
| are running, the time taken by the OS scheduler to switch between them can | ||
| further hinder the performance of the computation. It is generally better to | ||
| avoid using more ``Process/Thread`` than the number of CPU on a machine. |
There was a problem hiding this comment.
avoid using significantly more processes or threads than the number of CPUs on a machine.
doc/parallel.rst
Outdated
| further hinder the performance of the computation. It is generally better to | ||
| avoid using more ``Process/Thread`` than the number of CPU on a machine. | ||
|
|
||
| Some third-partiy libraries -- *e.g.* ``numpy`` -- manage internally a |
There was a problem hiding this comment.
e.g. the BLAS runtime used by numpy
doc/parallel.rst
Outdated
| avoid using more ``Process/Thread`` than the number of CPU on a machine. | ||
|
|
||
| Some third-partiy libraries -- *e.g.* ``numpy`` -- manage internally a | ||
| thread-pool to perform their computations. The default behavior is generaly to |
* tag '0.12': (116 commits) Release 0.12 typo typo typo ENH add initializer limiting n_threads for C-libs (joblib#701) DOC better parallel docstring (joblib#704) [MRG] Nested parallel call thread bomb mitigation (joblib#700) MTN vendor loky2.1.3 (joblib#699) Make it possible to configure the reusable executor workers timeout (joblib#698) MAINT increase timeouts to make test more robust on travis DOC: use the .joblib extension instead of .pkl (joblib#697) [MRG] Fix exception handling in nested parallel calls (joblib#696) Fix skip test lz4 not installed (joblib#695) [MRG] numpy_pickle: several enhancements (joblib#626) Introduce Parallel.__call__ backend callbacks (joblib#689) Add distributed on readthedocs (joblib#686) Support registration of external backends (joblib#655) [MRG] Add a dask.distributed example (joblib#613) ENH use cloudpickle to pickle interactively defined callable (joblib#677) CI freeze the version of sklearn0.19.1 and scipy1.0.1 (joblib#685) ...
* releases: (121 commits) Release 0.12.1 fix kwonlydefaults key error in filter_args (joblib#715) MNT fix some "undefined name" flake8 warnings (joblib#713). from importlib import reload for Python 3 (joblib#675) MTN vendor loky2.1.4 (joblib#708) Release 0.12 typo typo typo ENH add initializer limiting n_threads for C-libs (joblib#701) DOC better parallel docstring (joblib#704) [MRG] Nested parallel call thread bomb mitigation (joblib#700) MTN vendor loky2.1.3 (joblib#699) Make it possible to configure the reusable executor workers timeout (joblib#698) MAINT increase timeouts to make test more robust on travis DOC: use the .joblib extension instead of .pkl (joblib#697) [MRG] Fix exception handling in nested parallel calls (joblib#696) Fix skip test lz4 not installed (joblib#695) [MRG] numpy_pickle: several enhancements (joblib#626) Introduce Parallel.__call__ backend callbacks (joblib#689) ...