I updated the initial comment with a bit more information and better wording.
Codecov Report
@@ Coverage Diff @@
## master #619 +/- ##
==========================================
+ Coverage 95.02% 95.29% +0.26%
==========================================
Files 39 41 +2
Lines 5427 5586 +159
==========================================
+ Hits 5157 5323 +166
+ Misses 270 263 -7
Can you add an example while you work on this PR? I find this very useful, because it helps in thinking about the use case.
joblib/__init__.py
Outdated
from .parallel import register_parallel_backend
from .parallel import parallel_backend
from .parallel import effective_n_jobs
from .shelf import JoblibShelf, shelve, shelve_mmap
Should we expose to the user JoblibShelf? It seems to me that it should be internal.
Indeed, will change that
All values are cached on the filesystem, in a deep directory
structure.

see :ref:`memory_reference`
Hum, did we lose the docstring?
It was already like this IIRC. The __init__ docstring has moved to StoreBase
Force-pushed from cbb7361 to 999feab.
I added a couple of examples to each function docstring. But maybe you are talking about a sphinx-gallery example? If you have good ideas for examples, I'm happy to take them.
Aside from the example, what remains to be done here?
joblib/shelf.py
Outdated
memory. The future, a light-weight object, can be used later to reload the
initial object.

During the life of the future, the input object is kept written on a store
I would rather say "The input object is kept in a store (by default, a file on disk) as long as the future object exists (technically: as long as there is a reference to the future)".
joblib/shelf.py
Outdated
    return _active_shelf.put(input_object)


def shelve_mmap(input_array):
I think that the variable should rather be called "input_object" (here and in the docstring below, the word "array" should often be replaced by "object").
This function is only meant to be used with numpy arrays, since it returns a future on a mmap. That's why I think it's important to keep the 'array' in the variable name
@ogrisel suggested that the This will allow a transparent use of this function with Parallel calls: no need to use the
> +def shelve_mmap(input_array):
> This function is only meant to be used with numpy arrays, since it returns a
> future on a mmap. That's why I think it's important to keep the 'array' in
> the variable name
It's not limited to arrays. It should be able to take any object as an input.
I updated the input parameter and the
Is this still of interest to joblib? I'd like to close it if possible :) And I'm not sure whether it's in a rebasable state.
This PR fixes #593 but is still WIP.
Note that this basic shelving can only be used in a script running a single Python process, because the futures returned by the shelf are only referenced in that process. This means that it may not work as expected when using Parallel with the loky or multiprocessing backend, but it should work with threading.
There are also some tests for the basic functionalities and top-level functions exposed to users:
Data are deleted in the following cases:
Here are examples showing how to use this new feature:
Here is the version with memmap:
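Separately from the PR's own examples, the lifetime rule discussed in this thread (the stored object is kept for as long as a reference to its future exists) can be sketched with the standard library alone. This is not the code from this PR and it does not use joblib: `DiskFuture`, `put` and `result` are hypothetical names chosen for illustration, and pickle plus a temporary file stand in for the real store.

```python
import gc
import os
import pickle
import tempfile
import weakref


def _remove(path):
    """Delete the backing file if it is still there."""
    if os.path.exists(path):
        os.remove(path)


class DiskFuture:
    """Hypothetical light-weight handle to an object pickled on disk."""

    def __init__(self, obj, directory):
        fd, self.path = tempfile.mkstemp(suffix=".pkl", dir=directory)
        with os.fdopen(fd, "wb") as f:
            pickle.dump(obj, f)
        # The callback only receives the path (a str), never the future
        # itself, so it does not keep the future alive.
        self._finalizer = weakref.finalize(self, _remove, self.path)

    def result(self):
        """Reload the stored object from disk."""
        with open(self.path, "rb") as f:
            return pickle.load(f)


def put(obj, directory=None):
    """Store obj on disk and return a future pointing to it (assumed API)."""
    return DiskFuture(obj, directory or tempfile.gettempdir())


if __name__ == "__main__":
    future = put(list(range(5)))
    path = future.path
    print(future.result())       # the object is reloaded from the file
    del future                   # drop the last reference to the future
    gc.collect()                 # only needed without reference counting
    print(os.path.exists(path))  # the backing file has been removed
```

On CPython the file usually disappears as soon as the last reference is dropped; the explicit `gc.collect()` only guards against interpreters without reference counting. Because the future is tracked inside a single process, this sketch has the same limitation as the one noted in the description for process-based backends.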