[MRG] proposal to recommend ".joblib" file extension for load/dump #11230
ogrisel merged 1 commit into scikit-learn:master
I'm proposing the use of `filename.joblib` instead of `filename.pkl` for models persisted via the joblib library. This will make model sharing easier and reduce confusion when it comes time to load a model, as it will be clearer whether a file was saved using the `pickle` or `joblib` library.
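To illustrate why a distinct extension helps at load time, here is a minimal sketch of a loader that picks the library from the file suffix. The helper name `load_persisted` and the set of accepted suffixes are assumptions made for this example, not part of scikit-learn or joblib.

```python
import pickle
from pathlib import Path


def load_persisted(path):
    """Load an object, choosing the loader from the file extension.

    Hypothetical helper for illustration. Only load files you trust:
    both pickle and joblib can execute arbitrary code while loading.
    """
    path = Path(path)
    if path.suffix == ".joblib":
        import joblib  # imported lazily; only needed for joblib files
        return joblib.load(str(path))
    if path.suffix in (".pkl", ".pickle"):
        with path.open("rb") as f:
            return pickle.load(f)
    raise ValueError(f"unrecognized extension: {path.suffix!r}")
```

With a shared `.pkl` extension this kind of dispatch is not possible, and the reader has to know out of band which library wrote the file.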
|
I am not sure about that. Files are also pickled. |
|
Perhaps you're right: is there no incompatibility in loading joblib pickles
with pickle.load?
|
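Short answer: no. Longer answer: starting in joblib 0.10 (released in July 2016), files produced by `joblib.dump` can be read with `pickle.load` unless they contain numpy arrays:

```python
import pickle
import joblib
import numpy as np

filename = '/tmp/test.pkl'

# Plain Python objects: the file can be read with pickle.load.
joblib.dump([1, 2, 3], filename)
with open(filename, 'rb') as f:
    pickle.load(f)  # works fine

# numpy arrays: pickle.load can no longer read the file.
joblib.dump(np.array([1, 2, 3]), filename)
with open(filename, 'rb') as f:
    pickle.load(f)  # UnpicklingError: invalid load key, '\x01'.
```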
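Without a distinct extension, the only way to tell the two cases apart is to try loading the file. A rough sketch of such a check follows; the function name `readable_with_plain_pickle` is made up for this example, and it only catches `pickle.UnpicklingError`, the error shown in this comment.

```python
import pickle


def readable_with_plain_pickle(path):
    """Return True if pickle.load can read the file, False otherwise.

    Hypothetical helper. Only use it on trusted files, since
    unpickling can execute arbitrary code.
    """
    try:
        with open(path, "rb") as f:
            pickle.load(f)
    except pickle.UnpicklingError:
        return False
    return True
```

A file for which this returns False may still load fine with `joblib.load`, which is the situation the `.joblib` extension is meant to make obvious up front.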
|
To sum up, I am fine with changing the extension in the example, maybe |
To summarize the triple negation - it's not strictly compatible :)
https://datatypes.net/open-jbl-files: .jbl seems to be an already existing extension. |
|
I should have guessed: any combination of 3 letters is already taken with high probability ;-). Let's go for .joblib then! |
|
Please also update other similar places in the repo (e.g., doc/tutorial/basic/tutorial.rst) |
|
LGTM as well. I did the change in the master branch of joblib. Let's merge this. |
|
I pushed a similar change in 066b501 for |
Recommend the use of `filename.joblib` instead of `filename.pkl` for models persisted via the joblib library to reduce confusion when it comes time to load a model, as it will be clearer whether a file was saved using the `pickle` or `joblib` library.
|
Sorry, I'm not familiar with the doc site publishing workflow. Are there any additional actions I need to take for the changes to show up on the website? It is still showing the pre-merge content: http://scikit-learn.org/stable/modules/model_persistence.html |
|
Look at /dev rather than /stable. We will release real soon now... |