Skip to content

[MRG] proposal to recommend ".joblib" file extension for load/dump#11230

Merged
ogrisel merged 1 commit intoscikit-learn:masterfrom
yufengg:patch-1
Jun 18, 2018
Merged

[MRG] proposal to recommend ".joblib" file extension for load/dump#11230
ogrisel merged 1 commit intoscikit-learn:masterfrom
yufengg:patch-1

Conversation

@yufengg
Copy link
Copy Markdown
Contributor

@yufengg yufengg commented Jun 10, 2018

I'm proposing the use of filename.joblib instead of filename.pkl for models persisted via the joblib library. This will make it easier for model sharing and reduce confusion when it comes to time load a model, as it will be more clear whether a file was saved using the pickle or joblib library.

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Any other comments?

I'm proposing the use of `filename.joblib` instead of `filename.pkl` for models persisted via the joblib library. This will make it easier for model sharing and reduce confusion when it comes to time load a model, as it will be more clear whether a file was saved using the `pickle` or `joblib` library.
@glemaitre
Copy link
Copy Markdown
Member

I am not sure about that. Files are also pickled.

@jnothman
Copy link
Copy Markdown
Member

jnothman commented Jun 11, 2018 via email

@lesteve
Copy link
Copy Markdown
Member

lesteve commented Jun 13, 2018

is there no incompatibility in loading joblib pickles with pickle.load?

Short answer no.

Longer answer: starting in joblib 0.10 (released in July 2016) files produced by joblib.dump can not be read with pickle.load if the dumped object "contains" a numpy array. If the dumped object does not "contain" a numpy array, then it is a standard Python pickle and can be read with pickle.load

import pickle
import joblib
import numpy as np

filename = '/tmp/test.pkl'
joblib.dump([1, 2, 3], filename)
pickle.load(open(filename, 'rb'))  # works fine
joblib.dump(np.array([1, 2, 3]), filename)
pickle.load(open(filename, 'rb'))  # UnpicklingError: invalid load key, '\x01'.

@lesteve
Copy link
Copy Markdown
Member

lesteve commented Jun 13, 2018

To sum up, I am fine with changing the extension in the example, maybe .jbl rather than .joblib?

@rth
Copy link
Copy Markdown
Member

rth commented Jun 13, 2018

is there no incompatibility in loading joblib pickles with pickle.load?

Short answer no.

To summarize the triple negation - it's not strictly compatible :)

maybe .jbl rather than .joblib?

https://datatypes.net/open-jbl-files: .jbl seems to be an already existing extension.

@lesteve
Copy link
Copy Markdown
Member

lesteve commented Jun 13, 2018

I should have guessed: any combination of 3 letters is already taken with high probability ;-). Let's go for .joblib then!

Copy link
Copy Markdown
Member

@qinhanmin2014 qinhanmin2014 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @yufengg

@qinhanmin2014
Copy link
Copy Markdown
Member

Please also update other similar places in the repo (e.g., doc/tutorial/basic/tutorial.rst)

@ogrisel
Copy link
Copy Markdown
Member

ogrisel commented Jun 18, 2018

LGTM as well. I did the change in the master branch of joblib. Let's merge this.

@ogrisel ogrisel merged commit 355d880 into scikit-learn:master Jun 18, 2018
@lesteve
Copy link
Copy Markdown
Member

lesteve commented Jun 18, 2018

I pushed a similar change in 066b501 for doc/tutorial/basic/tutorial.rst.

georgipeev pushed a commit to georgipeev/scikit-learn that referenced this pull request Jun 20, 2018
Recommend to use of `filename.joblib` instead of `filename.pkl` for models persisted via the joblib library to reduce confusion when it comes to time load a model, as it will be more clear whether a file was saved using the `pickle` or `joblib` library.
@yufengg
Copy link
Copy Markdown
Contributor Author

yufengg commented Aug 22, 2018

Sorry I'm not familiar with the doc site publishing workflow -- is there any additional actions for me to take to reflect the changes on the website? It is still showing the pre-merge content.

http://scikit-learn.org/stable/modules/model_persistence.html

@jnothman
Copy link
Copy Markdown
Member

Look at /dev rather than /stable. We will release real soon now...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants