[MRG] Adds documentation for parallelisation of custom scorer#12813
[MRG] Adds documentation for parallelisation of custom scorer#12813jnothman merged 12 commits intoscikit-learn:masterfrom fx86:doc_pr_parallel_custom_scorer
Conversation
|
I know this is a restriction if pickle is used but is this still needed now that joblib is using cloudpickle? |
@albertcthomas would it make sense to add this for pre-cloudpickle versions of sklearn ? Could you let me know which version onwards has cloudpickle ? |
|
I was just wondering. It might be relevant even with cloudpickle. cc @tomMoral |
|
Thanks. Can you test if the issue persists in 0.20.1? If so, then cloudpickle doesn't help. |
|
Installed |
|
So now that joblib is using cloudpickle I think that it should work out of the box for most of the cases (if the loky backend is used). Instead of the example we can maybe add a remark in the doc saying that if a custom scoring function is used with |
|
Thanks, @albertcthomas. @amueller @adrinjalali - comments ? |
jnothman
left a comment
There was a problem hiding this comment.
I think you need an example of importing.
jnothman
left a comment
There was a problem hiding this comment.
Now are we sure this is necessary with current joblib?
|
@jnothman The new Loky backend takes care of serialising the interactively-defined functions, but I'm not sure if older versions of sklearn will work with this new backend. Can I suggest this to be kept for reasons of backward compatibility ? |
I think it would be better to say that this example is useful as other backends than loky (multiprocessing, custom backends) can be used. |
|
Roger that, @albertcthomas. |
|
Well it would be nice if someone could confirm that this is the expected behavior when loky is used and when a different backend is used |
albertcthomas
left a comment
There was a problem hiding this comment.
See my two comments below. Also as rmse is not defined anywhere in the example and as the example is not run, why do you think of using custom_scoring_function or custom_scorer instead of rmse?
IMO |
We could keep an abstract function name, like 'custom_scoring_function`, probably ? |
Yes |
There was a problem hiding this comment.
Thanks @fx86! Almost there :)
To make things even clearer I would also suggest putting your addition in a Notes section with a well-chosen title such as
.. notes:: **Using custom scorers in functions where `n_jobs > 1`**
albertcthomas
left a comment
There was a problem hiding this comment.
One last comment. Otherwise LGTM! Thanks @fx86
This reverts commit d551ff2.
This reverts commit d551ff2.
Reference Issues/PRs
Adds documentation on parallelisation of custom scorers. Without importing from a module,
joblib is unable to run custom scoring functions in parallel.
Prompted from this discussion - #10054 (comment)
ping @jnothman @amueller