[Callbacks] Alternative to #33340 to cloning with callbacks#33421
[Callbacks] Alternative to #33340 to cloning with callbacks#33421jeremiedbb wants to merge 12 commits intoscikit-learn:callbacksfrom
Conversation
Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
| that are not estimators. Ignored if `estimator.__sklearn_clone__` | ||
| exists. | ||
|
|
||
| keep_callbacks : "warn" or bool, default="warn" |
There was a problem hiding this comment.
Should there be a defensive check to verify that keep_callbacks has one of the expected values ?
| if hasattr(estimator, "__sklearn_clone__") and not inspect.isclass(estimator): | ||
| return estimator.__sklearn_clone__() | ||
| return _clone_parametrized(estimator, safe=safe) | ||
| return estimator.__sklearn_clone__(keep_callbacks=keep_callbacks) |
There was a problem hiding this comment.
One argument I see against this approach is that the additional argument breaks third-party estimators who have already implemented __sklearn_clone__ to do their own cloning (not sure how widely this is used but a quick search seems to indicate that it's used by a few projects, e.g. ibis-ml, skada, nemos)
There was a problem hiding this comment.
That's true, we're passing a new arg to __clone__ so third-party __clone__ would break. I missed that point but then I'm definitely leaning toward the other alternative.
There was a problem hiding this comment.
A third alternative could be to expose a new function clone_with_callbacks that would be used by estimators implementing callback support. It would handle retrieving the callbacks, cloning, and putting back the callbacks.
But then I don't see the point of having this function instead of dealing with clone inside propagate_callback_context since this specific cloning only happens when we want to propagate.
|
I think the consensus was to go with #33340 to avoid changing |
|
I still wonder why we would not always copy the callback in The problem in #28760 (comment) was that we had used
instead of
|
As discussed in the last meeting this PR explores an alternative to #33340
ping @ogrisel @StefanieSenger @lesteve
Personally I tend to prefer slightly the other version because it doesn't extend the API of
clone, but the coupling in the other PR is not ideal either.(side note: the added test doesn't work yet because it needs #33394)