[Callbacks] FEA Define callback behaviour when cloning an estimator#33340
[Callbacks] FEA Define callback behaviour when cloning an estimator#33340lesteve merged 31 commits intoscikit-learn:callbacksfrom
Conversation
jeremiedbb
left a comment
There was a problem hiding this comment.
Thanks for the PR. A few comments.
jeremiedbb
left a comment
There was a problem hiding this comment.
A few more nitpicks. Otherwise LGTM.
Ping @ogrisel for a second review is possible. As discussed in a previous call, this PR makes clone discard the callbacks such that if an estimator with callbacks is passed to a function or meta-estimator that doesn't have callback support, then nothing breaks but the callbacks have no effect.
Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
| if clone_estimator: | ||
| from sklearn.base import clone | ||
|
|
||
| with warnings.catch_warnings(): | ||
| warnings.filterwarnings( | ||
| "ignore", message="There are callbacks set on the estimator " | ||
| ) | ||
| cloned_estimator = clone(sub_estimator) | ||
| if hasattr(sub_estimator, "_skl_callbacks"): | ||
| cloned_estimator.set_callbacks(sub_estimator._skl_callbacks) | ||
| sub_estimator = cloned_estimator |
There was a problem hiding this comment.
nitpick: I'd put the callbacks_to_propagate right before the if not callbacks_to_propagate
| def propagate_callback_context(self, sub_estimator, clone_estimator=False): | ||
| """Propagate the callbacks and the context to a sub-estimator. | ||
|
|
||
| The callbacks are propagated to a clone of the sub-estimator instead if | ||
| clone_estimator is set to True. |
There was a problem hiding this comment.
as discussed in the last meeting, let's add an extra arg to clone instead to transfer the callbacks with 3 options: True/False/"warn". "warn" being the default and meaning discard the callbacks with a warning
|
@ogrisel @StefanieSenger @lesteve I opened #33421 as the alternative that we discussed during last meeting, that is add an option in |
StefanieSenger
left a comment
There was a problem hiding this comment.
Thank you @FrancoisPgm, I have a few comments, but overall looks good to me.
Co-authored-by: Stefanie Senger <91849487+StefanieSenger@users.noreply.github.com>
Co-authored-by: Stefanie Senger <91849487+StefanieSenger@users.noreply.github.com>
Co-authored-by: Stefanie Senger <91849487+StefanieSenger@users.noreply.github.com>
Co-authored-by: Stefanie Senger <91849487+StefanieSenger@users.noreply.github.com>
Co-authored-by: Stefanie Senger <91849487+StefanieSenger@users.noreply.github.com>
StefanieSenger
left a comment
There was a problem hiding this comment.
Thanks for addressing my review comments, @FrancoisPgm.
What is lacking for me is a test against what exactly would crash / not work as expected if we do not implement these changes.
I've raised this more concretely here and in the comments before.
ogrisel
left a comment
There was a problem hiding this comment.
Thanks @FrancoisPgm. The PR LGTM besides the following:
sklearn/base.py
Outdated
| warnings.warn( | ||
| f"There are callbacks set on the estimator {estimator.__class__.__name__} " | ||
| "being cloned. The callbacks will be discarded in the clone." | ||
| ) # TODO: add a link to some documentation for callback support. |
There was a problem hiding this comment.
I think it would be cleaner to create a dedicated warning class under sklearn.exceptions (for instance sklearn.exception.CloneWithCallbacksWarning) to catch instances of this warning explicitly in CallbackContext.clone_and_propagate_callback_context without relying on string message matching.
There was a problem hiding this comment.
Very good point, thx :)
| """Test the warning when cloning an estimator with callback registered to it.""" | ||
| estimator = MaxIterEstimator() | ||
| estimator.set_callbacks(TestingCallback()) | ||
| with pytest.warns(UserWarning, match="There are callbacks set on the estimator "): |
There was a problem hiding this comment.
This should also be updated to test the specific warning class.
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
lesteve
left a comment
There was a problem hiding this comment.
LGTM, thanks!
As noted by others, it would be nice to figure out exactly what breaks when we don't discard callbacks and add a relevant test.
Reference Issues/PRs
Towards #33323
What does this implement/fix? Explain your changes.
When used on an estimator holding callbacks,
clonenow discards the callbacks in the clone, raising a warning.For callback-compatible meta estimator which need to clone sub-estimators,
propagate_callbacksnow offers a way to clone the sub-estimators while forwarding the callbacks to the clone and silencing the warning.AI usage disclosure
I used AI assistance for:
Any other comments?