Skip to content

[Callbacks] Alternative to #33340 to cloning with callbacks#33421

Closed
jeremiedbb wants to merge 12 commits intoscikit-learn:callbacksfrom
jeremiedbb:alternative-to-clone-with-callbacks
Closed

[Callbacks] Alternative to #33340 to cloning with callbacks#33421
jeremiedbb wants to merge 12 commits intoscikit-learn:callbacksfrom
jeremiedbb:alternative-to-clone-with-callbacks

Conversation

@jeremiedbb
Copy link
Copy Markdown
Member

As discussed in the last meeting this PR explores an alternative to #33340

ping @ogrisel @StefanieSenger @lesteve

Personally I tend to prefer slightly the other version because it doesn't extend the API of clone, but the coupling in the other PR is not ideal either.

(side note: the added test doesn't work yet because it needs #33394)

that are not estimators. Ignored if `estimator.__sklearn_clone__`
exists.

keep_callbacks : "warn" or bool, default="warn"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a defensive check to verify that keep_callbacks has one of the expected values ?

if hasattr(estimator, "__sklearn_clone__") and not inspect.isclass(estimator):
return estimator.__sklearn_clone__()
return _clone_parametrized(estimator, safe=safe)
return estimator.__sklearn_clone__(keep_callbacks=keep_callbacks)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One argument I see against this approach is that the additional argument breaks third-party estimators who have already implemented __sklearn_clone__ to do their own cloning (not sure how widely this is used but a quick search seems to indicate that it's used by a few projects, e.g. ibis-ml, skada, nemos)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, we're passing a new arg to __clone__ so third-party __clone__ would break. I missed that point but then I'm definitely leaning toward the other alternative.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A third alternative could be to expose a new function clone_with_callbacks that would be used by estimators implementing callback support. It would handle retrieving the callbacks, cloning, and putting back the callbacks.
But then I don't see the point of having this function instead of dealing with clone inside propagate_callback_context since this specific cloning only happens when we want to propagate.

@lesteve
Copy link
Copy Markdown
Member

lesteve commented Mar 9, 2026

I think the consensus was to go with #33340 to avoid changing clone, let's close this one.

@lesteve lesteve closed this Mar 9, 2026
@StefanieSenger
Copy link
Copy Markdown
Member

StefanieSenger commented Mar 9, 2026

I still wonder why we would not always copy the callback in clone(), instead of adding a param which changes the API.

The problem in #28760 (comment) was that we had used

new_object._skl_callbacks = clone(estimator._skl_callbacks, safe=False) (which causes failure in callbacks with threads)

instead of

new_object._skl_callbacks = estimator._skl_callbacks (which passes the tests, including the smoke test from the other PR).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants