[Callbacks] EFF Single manager for all callbacks#33408
[Callbacks] EFF Single manager for all callbacks#33408lesteve merged 1 commit intoscikit-learn:callbacksfrom
Conversation
|
|
||
| class _CallbackManagerState: | ||
| manager = None | ||
| lock = Lock() |
There was a problem hiding this comment.
Just curious, do you have a use case in mind where get_callback_manager would be used in a multi-threaded context?
TIL about the double-checked locking pattern ...
There was a problem hiding this comment.
Is it when Parallel(n_jobs=..., prefer="threads")?
There was a problem hiding this comment.
In the last meeting we decided to allow the following:
cb = MetricMonitor("accuracy")
LogisticRegression(C=1).set_callbacks(cb).fit(X,y)
LogisticRegression(C=10).set_callbacks(cb).fit(X,y)
cb.plot()
Then nothing prevents the user to the same but in 2 different threads, in a joblib // loop for instance as Stefanie mentioned
StefanieSenger
left a comment
There was a problem hiding this comment.
Thank you @jeremiedbb!
I have checked how this works out and now I can only see one single subprocess even after multiple runs of PragressBar.
I have posted a few questions in the comments.
| from sklearn.callback._callback_context import CallbackContext | ||
|
|
||
|
|
||
| class _CallbackManagerState: |
There was a problem hiding this comment.
Is it the plan that all callback types would share the same Manager() instance?
Do early stopping, logging, ... callbacks surely also need one?
Should we expose _CallbackManagerState publicly to make implementing custom callbacks easier for
third party devs?
There was a problem hiding this comment.
Is it the plan that all callback types would share the same Manager() instance?
Yes, almost all callbacks that we thought about need some kind of shared data structure (queue, list, dict, ...)
We only need one manager process to manage all these shared objects.
Should we expose _CallbackManagerState publicly to make implementing custom callbacks easier for
third party devs?
not this class, but the get_callback_manager function yes. I think we should keep it private for the incoming release and consider making it public for the next one when we add documentation for implementing custom callbacks
StefanieSenger
left a comment
There was a problem hiding this comment.
As far as I can tell, this looks good to me.
Thank you @jeremiedbb!
Closes #33325
Create a sklearn global dedicated manager for all callbacks.
ping @FrancoisPgm @StefanieSenger @lesteve @ogrisel