[Callbacks] FEA Add the ScoringMonitor callback #33407
FrancoisPgm wants to merge 76 commits into scikit-learn:callbacks from
Conversation
jeremiedbb
left a comment
Thanks for the PR @FrancoisPgm. Here's a first pass
Instead of passing a partial […], a callback can request a value by setting a class attribute. We can add a […].
I was thinking […]. EDIT: As discussed in the meeting, I ended up going with […].
ogrisel
left a comment
For the record, in today's meeting we raised the idea of exposing some kind of log index data structure that would record per-run metadata, such as the run id, the estimator name, and the run start UTC datetime (as an ISO-formatted string), to make it convenient to find which log the user is interested in and then make it possible to retrieve a particular log by its run id.
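A minimal sketch of what such a log index could look like (all names here are hypothetical, nothing below is part of this PR):

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from uuid import uuid4


@dataclass
class RunRecord:
    """Per-run metadata recorded when a monitored fit starts."""
    run_id: str
    estimator_name: str
    started_at: str  # run start UTC datetime as an ISO-formatted string


class LogIndex:
    """Maps run ids to per-run metadata so a user can locate their log."""

    def __init__(self):
        self._runs = {}

    def register(self, estimator_name):
        run_id = uuid4().hex
        self._runs[run_id] = RunRecord(
            run_id=run_id,
            estimator_name=estimator_name,
            started_at=datetime.now(timezone.utc).isoformat(),
        )
        return run_id

    def find(self, estimator_name):
        # Convenience lookup: all recorded runs for a given estimator name.
        return [r for r in self._runs.values() if r.estimator_name == estimator_name]


index = LogIndex()
run_id = index.register("HistGradientBoostingRegressor")
records = index.find("HistGradientBoostingRegressor")
```

Retrieving a particular log would then only require the `run_id` recorded in the index.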
```python
else:  # eval_on == "both"
    train_log = [entry for entry in log if entry["eval_on"] == "train"]
    val_log = [entry for entry in log if entry["eval_on"] == "val"]
    assert len(train_log) == len(val_log) == max_iter
```
We could also compute the score(s) of the estimator (after fit) on the same data and check that they match the last entry of the log.
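The suggested check could look roughly like this (a toy sketch with a fake log; the real entry layout may differ — here we assume each entry stores its value under a "score" key, and `final_score` stands in for `estimator.score(X, y)`):

```python
import math

# Hypothetical log produced by ScoringMonitor during fit.
log = [{"eval_on": "train", "score": 0.5}, {"eval_on": "train", "score": 0.8}]

# Stand-in for the score computed on the fitted estimator after fit.
final_score = 0.8

# The check: the last logged score should match the post-fit score.
assert math.isclose(log[-1]["score"], final_score)
```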
```python
MetaEstimator(est).fit(X=X, y=y, X_val=X_val, y_val=y_val)

# with metadata-routing enabled and requested
est.set_fit_request(X_val=True, y_val=True)
```
I think it would make sense to set those requests on the ScoringMonitor estimator itself (maybe even by default), and then have the CallbackSupportMixin automatically request this metadata if there is a registered callback that requests them on the estimator instance?
I think it would make sense to set those requests on the ScoringMonitor estimator itself
You mean something like ScoringMonitor().set_on_fit_task_begin_request("X_val") ?
It would not work because the estimator might be able to pass some values for some tasks but not for others, so we'd get an error. I don't think that we should make the callbacks metadata routers.
I naively thought about something like:
`ScoringMonitor().set_on_fit_task_end_request(X_val=True, y_val=True)`
but in the end I don't like it, and it might not work.
Instead, we could have BaseEstimator.set_callbacks automatically call:
`self.set_fit_request(X_val=True, y_val=True)`
if one of the callbacks declares that it needs extra requests in some kind of public attribute.
We can probably defer this design discussion to a follow-up issue and keep the manual est.set_fit_request(X_val=True, y_val=True) for now.
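A rough sketch of that idea (everything here is hypothetical API, including the `required_fit_metadata` attribute name, just to illustrate the mechanism):

```python
class ScoringMonitorSketch:
    """Stand-in for ScoringMonitor; declares the extra fit params it
    needs in a public attribute (hypothetical name)."""
    required_fit_metadata = ("X_val", "y_val")


class BaseEstimatorSketch:
    """Stand-in for BaseEstimator with the set_callbacks hook."""

    def set_callbacks(self, *callbacks):
        self._callbacks = callbacks
        requested = set()
        for cb in callbacks:
            requested |= set(getattr(cb, "required_fit_metadata", ()))
        # The real implementation would instead call
        # self.set_fit_request(**{name: True for name in requested}) here.
        self._requested_fit_metadata = {name: True for name in requested}
        return self


est = BaseEstimatorSketch().set_callbacks(ScoringMonitorSketch())
```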
We can only set_fit_request on estimators (that inherit from BaseEstimator). Callbacks don't do this and don't expose the set_request methods. I agree with @jeremiedbb that it is fine to pass metadata as usual params, instead of routing them to the callback.
In a separate discussion, we came to the conclusion that we have 2 options regarding metadata-routing:
- don't make callbacks routers (as being done currently).
- make callbacks routers and turn every estimator into a router as well.
The former is obviously a lot simpler but has limitations. For instance, it doesn't allow passing different sample weights for the fitting and for the scoring.
I think that this use case is quite niche and that for now we can go with the first option. We can reconsider the second option later if we really want to enable the advanced use cases.
```python
    X=X, y=y
)
subcontext = callback_ctx.subcontext(task_id=i, task_name="iteration")
subcontext.call_on_fit_task_begin(X=X, y=y, metadata=metadata)
```
I would suggest not calling this "metadata", to not be confused with metadata routing, which is a different concept and which this doesn't relate to. I think this could otherwise be very confusing for contributors.
This was introduced in #33572 as

```python
VALID_HOOK_PARAMS_IN = ["X", "y", "metadata", "reconstruction_attributes"]
VALID_HOOK_PARAMS_OUT = ["X", "y", "metadata", "fitted_estimator"]
```

I was wondering if we could call it fit_extras instead.
An LLM also suggests hook_payload, which, after what I have read about what "payload" means in computing, fits even better, because it is generic and signals that this is data coming with this specific event.
to not be confused with metadata routing
But it is the metadata from metadata-routing, see https://github.com/scikit-learn/scikit-learn/pull/33407/changes#diff-9d7ab703cc6480acfdb50d80729bb33776dcf3269a07a8dedd1d4383dcb281abR372
But it is the metadata from metadata-routing,
I agree that currently this dict contains the routed metadata. What I mean is that it shouldn't be named as if it is part of the routing. What I am thinking is:
a) metadata is not part of the routing conceptually, since the sub-estimator's method is the end-consumer (even if it passes it forward to another function or to a callback).
b) Metadata routing generically supports any routed metadata, so users can define custom estimators and be sure their metadata gets routed to them if they use them in a scikit-learn meta-estimator. I think we wouldn't want to expose every param that gets passed through metadata routing to the callbacks, right? So if we pick information that gets used in implemented callbacks, there will be metadata that had been routed but which is not part of the metadata param that gets passed to a callback.
c) We could also pass something that was computed/defined on the sub-estimator's method directly for a different callback than ScoringMonitor. We actually have to, because there is no other param available (except the reconstruction attributes).
In the SLEP, I read:

> - "metadata": a dictionary containing training and validation metadata, e.g. sample weights, `X_val`, `y_val`, etc.
So my impression is that the idea of this parameter is to pass something that is known at the estimator level to the callback hook, and it could be something routed to the estimator or something only the estimator knows. For the reconstruction parameters, you have now defined a separate parameter, but what if the estimator splits data into train and validation sets internally, or what about passing n_iter, or loss, or best_score_so_far?
Regarding the name, that's what I put on the SLEP and people are currently voting on that, so we can't really change it now. I mean we could, but that would give a bad image.
I think we wouldn't want to expose every param, that gets passed though metadata routing, to the callbacks, right?
I think that we want to pass everything that is routed. Then it's up to the callback to do something with it (don't use any, use all, only use specific metadata). Maybe in sklearn we'll never use all the possible metadata in built-in callbacks, but we leave the door open for custom callbacks to use them.
We could also pass something that was computed/defined on the sub-estimator's method directly
That won't be included in metadata. Instead it will be added as an extension of the SLEP. For this SLEP we wrote the signature with the minimal set of params required for progress bars and score monitoring. We're confident that it will also be enough for early stopping and snapshots. For structured logging we might indeed want to pass new information and we'll extend the api then.
I think that we want to pass everything that is routed.
Something that was computed/defined on the sub-estimator's method directly
That won't be included in metadata.
I see. Thanks for your clarifications. If all routed params will be in metadata and we do not intend to put anything else into it, then metadata would be the best param name indeed. (I think I had missed the discussions and the reasons for the change scikit-learn/enhancement_proposals@3cdaa00 in the SLEP.)
What I understand from what you write, @jeremiedbb, is that metadata is supposed to be everything that gets routed into the estimator's fit.
But there are a few things that are still unclear to me. I will try to write down what I think, but no pressure for a long answer. I mainly want to write this down cleanly for myself:
For sample_weight, ScoringMonitor should use the sample_weight routed to the scorer, since it is a scoring callback. But instead it uses the sample_weight available in the estimator's fit method. Users can set different requests on fit and score, also a different sample_weight (I'm not sure if it is a good idea to do so, though). So there is a chance for things going wrong, I think.
Additionally, sample_weight can also be passed directly (as test_sample_weights_and_metadata_routing shows), so in this case, sample_weight comes without routing and is strictly not routed metadata.
If metadata only contains de facto routed params, passing X_val and y_val to the callback can currently only happen by setting a metadata request for them on HistGradientBoosting (which is the only estimator currently consuming these and the only one able to set a request). Does that mean only HistGradientBoosting can use ScoringMonitor with eval_on="val" or "both" if routing is enabled?
Would other estimators that calculate similar values then not be able to use ScoringMonitor with eval_on="val" or "both" or could they also pass their computed values directly, similar to sample_weight which can be passed directly by the user?
(I didn't dig deep, but I found SGD using a validation_mask, and a few estimators using X_val and y_val after an internal train_test_split. It seems like they use validation sets internally.)
Here is another pass of review. (I haven't fully finished reviewing and there are a few things undefined on the metadata param).
To sum up my comment #33407 (comment), I am not sure we should use it exclusively for routed params, because it would require users to call estimators differently.
Thanks so far for your work @FrancoisPgm and @jeremiedbb. ❤️
```python
# with metadata-routing enabled and requested
est.set_fit_request(X_val=True, y_val=True)
MetaEstimator(est, n_outer=2, n_inner=3).fit(X=X, y=y, X_val=X_val, y_val=y_val)
```
Maybe in the end we could check that the X_val and y_val the callback received are equal (the same object?) to what was passed?
I don't see how we could do that with only the ScoringMonitor callback. We need the TestingCallback that records the data passed to the hooks. But that's part of the bigger issue of adding more tests for the hooks: #33324
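Such a TestingCallback could be quite small — a hypothetical sketch (hook names borrowed from the discussion above; the real callback protocol in the PR may differ):

```python
class TestingCallback:
    """Records every payload passed to the hooks so tests can assert on it."""

    def __init__(self):
        self.calls = []

    def on_fit_task_begin(self, **kwargs):
        self.calls.append(("on_fit_task_begin", kwargs))

    def on_fit_task_end(self, **kwargs):
        self.calls.append(("on_fit_task_end", kwargs))


# Toy usage: simulate an estimator invoking the begin hook once.
cb = TestingCallback()
cb.on_fit_task_begin(X=[[0.0]], y=[0], metadata={"X_val": None})
event, payload = cb.calls[0]
```

A test could then assert that `payload["metadata"]["X_val"]` is the exact array that was passed to fit.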
```python
# Without metadata-routing enabled, passing X_val and y_val gives an error
msg = re.escape(
    "[X_val, y_val] are passed but are not explicitly set as requested or not "
    "requested for MaxIterEstimator.fit"
)
```
I think we would not need the two checks if UnsetMetadataPassedError raises here. We already test this for every meta-estimator (not this test class, of course) and it works consistently.
```python
# error if sample_weight not requested
scorer = make_scorer(r2_score)
callback = ScoringMonitor(eval_on="train", scoring={"r2": scorer})
est = MaxIterEstimator().set_callbacks(callback)
with pytest.raises(
    TypeError,
    match=re.escape("score got unexpected argument(s) {'sample_weight'}"),
):
    est.fit(X=X, y=y, sample_weight=sample_weight)
```
As in the other test, we don't necessarily need this here, since this behaviour is already tested for every meta-estimator (not the MetaEstimator test class, of course). What do you think about keeping this as a separate test for the test class?
I'm leaving them for now while we're figuring out how we want to handle metadata routing. It helps make sure that the testing meta-estimator correctly implements metadata routing during this time.
```python
scorer = make_scorer(r2_score)
scorer.set_score_request(sample_weight=True)
callback = ScoringMonitor(eval_on="train", scoring={"r2": scorer})
MaxIterEstimator().set_callbacks(callback).fit(
    X=X, y=y, sample_weight=sample_weight
)
```
Here, sample_weight is still not routed via metadata routing to MaxIterEstimator.fit; it is only routed to the scorer.
It is passed directly into MaxIterEstimator.fit, and it is this sample_weight (not the routed one) that gets used in the callback. This means that if we routed a different sample_weight to the scorer, the user could be oddly surprised by differing results. We need to document this well, since there is no solution to fix this without metadata routing, as far as I'm aware.
For fixing this test (if we still want to test whether directly passed versus routed sample_weight is the same): if we want to route sample_weight to the estimator, we need to do a MaxIterEstimator().set_fit_request(sample_weight=True) (and pass it as a sub-estimator in a meta-estimator).
On a general note: in the meetings we were only talking about using metadata routing to make metadata available on the ScoringMonitor. This is not the whole picture.
We need to find solutions that work without metadata routing first, since we will keep passing arguments like sample_weight directly (as this test shows).
Then, we can add routing functionality.
sklearn/callback/_scoring_monitor.py
Outdated
```python
    return

if self.eval_on in ("train", "both"):
    score_params = metadata.get("train", {}).copy()
```
```python
score_params = metadata.get("train", {}).copy()
```

This is very misleading. We don't retrieve the score_params from metadata routing here, we only pass what estimator.fit has.
The base problem is that this is not passing the correct information to the scorer.
As a first step, we should document it well, in docstrings as well as via the variable names used. Therefore, it would be much better to keep calling it sample_weight, or more correctly, fit_sample_weight, or fit_metadata (if we want to be open to custom scorers).
score_params is metadata routing vocabulary. We are outside of metadata routing here and the scorer instance that may be used to score the estimator otherwise (as part of a cv splitter for instance) may receive a different set of score_params via metadata routing.
In MetaEstimator, you made me rename metadata to fit_params because it is what we're passing to fit.
Here, this is what we're passing to score, so it is score_params. And it does have to do with metadata routing, because if routing is enabled, score will raise if I don't pass sample_weight while the scorer requested it.
If we want to be robust and explicit, we can inspect what the scorer requests and take it from metadata. I'm not sure how to do that properly though.
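One possible shape of that inspection, written without scikit-learn internals — a hedged sketch assuming we can obtain the scorer's request flags as a plain dict (how to extract them from the real routing machinery is exactly the open question here):

```python
def select_score_params(scorer_requests, fit_metadata):
    """Keep only the params the scorer explicitly requested.

    scorer_requests: mapping of param name -> True/False request flag
    fit_metadata:    the metadata dict available on the estimator's fit
    """
    return {
        name: fit_metadata[name]
        for name, requested in scorer_requests.items()
        if requested and name in fit_metadata
    }


# Toy usage: the scorer requested sample_weight but not groups.
requests = {"sample_weight": True, "groups": False}
metadata = {"sample_weight": [1.0, 2.0], "groups": [0, 1]}
params = select_score_params(requests, metadata)
```

The callback would then pass only `params` to the scorer instead of everything fit received.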
sklearn/callback/_scoring_monitor.py
Outdated
```python
def __init__(self, *, eval_on="train", scoring):
    self.eval_on = eval_on
    self.scoring = scoring
    # Turn the scorer into a MultimetricScorer which can route score params
```
```diff
- # Turn the scorer into a MultimetricScorer which can route score params
+ # Turn the scorer into a _MultimetricScorer
```
Nit.
sklearn/callback/_scoring_monitor.py
Outdated
```python
scores = {"eval_on": eval_on}
scorer = self._estimator_scorers[context.estimator_name]
score_params = {k: v for k, v in score_params.items() if v is not None}
score = scorer(fitted_estimator, X, y, **score_params)
```
To verify my understanding: when we get the fitted_estimator from reconstruction attributes here, it is desirable that each fitted_estimator is a bit different from another, depending on the attributes learned up until the time this hook is called in fit.
I can see that for MaxIterEstimator, n_iter changes between the hook calls. But what would differ in other estimators? Would they differ by the param(s) passed as reconstruction_attributes into call_on_fit_task_end, or maybe also by some attribute of the estimator instance via the copy() in _from_reconstruction_attributes (accidentally or on purpose)?
Could you give some examples?
I think I can see: we only need to pass a subset of all the learned attributes (ending in _) to reconstruction_attributes, and only if we want to overwrite those the copied estimator exposes at this point in time anyway. Copying the estimator does most of the work of adjusting the estimator.
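To illustrate the mechanism, here is a simplified sketch of what `_from_reconstruction_attributes` conceptually does (not the actual implementation: the helper name and toy class below are illustrative only):

```python
import copy


def from_reconstruction_attributes(estimator, reconstruction_attributes):
    """Copy the partially fitted estimator and overwrite only the
    attributes that must differ at this point of fit (e.g. n_iter_)."""
    fitted = copy.copy(estimator)
    for name, value in reconstruction_attributes.items():
        setattr(fitted, name, value)
    return fitted


class ToyEstimator:
    pass


est = ToyEstimator()
est.coef_ = [1.0, 2.0]  # learned attribute already present on the instance

# Snapshot at iteration 3: only n_iter_ needs overwriting; the copy
# carries coef_ over as-is, and the original estimator is untouched.
snapshot = from_reconstruction_attributes(est, {"n_iter_": 3})
```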
```diff
  inner_ctx.call_on_fit_task_begin(X=X, y=y)

- est.fit(X, y)
+ est.fit(X=X, y=y, **fit_params)
```
I wonder if passing **params (containing **fit_params and **score_params) to each consumer would do the trick and prevent us from making every estimator and every callback into a metadata router.
We'd route a bit more information than before, and we'd need to have an additional **params kwarg on consuming estimators.
But since score() can now be called from inside fit when a ScoringMonitor callback is set (or any other callback that uses sample_weight or X_val or user-specified kwargs), that would not be so surprising.
The callback could then use the routed information (for instance if a sample_weight=True/False request is set on a scorer) or require the user to set it on its scorer if sample_weight is not set yet.
```python
if select == "most_recent":
    return logs[-1] if logs else {}

return logs
```
Do we want to give users the option to delete the logs? Currently, they accumulate and the users don't have any control over this.
```diff
  return logs
+
+ def clear_logs():
+     ....
```
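A hedged sketch of how such cleanup could fit alongside the existing log retrieval (class and method names are hypothetical; only the `select="most_recent"` behaviour mirrors the quoted diff):

```python
class LogStore:
    """Accumulates per-run log entries; users can clear them to bound memory."""

    def __init__(self):
        self._logs = []

    def append(self, entry):
        self._logs.append(entry)

    def get_logs(self, select="all"):
        # Mirrors the quoted diff: "most_recent" returns the last entry,
        # or an empty dict when nothing has been logged yet.
        if select == "most_recent":
            return self._logs[-1] if self._logs else {}
        return list(self._logs)

    def clear_logs(self):
        self._logs.clear()


store = LogStore()
store.append({"run": 1})
store.clear_logs()
```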
```
- "data": the recorded scores for the run. Each score value is associated
  with the detailed context of the score computation.
```
Just a suggestion (since I had struggled with this), feel free to not apply.
```diff
- - "data": the recorded scores for the run. Each score value is associated
-   with the detailed context of the score computation.
+ - "data": the recorded scores for each step of the run. Each score
+   value is annotated with the path of the task leading to score
+   computation.
```
Reference Issues/PRs
Towards #27676
What does this implement/fix? Explain your changes.
Add the ~~MetricMonitor~~ `ScoringMonitor` callback, which evaluates a ~~metric~~ scorer on an estimator during fit and logs the values.

AI usage disclosure
I used AI assistance for:
Any other comments?
ping @jeremiedbb
EDIT: the callback previously called `MetricMonitor` is now renamed `ScoringMonitor` to avoid confusion, since it uses scorers to compute the metric. An actual `MetricMonitor`, which would take a scikit-learn metric as an argument, is considered for a future PR.