Skip to content

ENH: Display parameters in HTML representation#30763

Merged
glemaitre merged 145 commits intoscikit-learn:mainfrom
DeaMariaLeon:get-params
May 21, 2025
Merged

ENH: Display parameters in HTML representation#30763
glemaitre merged 145 commits intoscikit-learn:mainfrom
DeaMariaLeon:get-params

Conversation

@DeaMariaLeon
Copy link
Copy Markdown
Member

@DeaMariaLeon DeaMariaLeon commented Feb 3, 2025

Reference Issues/PRs

Working on first point on #26595
Fixes: #21266

What does this implement/fix? Explain your changes.

  • This code allows to visualise the estimator's parameter values. It adds an interactive dropdown hidden by default.
  • Copy paste button added. When clicking on it, the parameter's name is saved to the clipboard.

Any other comments?

@glemaitre : WDYT?

@github-actions
Copy link
Copy Markdown

github-actions bot commented Feb 3, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 3d5804f. Link to the linter CI: here

@DeaMariaLeon DeaMariaLeon changed the title WIP: get_params() work - not ready for review WIP: get_params() work Feb 3, 2025
@glemaitre glemaitre self-requested a review February 6, 2025 15:28
@betatim
Copy link
Copy Markdown
Member

betatim commented Feb 7, 2025

This looks super cool! What about a more descriptive title for the PR? Every time I see a notification related to it pop up I think: what get_params is broken??!! To one second later realise what this PR is really about :D

Maybe something like "Display parameters in HTML representation"?

@DeaMariaLeon DeaMariaLeon changed the title WIP: get_params() work WIP: Display parameters in HTML representation Feb 7, 2025
@DeaMariaLeon
Copy link
Copy Markdown
Member Author

Here is the example again: Column Transformer with Mixed Types

@ArturoAmorQ
Copy link
Copy Markdown
Member

ArturoAmorQ commented May 13, 2025

Sorry for jumping in. I don't want to bring noise to a PR that already has 6 participants. I just wanted to share that I had a look on the render linked in #30763 (comment) and I noticed a change in behavior.

  • On main:
    image
  • This PR:
    image

Personally I find it more useful to know the column names than knowing they were selected using a make_column_selector object. Is there a way to retrieve the actual names of the selected columns from the object?

@DeaMariaLeon
Copy link
Copy Markdown
Member Author

DeaMariaLeon commented May 13, 2025

@ArturoAmorQ thanks for bringing it up!.. it's a bug..
The names of the columns are shown correctly before make_column_selector is used (the first diagram in the document is fine).

@glemaitre
Copy link
Copy Markdown
Member

I don't think that there is a bug here. @ArturoAmorQ I think that you click on the first pipeline in main and the second pipeline in this PR. In main, it is the rendering of the the second pipeline:

image

I was a bit surprise because the code never execute any function when it comes to diagram. However, it could be a further improvement potentially by recording the column name once fit is called and show them once the pipeline is fitted. Before fit, we cannot do better than the signature of the object. We could also the representation of the make_column_selector function to have a better string reprenstation.

@ArturoAmorQ
Copy link
Copy Markdown
Member

Actually it's me who read the example too fast. Sorry for that. There are 2 pipelines in the example and the second one uses the make_column_selector. Your PR has the same behavior as before.

Still I wonder if we can somehow retrieve the column names from the make_column_selector 🤔 In any case that would be for a different PR.

Sorry again for the noise.

@glemaitre
Copy link
Copy Markdown
Member

Thanks @trallard for the guideline. We definitely need some visual improvements with those diagrams. Basically, this PR was a good starting point to find out a couple of limitations: inconsistency between IDEs/doc rendering and dark/light theme management (among others).

We should probably limit the scope of this PR but we need to revisit specifically the look of those diagrams and those guidelines will be useful for sure. We can hopefully also reuse knowledge that we gained when developing a couple of add-ons in skrub.

@trallard
Copy link
Copy Markdown

+1 on limiting scope.
If there is anything I can help with just give me a ping whenever.

@GaelVaroquaux
Copy link
Copy Markdown
Member

@trallard , happy to have you around. Thanks a lot!

Copy link
Copy Markdown
Member

@GaelVaroquaux GaelVaroquaux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rendering looks fine to me (I no longer have issues with the contrast).

I do have one question on the code (inlined in the review).

In addition, I just wanted to check: this refactor does not change the protocol for subclasses in other packages compared to the one that we had before? If it does, we should document this clearly.

return self._repr_html_inner

def _repr_html_inner(self):
return self._html_repr()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this method? It seems to me that we could simply call _html_repr istead of this

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was taken from the original base.py code (that's the way it is in main). But I forgot to add its docstring:

This function is returned by the @property `_repr_html_` to make
`hasattr(estimator, "_repr_html_") return `True` or `False` depending
on `get_config()["display"]`. 

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I should do it differently of course I'll do it, but in the meantime I added the docstring.

@DeaMariaLeon
Copy link
Copy Markdown
Member Author

this refactor does not change the protocol for subclasses in other packages compared to the one that we had before?

No, it doesn't.

Copy link
Copy Markdown
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @GaelVaroquaux for the review. Looks good on my side. Let's merge.

Thanks @DeaMariaLeon it is a nice improvement.

@glemaitre glemaitre merged commit d077f82 into scikit-learn:main May 21, 2025
36 checks passed
@DeaMariaLeon
Copy link
Copy Markdown
Member Author

Thank you All!

@DeaMariaLeon DeaMariaLeon deleted the get-params branch May 21, 2025 08:36
@GaelVaroquaux
Copy link
Copy Markdown
Member

GaelVaroquaux commented May 21, 2025 via email

@betatim
Copy link
Copy Markdown
Member

betatim commented May 21, 2025

Whoop whoop 🥳

@GaelVaroquaux
Copy link
Copy Markdown
Member

GaelVaroquaux commented May 21, 2025 via email

@betatim
Copy link
Copy Markdown
Member

betatim commented May 21, 2025

I tried this out (for the first time) just now. One thing I noticed is that there is a "copy" button for each parameter. Not having paid attention to the discussion.design of this I approached it like an uneducated user. My first thought was that maybe it will copy the argument and value (n_neighbors=20) but it copies only the parameter name (n_neighbors). Overall I was wondering what the idea behind the feature was?

Screenshot 2025-05-21 at 13 01 31

@glemaitre
Copy link
Copy Markdown
Member

glemaitre commented May 21, 2025

@betatim The feature is particularly useful when you have nested pipeline and column transformer to get the fully qualified name, e.g. histgradientboostinclasssifier__max_depth. With a single estimator, it is a bit less pertinent. You can see it as an alternative to call pipeline.get_params to find the fully qualified name to copy it and paste it in pipeline.set_params (e.g. when doing the parameter grid in GridSearchCV.

We started only by copying the parameter name but I don't mind extending to the parameter value if we think it is useful.

@betatim
Copy link
Copy Markdown
Member

betatim commented May 21, 2025

That makes sense. Maybe something to show off in the release highlight to help others realise why copying the name is useful

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Display all parameter values in a tabular for a tab of the notebook HTML repr of estimators

7 participants