Skip to content

ENH: Display the number and names of output features#31937

Merged
jeremiedbb merged 207 commits intoscikit-learn:mainfrom
DeaMariaLeon:features2
Apr 15, 2026
Merged

ENH: Display the number and names of output features#31937
jeremiedbb merged 207 commits intoscikit-learn:mainfrom
DeaMariaLeon:features2

Conversation

@DeaMariaLeon
Copy link
Copy Markdown
Member

@DeaMariaLeon DeaMariaLeon commented Aug 13, 2025

Reference Issues/PRs

Towards #26595

Any other comments?

Example
Screenshot 2025-08-20 at 18 14 43

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Aug 13, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 2005a4e. Link to the linter CI: here

@DeaMariaLeon DeaMariaLeon changed the title WIP: Display the shape of outgoing data structures WIP: Display the number of outgoing data structures Aug 18, 2025
@DeaMariaLeon DeaMariaLeon changed the title WIP: Display the number of outgoing data structures WIP: Display the number of output features Aug 18, 2025
@DeaMariaLeon DeaMariaLeon marked this pull request as ready for review August 21, 2025 09:03
@DeaMariaLeon
Copy link
Copy Markdown
Member Author

I wonder if I can have feedback before I add/fix more tests.
@glemaitre

@DeaMariaLeon DeaMariaLeon changed the title WIP: Display the number of output features ENH: Display the number of output features Aug 21, 2025
@glemaitre glemaitre self-requested a review August 22, 2025 09:03
@glemaitre
Copy link
Copy Markdown
Member

I see that we have to handle one specific case:

image

We have an internal PassThrough transformer that forward the input feature as-is and this it means that we should mention that the output feature are the same as n_features_in_.

@jeremiedbb
Copy link
Copy Markdown
Member

I find the block a bit big, it takes as much space as the estimator itself. I was also thinking that having the input features would be nice but then it really starts to take a lot of space around the estimator. So I wondered if the features could be intermediate blocks in the diagram, representing both the output features from the previous estimator and the input features for the next estimator. Something like this
output_features

This way the diagram alternates estimator blocks and data blocks
Then the text would be different obviously, like "16 features", or even the full shape ?

In addition, in this PR or in a following one, the data block could be unfold to show the feature names if available.

@glemaitre
Copy link
Copy Markdown
Member

One feedback of @ogrisel IRL is to directly show the feature names using the same pattern than "Parameters".

I personally agree with @jeremiedbb feedback: I would like something smaller. Also write now, we have to mention "output features" instead of simply "features" because of the ambiguity input/output when attached to the estimator. So the proposal to make the "feature" being blocks leaving on their own is nice I think because there is not ambiguity anymore.

@DeaMariaLeon
Copy link
Copy Markdown
Member Author

I'll work on this, thanks for the feedback. Just:

One feedback of @ogrisel IRL is to directly show the feature names using the same pattern than "Parameters".

Should I add the feature names on this PR? I remember @glemaitre saying that they should be added on a separate PR.

@glemaitre
Copy link
Copy Markdown
Member

Should I add the feature names on this PR?

I want to dissociate it at first but since we are going to create a new block, it might be better to have directly the feature names as well.

@DeaMariaLeon
Copy link
Copy Markdown
Member Author

If only one feature is processed (like in the Normalizer below), would it be too much trouble to dynamically make it singular (i.e., "1 feature" instead of "1 features")?

I changed that.

@DeaMariaLeon DeaMariaLeon moved this from In progress to Blocked in Labs Apr 10, 2026
Copy link
Copy Markdown
Contributor

@AnneBeyer AnneBeyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for your work, @DeaMariaLeon! This is really not straight forward and the output looks really good!
Sorry for taking so long, I checked all included examples visually and tried to understand the code changes. Here is my review so far, only some small suggestions. I still have two files to go, but I didn't want to keep you waiting even longer.

Comment thread examples/compose/plot_column_transformer_mixed_types.py

if est_block.names == "NoneType(...)":
est_block.names = "passthrough"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for adding a FIXME/TODO comment for a follow-up PR.

In lines 348 and 356

not est_block.names = "passthrough"

is part of the condition being checked, but it seems that it never actually is "passthrough" (at least when debugging the code snippet from above), so either this needs to be set already in _get_visual_block (or even in ColumnTransformer's_sk_visual_block_()?) or this condition in the lines above is actually redundant? I haven't found the exact location yet where this NoneType is set. It seems to be related to names=estimator.__class__.__name__, in _get_visual_block when estimator is (None,), but I got lost in the recursion. Maybe you can find it faster?

Comment thread sklearn/utils/_repr_html/features.py
Comment thread sklearn/utils/_repr_html/features.css
Comment thread sklearn/utils/_repr_html/features.py
Comment thread sklearn/utils/_repr_html/tests/test_js.py
Copy link
Copy Markdown
Contributor

@AnneBeyer AnneBeyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here is my feedback on the remaining two files.

Comment thread sklearn/utils/_repr_html/estimator.js
Comment thread sklearn/utils/_repr_html/estimator.js
Comment thread sklearn/utils/_repr_html/tests/test_features.py
Comment thread sklearn/utils/_repr_html/tests/test_features.py
Copy link
Copy Markdown
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A small unfinished review I had pending on my side and forgot to submit.

Comment thread sklearn/utils/_repr_html/tests/test_features.py Outdated
Comment thread sklearn/utils/_repr_html/tests/test_features.py Outdated
@DeaMariaLeon
Copy link
Copy Markdown
Member Author

+1 for adding a FIXME/TODO comment for a follow-up PR.
In lines 348 and 356

not est_block.names = "passthrough"

Thanks, I removed that. I believe a FIXME is not needed (unless the tests fail of course).

@DeaMariaLeon DeaMariaLeon removed the request for review from glemaitre April 14, 2026 14:31
Copy link
Copy Markdown
Contributor

@AnneBeyer AnneBeyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now, thank you @DeaMariaLeon!
Could we merge this @ogrisel?

Copy link
Copy Markdown
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks everyone !

@jeremiedbb
Copy link
Copy Markdown
Member

@DeaMariaLeon I checked the modified examples and when the html repr is too large, there's no horizontal scrollbar:
tmp_html

I guess we didn't notice so far because we only showed small displays. It doesn't seem to be a bug introduced in this PR but it makes the rendering of the modified examples not great. I don't know if and how it would be possible to have a scrollbar here.

@jeremiedbb
Copy link
Copy Markdown
Member

or maybe for now we don't show the estimator in this example.

@DeaMariaLeon
Copy link
Copy Markdown
Member Author

@jeremiedbb are we sure that there should be an horizontal scroll bar?
I can still move the display around to see the right side of the estimator

@jeremiedbb
Copy link
Copy Markdown
Member

What's your browser ? I'm on firefox and I can't

@DeaMariaLeon
Copy link
Copy Markdown
Member Author

chrome and safari, both work. It's not introduced with this PR in any case because if you open this example in main, there is no horizontal scroll bar either: https://scikit-learn.org/stable/auto_examples/compose/plot_column_transformer_mixed_types
(If you open columntransformer and then the right transformer "cat" it gets very wide too).

@jeremiedbb
Copy link
Copy Markdown
Member

Indeed. And on firefox I can't move it either. I'm goind to open a dedicated issue. Let's merge this one !

@DeaMariaLeon
Copy link
Copy Markdown
Member Author

oh.. but I'm doing it with the trackpad, not the mouse! it works that way with firefox too (not the mouse but the trackpad inside the laptop).

Thanks a lot everybody!

@jeremiedbb jeremiedbb merged commit d86a1df into scikit-learn:main Apr 15, 2026
38 checks passed
@github-project-automation github-project-automation Bot moved this from Blocked to Done in Labs Apr 15, 2026
@jeremiedbb
Copy link
Copy Markdown
Member

oh.. but I'm doing it with the trackpad, not the mouse! it works that way with firefox too (not the mouse but the trackpad inside the laptop).

hum, but it's not a laptop for me, there's no trackpad 😄

@DeaMariaLeon DeaMariaLeon deleted the features2 branch April 15, 2026 16:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants