ENH: Display the number and names of output features by DeaMariaLeon · Pull Request #31937 · scikit-learn/scikit-learn

DeaMariaLeon · 2025-08-13T07:03:12Z

Reference Issues/PRs

Towards #26595

Any other comments?

Example

github-actions · 2025-08-13T07:04:09Z

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 2005a4e. Link to the linter CI: here

DeaMariaLeon · 2025-08-21T09:04:17Z

I wonder if I can have feedback before I add/fix more tests.
@glemaitre

glemaitre · 2025-08-22T09:19:23Z

I see that we have to handle one specific case:

We have an internal PassThrough transformer that forwards the input features as-is, which means that we should mention that the output features are the same as n_features_in_.

jeremiedbb · 2025-08-22T09:42:11Z

I find the block a bit big: it takes as much space as the estimator itself. I was also thinking that having the input features would be nice, but then it really starts to take a lot of space around the estimator. So I wondered if the features could be intermediate blocks in the diagram, representing both the output features from the previous estimator and the input features for the next estimator. Something like this:

This way the diagram alternates estimator blocks and data blocks.
Then the text would be different obviously, like "16 features", or even the full shape?

In addition, in this PR or in a following one, the data block could be unfolded to show the feature names if available.

glemaitre · 2025-08-22T09:57:02Z

One piece of feedback from @ogrisel IRL is to show the feature names directly, using the same pattern as "Parameters".

I personally agree with @jeremiedbb's feedback: I would like something smaller. Also, right now we have to mention "output features" instead of simply "features" because of the input/output ambiguity when the text is attached to the estimator. So the proposal to make the "features" blocks that live on their own is nice, I think, because there is no ambiguity anymore.

DeaMariaLeon · 2025-08-22T10:12:00Z

I'll work on this, thanks for the feedback. Just:

One piece of feedback from @ogrisel IRL is to show the feature names directly, using the same pattern as "Parameters".

Should I add the feature names on this PR? I remember @glemaitre saying that they should be added on a separate PR.

glemaitre · 2025-08-25T09:03:56Z

Should I add the feature names on this PR?

I wanted to separate it at first, but since we are going to create a new block, it might be better to add the feature names directly as well.

DeaMariaLeon · 2026-03-18T16:34:57Z

Hi @DeaMariaLeon, here is a first pass of review (I haven't looked at the tests yet).

I don't see any changes made to plot_column_transformer_mixed_types. Could you try set_output(transform="pandas") as suggested here #31937 (comment) and as you have already done for plot_cyclical_feature_engineering?

I thought that his comment was just an explanation of the question you had. I didn't understand that I should actually make the change. I'll do it.

~~EDIT: Do you know how to do that? I either get an error, or keep getting just the "x0, x4, x5" etc. @antoinebaker~~

DeaMariaLeon · 2026-03-19T09:12:53Z

I did reply to #31937 (comment), but it's only visible on GitHub's "Files changed".

DeaMariaLeon · 2026-03-19T10:47:04Z

In this comment: #31937 (comment)
I wrote an "EDIT" that may be difficult to see. So I'll add that here just in case:

Hi @DeaMariaLeon, here is a first pass of review (I haven't looked at the tests yet).

I don't see any changes made to plot_column_transformer_mixed_types. Could you try set_output(transform="pandas") as suggested here #31937 (comment) and as you have already done for plot_cyclical_feature_engineering?

Me:

I thought that his comment was just an explanation of the question you had. I didn't understand that I should actually make the change. I'll do it.

~~Me again:~~
~~Do you know how to do that? I either get an error, or keep getting just the "x0, x4, x5" etc. @antoinebaker~~

antoinebaker · 2026-03-19T11:00:08Z

Do you know how to do that? I either get an error, or keep getting just the "x0, x4, x5" etc. @antoinebaker

Well, if the set_output(transform="pandas") does not work, don't bother :) Could you instead create a new issue with a screenshot for this example? (after this PR is merged)

DeaMariaLeon · 2026-03-20T08:05:24Z

Well, if the set_output(transform="pandas") does not work, don't bother :) Could you instead create a new issue with a screenshot for this example? (after this PR is merged)

Looking at this again, I fail to see the issue. In that particular example (plot_column_transformer_mixed_types), the inputs of SelectPercentile are not the original features. Its inputs are already transformed by OneHotEncoder, so how can it give the names of the original columns? I think that what it shows is correct, but I may be missing something.

antoinebaker · 2026-03-20T09:07:44Z

Its inputs are already transformed by OneHotEncoder, so how can it give the names of the original columns?

Not the original names but the names output by the OneHotEncoder:

sex, pclass -> OneHotEncoder -> sex_female, sex_male, pclass_1, pclass_2, ... -> SelectPercentile -> sex_female, sex_male, pclass_1, pclass_3

which makes the preprocessing much easier to follow than "anonymous" column names such as x0, x1, ...

DeaMariaLeon · 2026-03-20T10:31:00Z

You are right, it would be easier.

DeaMariaLeon · 2026-03-20T11:36:31Z

But it works:

DeaMariaLeon · 2026-03-20T11:41:17Z

I mean: it works with a small change to the example.

EDIT (note for myself): There was a known issue with OneHotEncoder, and I needed to set sparse_output=False on it in the example. That way one can use set_output(transform="pandas") without breaking the example.

DeaMariaLeon · 2026-03-20T12:28:49Z

Example can be seen from the built docs:
https://output.circle-artifacts.com/output/job/e9b6714c-0349-4347-8f40-4becd58d9e0d/artifacts/0/doc/auto_examples/compose/plot_column_transformer_mixed_types.html

DeaMariaLeon · 2026-03-20T12:48:48Z

Up to here, I think I have added all the feedback from @antoinebaker ~~except #31937 (comment) because of the circular import.~~

EDIT: Imported ColumnTransformer as suggested.

antoinebaker

Thanks for the PR @DeaMariaLeon! LGTM.

sklearn/compose/tests/test_column_transformer.py

DeaMariaLeon · 2026-03-23T16:41:45Z

Thanks @antoinebaker!

Adding features number

09c4525

github-actions bot added the module:utils label Aug 13, 2025

DeaMariaLeon added 8 commits August 13, 2025 11:38

Fixing ColumnTransformer plus styling

05b4dda

Clearer message

af30a8d

wip

d3f2ea2

works with single

68f1e7b

moving features

fd32e29

fix when not fitted

e7437d6

[ci skip] css work

e9510b1

Added - fixed test

08e7e82

DeaMariaLeon changed the title ~~WIP: Display the shape of outgoing data structures~~ WIP: Display the number of outgoing data structures Aug 18, 2025

DeaMariaLeon changed the title ~~WIP: Display the number of outgoing data structures~~ WIP: Display the number of output features Aug 18, 2025

DeaMariaLeon added 5 commits August 18, 2025 16:02

changing test_column_transformer

9d2c87c

replace is_fitted_css_class - not always true

9238ed0

fix when empty pipeline

a73f266

[doc build] Build examples

37146e1

[doc build] one test

9c93171

DeaMariaLeon marked this pull request as ready for review August 21, 2025 09:03

DeaMariaLeon changed the title ~~WIP: Display the number of output features~~ ENH: Display the number of output features Aug 21, 2025

glemaitre self-requested a review August 22, 2025 09:03

jeremiedbb mentioned this pull request Sep 10, 2025

Unexpected behavior of the HTML repr of meta-estimators #32146

Open

jeremiedbb added the frontend label Sep 12, 2025

DeaMariaLeon force-pushed the features2 branch from a2c76ba to 9c93171 Compare September 12, 2025 12:44

DeaMariaLeon added 8 commits March 18, 2026 17:52

add set_output to example

497f46f

lineup text in rst

23ae61e

More feedback work

954c73f

list compr

44ac903

testing

c8376c3

remove change to featureunion

7c140d0

remove set_output

df31f35

leave test the way it was for estimator

41ff5f3

remove line break

6034a8c

changed is_column_transformer

912d918

jeremiedbb mentioned this pull request Mar 20, 2026

FIX ColumnTransformer HTML display incorrect when all columns are transformed #33531

Merged

4 tasks

changed example to get column names - all transformers

e7ae5b7

feedback - correct is_column_transformer

bcd2bbc

antoinebaker approved these changes Mar 23, 2026

sklearn/compose/tests/test_column_transformer.py (outdated, resolved)

Correct comment

2202715

merge conflict

5ebdc94

jeremiedbb added this to the 1.9 milestone Mar 30, 2026

DeaMariaLeon mentioned this pull request Mar 31, 2026

POC: HTML displays structure change to be vertical #33647

Draft

4 tasks


8 participants