FeatureUnion: Add verbose_feature_names_out parameter #25889

@sebmoh

Describe the workflow you want to enable

ColumnTransformer has the option to specify whether or not to "prefix all feature names with the name of the transformer that generated that feature" using the verbose_feature_names_out parameter.

FeatureUnion does not have this option. As a user I would like to control whether or not I want to have the prefix added. Right now, the prefix is added without any option to turn that functionality off.

Describe your proposed solution

I propose adding the same parameter that exists in ColumnTransformer to FeatureUnion, so that the user can decide whether or not the prefix is added.

Describe alternatives you've considered, if relevant

The user can remove the prefix manually after FeatureUnion has transformed the data
This requires identifying the affected columns and removing the prefix from each. A simple str.split("__")[1] could cause problems if the feature names already include __; in that case the user needs a more complicated way to remove the prefix.
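To make the problem with the simple split concrete, here is a small sketch. The feature name and the helper `strip_prefix` are hypothetical and only serve as an illustration.

```python
# Hypothetical prefixed name: transformer "scaler", original feature "price__usd".
prefixed = "scaler__price__usd"

# Naive removal keeps only the middle piece and loses "__usd".
naive = prefixed.split("__")[1]

# Splitting once keeps everything after the first "__".
split_once = prefixed.split("__", 1)[1]


def strip_prefix(feature_name, transformer_names):
    """Remove a leading '<transformer>__' only if it matches a known transformer."""
    for transformer in transformer_names:
        prefix = f"{transformer}__"
        if feature_name.startswith(prefix):
            return feature_name[len(prefix):]
    return feature_name


stripped = strip_prefix(prefixed, ["scaler", "encoder"])
print(naive, split_once, stripped)
```

Splitting once is enough as long as the transformer names themselves never contain __, while matching against the known transformer names avoids relying on that.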

The user can write their own implementation of FeatureUnion that changes the method in question

import numpy as np
from sklearn.pipeline import FeatureUnion

class NewFeatureUnion(FeatureUnion):
    def get_feature_names_out(self, input_features=None):
        """
        Override the original method so that feature names are not prefixed.
        """
        feature_names = []
        # Note: _iter() is a private helper of FeatureUnion.
        for name, trans, _ in self._iter():
            if not hasattr(trans, "get_feature_names_out"):
                raise AttributeError(
                    "Transformer %s (type %s) does not provide get_feature_names_out."
                    % (str(name), type(trans).__name__)
                )
            # Append the transformer's names as-is, without the "<name>__" prefix.
            feature_names.extend(trans.get_feature_names_out(input_features))
        return np.asarray(feature_names, dtype=object)

This adds more code to the user's codebase that needs to be maintained. It could also cause confusion, because no obvious class name explains what the new implementation does, so another developer would have to find the code snippet to figure that out.

Additional context

There might be a good reason why we don't want to have the same functionality in FeatureUnion that is available for ColumnTransformer. This is my first issue so I might be missing something.

A one-to-one implementation of the parameter might break things that are dependent on FeatureUnion. I am not that familiar with the rest of the codebase.

I think I would be able to create a pull request for it if this parameter request is approved.
