FeatureUnion: Add verbose_feature_names_out parameter #25889
Description
Describe the workflow you want to enable
ColumnTransformer has the option to specify whether or not to "prefix all feature names with the name of the transformer that generated that feature" using the verbose_feature_names_out parameter.
FeatureUnion does not have this option. As a user I would like to control whether or not I want to have the prefix added. Right now, the prefix is added without any option to turn that functionality off.
Describe your proposed solution
I propose adding the same parameter that exists in ColumnTransformer to FeatureUnion so that the user can decide whether to keep the prefix.
Describe alternatives you've considered, if relevant
The user can remove the prefix manually after FeatureUnion has transformed the data
This requires identifying the columns in question and removing the prefix. Removing the prefix via a simple str.split("__")[1] could cause problems if the feature names already include __; in that case the user needs a more complicated way to remove the prefix.
The user can write their own implementation of FeatureUnion that changes the method in question
import numpy as np
from sklearn.pipeline import FeatureUnion


class NewFeatureUnion(FeatureUnion):
    def get_feature_names_out(self, input_features=None):
        """
        Overrides the original method so that feature names are not prefixed.
        """
        feature_names = []
        # _iter() is a private helper of FeatureUnion and may change between releases.
        for name, trans, _ in self._iter():
            if not hasattr(trans, "get_feature_names_out"):
                raise AttributeError(
                    "Transformer %s (type %s) does not provide get_feature_names_out."
                    % (str(name), type(trans).__name__)
                )
            # Keep each transformer's names as-is instead of adding "<name>__".
            feature_names.extend(trans.get_feature_names_out(input_features))
        return np.asarray(feature_names, dtype=object)
This adds more code to the user's codebase that needs to be maintained. It could also cause confusion, because there is no obvious class name that explains what the new implementation does, so another developer would have to find the code snippet to figure it out.
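For the first alternative (removing the prefix manually), a minimal sketch of a safer approach than str.split("__")[1] might look like the following. The helper `strip_transformer_prefix` and the sample feature names are hypothetical and not part of scikit-learn. The sketch assumes the user knows the transformer names passed to FeatureUnion and strips only those exact prefixes.

```python
def strip_transformer_prefix(feature_names, transformer_names):
    """Remove a leading "<transformer name>__" from each feature name.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    prefixes = tuple(f"{name}__" for name in transformer_names)
    stripped = []
    for feature in feature_names:
        for prefix in prefixes:
            if feature.startswith(prefix):
                # Cut only the leading prefix; later "__" in the name is kept.
                feature = feature[len(prefix):]
                break
        stripped.append(feature)
    return stripped


# Made-up names; the second one already contains "__" in the feature itself.
names = ["std__x0", "std__my__feature", "minmax__x0"]

naive = [n.split("__")[1] for n in names]
safe = strip_transformer_prefix(names, ["std", "minmax"])

print(naive)  # the second name is truncated to "my"
print(safe)   # the second name is preserved as "my__feature"
```

Even with the safe variant, the stripped names can collide (here `x0` appears twice), which the user would still need to handle.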
Additional context
There might be a good reason why we don't want to have the same functionality in FeatureUnion that is available for ColumnTransformer. This is my first issue so I might be missing something.
A one-to-one implementation of the parameter might break things that are dependent on FeatureUnion. I am not that familiar with the rest of the codebase.
I think I would be able to create a merge request for it if this parameter request is approved.