Pipeline doesn't work with Label Encoder #3956
Closed
Description
I've found that I cannot use a pipeline with LabelEncoder. In the example below, I want to build a pipeline that first encodes the string labels as integers and then one-hot encodes those integers:
from sklearn.preprocessing import OneHotEncoder, LabelEncoder
from sklearn.pipeline import make_pipeline
import numpy as np
X = np.array(['cat', 'dog', 'cow', 'cat', 'cow', 'dog'])
enc = LabelEncoder()
hot = OneHotEncoder()
pipe = make_pipeline(enc, hot)
pipe.fit_transform(X)
However, the following error is returned:
lib/python2.7/site-packages/sklearn/pipeline.pyc in _pre_transform(self, X, y, **fit_params)
117 for name, transform in self.steps[:-1]:
118 if hasattr(transform, "fit_transform"):
--> 119 Xt = transform.fit_transform(Xt, y, **fit_params_steps[name])
120 else:
121 Xt = transform.fit(Xt, y, **fit_params_steps[name]) \
TypeError: fit_transform() takes exactly 2 arguments (3 given)
The problem seems to be that LabelEncoder's fit and fit_transform methods accept only a y argument, whereas the pipeline calls each step with an X and an optional y.
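One possible workaround, sketched here under assumptions rather than offered as an official fix, is to wrap LabelEncoder in a small adapter whose fit accepts X and an optional y, so that it matches the calling convention the pipeline uses. The class name `LabelEncoderStep` is hypothetical and not part of scikit-learn, and the sketch assumes a reasonably recent scikit-learn release.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import LabelEncoder, OneHotEncoder


class LabelEncoderStep(BaseEstimator, TransformerMixin):
    """Hypothetical adapter: exposes LabelEncoder through fit(X, y=None)."""

    def fit(self, X, y=None):
        # y is accepted only to match the pipeline's calling convention.
        self.encoder_ = LabelEncoder().fit(X)
        return self

    def transform(self, X):
        # OneHotEncoder expects a 2-D array, so return a single column.
        return self.encoder_.transform(X).reshape(-1, 1)


X = np.array(['cat', 'dog', 'cow', 'cat', 'cow', 'dog'])
pipe = make_pipeline(LabelEncoderStep(), OneHotEncoder())
encoded = pipe.fit_transform(X)

# OneHotEncoder may return a sparse matrix; convert it for inspection.
dense = encoded.toarray() if hasattr(encoded, "toarray") else np.asarray(encoded)
print(dense)
```

The adapter inherits fit_transform from TransformerMixin, which forwards X and y to fit, so the pipeline's three-argument call no longer fails. In newer scikit-learn releases OneHotEncoder can also encode string categories directly, in which case the LabelEncoder step may not be needed at all.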