sklearn.base.clone cannot clone estimator with pandas data frame parameters #5522
Closed
Description
I am trying to create an estimator that takes a pandas data frame as one of its parameters, and I find that sklearn.base.clone fails to clone this estimator. The sample code is as follows:
from sklearn.base import BaseEstimator, TransformerMixin
import pandas as pd
from sklearn.base import clone

class DummyEstimator(BaseEstimator, TransformerMixin):
    """This is a dummy class for generating numerical features

    This feature extractor extracts numerical features from pandas data frame.

    Parameters
    ----------
    df: pandas data frame
        The pandas data frame parameter.

    Notes
    -----
    """
    def __init__(self, df):
        self.df = df

    def fit(self, X, y=None):
        pass

    def transform(self, X, y=None):
        pass

if __name__ == "__main__":
    # Generate a data frame
    d = {"a": [1, 2, 3],
         "b": [4, 5, 6],
         "c": [7, 8, 9]
         }
    df = pd.DataFrame(d)
    # Get an estimator instance
    de = DummyEstimator(df)
    # Clone the estimator
    ret = clone(de)
If you run the above code, you get the following error:
Traceback (most recent call last):
  File "bug.py", line 38, in <module>
    ret = clone(de)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/base.py", line 93, in clone
    if not equality_test:
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 730, in __nonzero__
    .format(self.__class__.__name__))
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
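The traceback suggests that clone ends up with a DataFrame in equality_test and then evaluates it in a boolean context. The snippet below is a minimal sketch of that failure mode using pandas only; it does not reproduce the actual sklearn code, and the names df_copy, result_type, raised, message and frames_equal are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})
df_copy = df.copy()  # stands in for the copied parameter (an assumption)

# Comparing two data frames with == is element-wise: the result is a
# DataFrame of booleans, not a single bool.
equality_test = df == df_copy
result_type = type(equality_test).__name__

# Using that DataFrame in a boolean context raises the same ValueError
# as in the traceback above.
raised = False
message = ""
try:
    if not equality_test:
        pass
except ValueError as exc:
    raised = True
    message = str(exc)

# DataFrame.equals returns a single boolean and is therefore unambiguous.
frames_equal = df.equals(df_copy)

print(result_type, raised, frames_equal)
```

This only isolates why the comparison fails. Whether clone should special-case data frames, or whether data frames should be passed as constructor parameters at all, is a separate question.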