Pickling of sklearn.tree.Tree #521
Closed
Description
When running the code below, the line

reLoadCLF = pickle.loads(pickle.dumps(clf))

crashes with

File "sklearn/tree/_tree.pyx", line 601, in sklearn.tree._tree.Tree.__cinit__
ValueError: Buffer dtype mismatch, expected 'SIZE_t' but got 'long long'

Note that I am pickling and unpickling directly to and from a byte string rather than writing to a file; the same error occurs with a file object backed by an io buffer.

I think this should work, since we are not mixing 64-bit and 32-bit Python, and the library versions are the same on both sides. I have tried different pickle protocols and different scikit-learn classifiers, to no avail. Is there a way to do this with the existing codebase, or is this a bug?
#Random forest classifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
#import pandas as pd
#loading the dataset
iris = load_iris()
X = iris.data
print(X)
y = iris.target
print(y)
#Split the data
X_train, X_test, y_train, y_test = train_test_split(X,y,random_state=42,test_size=0.5)
#Build the model
clf = RandomForestClassifier(n_estimators=10)
#Train the classifier
clf.fit(X_train, y_train)
#Predictions
predicted = clf.predict(X_test)
#Check accuracy
print(accuracy_score(predicted, y_test))
import pickle
reLoadCLF = pickle.loads(pickle.dumps(clf))
print('reLoad & go')
predictedRL = reLoadCLF.predict(X_test)
print(accuracy_score(predictedRL, y_test))
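
To help narrow down the mismatch described above, here is a minimal diagnostic sketch. It prints the pointer width of the running interpreter next to the sizes of NumPy's `intp` and `long long` types, and round-trips an object through an `io.BytesIO` file object with a chosen pickle protocol. The helper names `platform_summary` and `roundtrip_via_buffer` are hypothetical, and the idea that `SIZE_t` follows the pointer-sized integer type is an assumption rather than something confirmed in this report.

```python
import io
import pickle
import struct
import sys

import numpy as np


def platform_summary():
    """Collect integer widths that may be relevant to a 'Buffer dtype mismatch'.

    Assumption: SIZE_t tracks the platform's pointer-sized integer (np.intp).
    """
    return {
        "python": sys.version.split()[0],
        # Size of a C pointer in this interpreter, in bits.
        "pointer_bits": struct.calcsize("P") * 8,
        # NumPy's pointer-sized integer type.
        "intp_bits": np.dtype(np.intp).itemsize * 8,
        # The C 'long long' type named in the error message.
        "longlong_bits": np.dtype(np.longlong).itemsize * 8,
    }


def roundtrip_via_buffer(obj, protocol=pickle.HIGHEST_PROTOCOL):
    """Pickle obj into an in-memory file object and load it back."""
    buffer = io.BytesIO()
    pickle.dump(obj, buffer, protocol=protocol)
    buffer.seek(0)  # rewind before reading
    return pickle.load(buffer)


if __name__ == "__main__":
    for name, value in platform_summary().items():
        print(f"{name}: {value}")
    # A plain object round-trips regardless of integer widths; swapping in
    # the fitted classifier would exercise the failing path instead.
    print(roundtrip_via_buffer({"a": [1, 2, 3]}))
```

If `pointer_bits` and `longlong_bits` differ, the interpreter is effectively a 32-bit build, which might explain the error even when the same environment does both the pickling and the unpickling.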