added unicode catch to predict_proba function in SVC#10338
added unicode catch to predict_proba function in SVC#10338JoshuaMeyers wants to merge 7 commits intoscikit-learn:masterfrom
Conversation
See EducationalTestingService/skll#87 Unicode handling in `.predict` had not been added to `.predict_proba`
|
Thank you for your PR. Are you sure this wasn't fixed in #7064 ? If so please add a unit test (or extend |
|
you're mixing tabs and spaces. Please use spaces. |
…nctions Previously, unicode test only tested `clf.fit`. In particular, `predict_proba` currently fails if unicode is provided to the `kernel` parameter
|
Thanks for your responses! @rth The code from the issue #7064 is the same but this was not applied to the @amueller My bad, I have fixed the spacing! It is necessary to catch unicode objects that are given to the |
| "the number of samples at training time" % | ||
| (X.shape[1], self.shape_fit_[0])) | ||
|
|
||
| if six.PY3: |
There was a problem hiding this comment.
Yeah I was confused as to why this didn't happen ;)
| clf = svm.SVC(kernel=unicode('linear'), probability=True) | ||
| clf.fit(X, Y) | ||
| clf.predict(T) | ||
| clf.predict_proba(T) |
There was a problem hiding this comment.
yep sounds a good idea, works on my local copy but just for completeness
|
worry less about lgtm and more about travis ;) Why is this not necessary in the sparse case? |
|
@amueller haha okay I'll focus on travis first. As for why this isn't necessary in the sparse case, the kernel is specified by its index in the sparse case which seems not to be sensitive to this... I've tested and indeed, the sparse versions do not error. |
# The order of these must match the integer values in LibSVM.
# XXX These are actually the same in the dense case. Need to factor
# this out.
_sparse_kernels = ["linear", "poly", "rbf", "sigmoid", "precomputed"]Maybe we should just do that? It's weird to have substantially different code paths for sparse and dense. If you'd rather not get involved in this, that's fine, too ;) |
|
Basically we just need to move |
|
It is still in progress? |
|
@qmick Feel free to take it if @JoshuaMeyers doesn't reply after some time. Note that we are using multiple ways to handle the problem currently. |
|
@qinhanmin2014 Thanks, I'll keep looking at it. |
|
Hey @qmick, feel free to have a go at this. I haven't had the time. Apologies! |
|
Thanks @JoshuaMeyers for your great work so far :) |
|
This issue was addressed by #10412. Closing this MR. |
What does this implement/fix? Explain your changes.
Adds mixed object type handling (unicode/string) to the
kernelparameter of SVM models.This was present for the
predictfunction, but neglected forpredict_probaFor a recreation of the error, see EducationalTestingService/skll#87