Description
I've been trying to use SGDClassifier in an online app (I need partial fitting) with loss='log', tuning the various parameters as best I can to replicate the results I get from a LogisticRegression classifier. The metric I'm optimizing is neg_log_loss. The only way I seem to get good results is to set 'n_iter' to some large value (~1000). If I leave 'n_iter' out, the results are very bad, which I think is due to the stochastic gradient descent stopping too early. However, I just saw in the scikit-learn docs that:
n_iter : int, optional
The number of passes over the training data (aka epochs). Defaults to None. Deprecated, will be removed in 0.21.
Changed in version 0.19: Deprecated
On my local machine (running Python 3.6.1, scikit-learn 0.19.0), I see no error when using 'n_iter' and the results are good. However, when I run the same code in a web app (PythonAnywhere) with Python 3.6 and scikit-learn 0.19.0, I get the error: AttributeError: 'SGDClassifier' object has no attribute 'n_iter'.
My question is: if 'n_iter' is now deprecated, how do I prevent the algorithm from stopping early? And why does it work on my local machine but not on the web host, when both report the same scikit-learn version? Thanks!
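
For reference, here is a minimal sketch of what I understand the non-deprecated route to look like, assuming scikit-learn 0.19 or later, where `max_iter` and `tol` were introduced alongside the `n_iter` deprecation. Setting `tol=None` should switch off the tolerance-based stopping check, so `fit` runs exactly `max_iter` epochs. As far as I can tell from the docs, `partial_fit` performs a single epoch per call, so for the online case the epochs have to be looped by hand. The toy data, the helper names (`make_toy_data`, `fit_fixed_epochs`, `partial_fit_epochs`) and the version check for the loss name are my own illustration, not part of scikit-learn.

```python
import numpy as np
import sklearn
from sklearn.linear_model import SGDClassifier

# Assumption: the logistic loss is spelled "log" before scikit-learn 1.1
# and "log_loss" from 1.1 onwards, so pick the name from the installed version.
_major, _minor = (int(part) for part in sklearn.__version__.split(".")[:2])
LOG_LOSS = "log_loss" if (_major, _minor) >= (1, 1) else "log"


def make_toy_data(n_samples=200, seed=0):
    """Hypothetical linearly separable data, only for illustration."""
    rng = np.random.RandomState(seed)
    X = rng.randn(n_samples, 2)
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    return X, y


def fit_fixed_epochs(X, y, n_epochs=1000):
    """Batch fit for a fixed number of epochs.

    tol=None disables the tolerance-based stopping criterion, so fit()
    makes exactly `n_epochs` passes over the data.
    """
    clf = SGDClassifier(loss=LOG_LOSS, max_iter=n_epochs, tol=None,
                        random_state=0)
    clf.fit(X, y)
    return clf


def partial_fit_epochs(X, y, n_epochs=50):
    """Online variant: each partial_fit call is one pass over the batch."""
    clf = SGDClassifier(loss=LOG_LOSS, random_state=0)
    classes = np.unique(y)  # the first partial_fit call needs all classes
    for _ in range(n_epochs):
        clf.partial_fit(X, y, classes=classes)
    return clf


if __name__ == "__main__":
    X, y = make_toy_data()
    batch_clf = fit_fixed_epochs(X, y, n_epochs=100)
    print("epochs run by fit:", batch_clf.n_iter_)
    online_clf = partial_fit_epochs(X, y, n_epochs=50)
    print("online training accuracy:", online_clf.score(X, y))
```

In a real online app the loop in `partial_fit_epochs` would presumably receive a new mini-batch on each call rather than replaying the same data.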