[MRG+1] Fix MultinomialNB and BernoulliNB alpha=0 bug (continuation)#9131
[MRG+1] Fix MultinomialNB and BernoulliNB alpha=0 bug (continuation)#9131jmschrei merged 17 commits intoscikit-learn:masterfrom
Conversation
…into fix_NB_5814
jnothman
left a comment
There was a problem hiding this comment.
Thanks. Please wrap those fit calls in assert_warns in tests to ensure the warning is triggered.
We should also have a small test for the ValueError case, using assert_raise_messsage or similar.
Thanks for taking this on.
|
Thanks for reviewing! Tests are done. |
sklearn/tests/test_naive_bayes.py
Outdated
| # Test sparse X | ||
| X = scipy.sparse.csr_matrix(X) | ||
| nb = BernoulliNB(alpha=0.) | ||
| nb.fit(X, y) |
There was a problem hiding this comment.
This will also raise the warning which we would rather not see in tests. Either assert or ignore.
sklearn/tests/test_naive_bayes.py
Outdated
| y = np.array([0, 1]) | ||
| b_nb = BernoulliNB(alpha=-0.1) | ||
| m_nb = MultinomialNB(alpha=-0.1) | ||
| assert_raises(ValueError, b_nb.fit, X, y) |
There was a problem hiding this comment.
Better if this tests the error message with the assert_raise_* variants
|
Please add an entry to the change log at |
sklearn/naive_bayes.py
Outdated
|
|
||
| def _check_alpha(self): | ||
| if self.alpha < 0: | ||
| raise ValueError('Smoothing parameter alpha = %e. ' |
There was a problem hiding this comment.
We can now see that %e is clearly a bad pick. %.1e would be better and just as informative
There was a problem hiding this comment.
Very good point! done
jnothman
left a comment
There was a problem hiding this comment.
Otherwise, and assuming tests pass, LGTM
sklearn/naive_bayes.py
Outdated
| def _check_alpha(self): | ||
| if self.alpha < 0: | ||
| raise ValueError('Smoothing parameter alpha = %.1e. ' | ||
| 'alpha must be >= 0!' % self.alpha) |
There was a problem hiding this comment.
Just noticed this says >= which is a little awkward we warn when it's 0 and change the value
There was a problem hiding this comment.
Perhaps alternate language is "alpha should be > 0", as opposed to 'must be'?
There was a problem hiding this comment.
thanks, it's already fixed
| else: | ||
| self.class_log_prior_ = np.zeros(n_classes) - np.log(n_classes) | ||
|
|
||
| def _check_alpha(self): |
There was a problem hiding this comment.
Maybe you went through this before, but why isn't alpha just initially set to an appropriate value and then used as before, instead of changing the code a lot as below?
|
Modifying attributes didn't pay nice with our set_params API
…On 16 Jun 2017 10:36 am, "Jacob Schreiber" ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In sklearn/naive_bayes.py
<#9131 (comment)>
:
> @@ -460,6 +463,16 @@ def _update_class_log_prior(self, class_prior=None):
else:
self.class_log_prior_ = np.zeros(n_classes) - np.log(n_classes)
+ def _check_alpha(self):
Maybe you went through this before, but why isn't alpha just initially set
to an appropriate value and then used as before, instead of changing the
code a lot as below?
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#9131 (review)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEz6yspSztoncgMuuTDYOBIqpTi5Oe-ks5sEc31gaJpZM4N7P0O>
.
|
|
It wouldn't be modifying attributes, it would be checking alpha when passed into init and setting it appropriately initially. |
|
Not sure to understand your point but if we move the checking at clf = BernoulliNB(alpha=0.1)
clf.fit(X, y)
params={'alpha': 0.0}
clf.set_params(**params)
clf.fit(X, y) # Will crash |
|
This looks good to me. Thanks for the contribution! |
…cikit-learn#9131) * Fix scikit-learn#5814 * Fix pep8 in naive_bayes.py:716 * Fix sparse matrix incompatibility * Fix python 2.7 problem in test_naive_bayes * Make sure the values are probabilities before log transform * Improve docstring of `_safe_logprob` * Clip alpha solution * Clip alpha solution * Clip alpha in fit and partial_fit * Add what's new entry * Add test * Remove .project * Replace assert method * Update what's new * Format float into %.1e * Update ValueError msg
…cikit-learn#9131) * Fix scikit-learn#5814 * Fix pep8 in naive_bayes.py:716 * Fix sparse matrix incompatibility * Fix python 2.7 problem in test_naive_bayes * Make sure the values are probabilities before log transform * Improve docstring of `_safe_logprob` * Clip alpha solution * Clip alpha solution * Clip alpha in fit and partial_fit * Add what's new entry * Add test * Remove .project * Replace assert method * Update what's new * Format float into %.1e * Update ValueError msg
…cikit-learn#9131) * Fix scikit-learn#5814 * Fix pep8 in naive_bayes.py:716 * Fix sparse matrix incompatibility * Fix python 2.7 problem in test_naive_bayes * Make sure the values are probabilities before log transform * Improve docstring of `_safe_logprob` * Clip alpha solution * Clip alpha solution * Clip alpha in fit and partial_fit * Add what's new entry * Add test * Remove .project * Replace assert method * Update what's new * Format float into %.1e * Update ValueError msg
…cikit-learn#9131) * Fix scikit-learn#5814 * Fix pep8 in naive_bayes.py:716 * Fix sparse matrix incompatibility * Fix python 2.7 problem in test_naive_bayes * Make sure the values are probabilities before log transform * Improve docstring of `_safe_logprob` * Clip alpha solution * Clip alpha solution * Clip alpha in fit and partial_fit * Add what's new entry * Add test * Remove .project * Replace assert method * Update what's new * Format float into %.1e * Update ValueError msg
…cikit-learn#9131) * Fix scikit-learn#5814 * Fix pep8 in naive_bayes.py:716 * Fix sparse matrix incompatibility * Fix python 2.7 problem in test_naive_bayes * Make sure the values are probabilities before log transform * Improve docstring of `_safe_logprob` * Clip alpha solution * Clip alpha solution * Clip alpha in fit and partial_fit * Add what's new entry * Add test * Remove .project * Replace assert method * Update what's new * Format float into %.1e * Update ValueError msg
…cikit-learn#9131) * Fix scikit-learn#5814 * Fix pep8 in naive_bayes.py:716 * Fix sparse matrix incompatibility * Fix python 2.7 problem in test_naive_bayes * Make sure the values are probabilities before log transform * Improve docstring of `_safe_logprob` * Clip alpha solution * Clip alpha solution * Clip alpha in fit and partial_fit * Add what's new entry * Add test * Remove .project * Replace assert method * Update what's new * Format float into %.1e * Update ValueError msg
…cikit-learn#9131) * Fix scikit-learn#5814 * Fix pep8 in naive_bayes.py:716 * Fix sparse matrix incompatibility * Fix python 2.7 problem in test_naive_bayes * Make sure the values are probabilities before log transform * Improve docstring of `_safe_logprob` * Clip alpha solution * Clip alpha solution * Clip alpha in fit and partial_fit * Add what's new entry * Add test * Remove .project * Replace assert method * Update what's new * Format float into %.1e * Update ValueError msg
Reference Issue
Fixes #5814, continuation of #7477
What does this implement/fix? Explain your changes.
Move alpha setting into
fitandpartial_fit