[MRG+1] Fix MultinomialNB and BernoulliNB alpha=0 bug (continuation)#9131

Merged
jmschrei merged 17 commits into scikit-learn:master from herilalaina:fix_NB_5814
Jun 19, 2017

Conversation

@herilalaina
Contributor

Reference Issue

Fixes #5814, continuation of #7477

What does this implement/fix? Explain your changes.

Move alpha setting into fit and partial_fit

@herilalaina herilalaina changed the title [WIP] Fix MultinomialNB and BernoulliNB alpha=0 bug (continuation) [MRG] Fix MultinomialNB and BernoulliNB alpha=0 bug (continuation) Jun 15, 2017
Member

@jnothman jnothman left a comment

Thanks. Please wrap those fit calls in assert_warns in tests to ensure the warning is triggered.

We should also have a small test for the ValueError case, using assert_raise_message or similar.

Thanks for taking this on.
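What "wrap those fit calls in assert_warns" means in practice, sketched with only the stdlib (`fit` here is a hypothetical stand-in for the estimator's method, and `assert_warns` is a minimal re-implementation of the scikit-learn testing helper):

```python
import warnings


def fit(alpha):
    # stand-in for nb.fit(X, y): warns when alpha is clipped
    if alpha < 1e-10:
        warnings.warn('alpha too small, clipping')
    return max(alpha, 1e-10)


def assert_warns(warning_class, func, *args, **kwargs):
    """Minimal re-implementation of the helper: fail unless func warns."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = func(*args, **kwargs)
    if not any(issubclass(w.category, warning_class) for w in caught):
        raise AssertionError('expected %s to be raised' % warning_class.__name__)
    return result


# the fit call wrapped so the warning is asserted, not merely emitted
assert_warns(UserWarning, fit, 0.0)
```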

@herilalaina
Contributor Author

Thanks for reviewing! Tests are done.

Member

@jnothman jnothman left a comment

Thanks for the quick work

# Test sparse X
X = scipy.sparse.csr_matrix(X)
nb = BernoulliNB(alpha=0.)
nb.fit(X, y)
Member

This will also raise the warning which we would rather not see in tests. Either assert or ignore.
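The "ignore" option can be done with a `warnings` filter around the call; a sketch, where `noisy_fit` is a stand-in for the sparse-input `nb.fit(X, y)`:

```python
import warnings


def noisy_fit():
    # stand-in for nb.fit(X, y) with alpha=0., which emits the clipping warning
    warnings.warn('alpha too small, clipping')
    return 'fitted'


with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # suppress the expected alpha warning in the test
    result = noisy_fit()
```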

y = np.array([0, 1])
b_nb = BernoulliNB(alpha=-0.1)
m_nb = MultinomialNB(alpha=-0.1)
assert_raises(ValueError, b_nb.fit, X, y)
Member

Better if this tests the error message with the assert_raise_* variants
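Asserting on the message, not just the exception type, could look like this; `assert_raise_message` here is a minimal stand-in for scikit-learn's helper of the same name, and `fit_with_negative_alpha` is a hypothetical function that fails the way `b_nb.fit` would:

```python
def assert_raise_message(exc_type, message, func, *args):
    """Fail unless func raises exc_type with `message` in its text."""
    try:
        func(*args)
    except exc_type as e:
        if message not in str(e):
            raise AssertionError('message %r not found in %r' % (message, str(e)))
    else:
        raise AssertionError('%s not raised' % exc_type.__name__)


def fit_with_negative_alpha():
    # stand-in for b_nb.fit(X, y) with alpha=-0.1
    raise ValueError('Smoothing parameter alpha = -1.0e-01. alpha should be > 0.')


assert_raise_message(ValueError, 'alpha should be > 0', fit_with_negative_alpha)
```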

@jnothman
Member

Please add an entry to the change log at doc/whats_new.rst.


def _check_alpha(self):
if self.alpha < 0:
raise ValueError('Smoothing parameter alpha = %e. '
Member

We can now see that %e is clearly a bad pick. %.1e would be better and just as informative
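The difference is easy to demonstrate: `%e` pads the mantissa to six digits, while `%.1e` keeps one digit and conveys the same information:

```python
alpha = -0.1
# '%e' pads to six fractional digits with no extra information
assert '%e' % alpha == '-1.000000e-01'
# '%.1e' keeps one fractional digit and is just as informative
assert '%.1e' % alpha == '-1.0e-01'
```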

Contributor Author

Very good point! done

Member

@jnothman jnothman left a comment

Otherwise, and assuming tests pass, LGTM

def _check_alpha(self):
if self.alpha < 0:
raise ValueError('Smoothing parameter alpha = %.1e. '
'alpha must be >= 0!' % self.alpha)
Member

Just noticed this says >=, which is a little awkward given that we warn when alpha is 0 and change the value.

Member

Perhaps alternate language is "alpha should be > 0", as opposed to 'must be'?

Contributor Author

thanks, it's already fixed

else:
self.class_log_prior_ = np.zeros(n_classes) - np.log(n_classes)

def _check_alpha(self):
Member

Maybe you went through this before, but why isn't alpha just initially set to an appropriate value and then used as before, instead of changing the code a lot as below?

@jnothman jnothman changed the title [MRG] Fix MultinomialNB and BernoulliNB alpha=0 bug (continuation) [MRG+1] Fix MultinomialNB and BernoulliNB alpha=0 bug (continuation) Jun 16, 2017
@jnothman
Member

jnothman commented Jun 16, 2017 via email

@jmschrei
Member

It wouldn't be modifying attributes, it would be checking alpha when passed into init and setting it appropriately initially.

@herilalaina
Contributor Author

Not sure I understand your point, but if we move the check into __init__, the following script won't work:

clf = BernoulliNB(alpha=0.1)
clf.fit(X, y)
params = {'alpha': 0.0}
clf.set_params(**params)
clf.fit(X, y) # Will crash
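This follows the scikit-learn convention that __init__ only stores parameters and validation happens in fit, so set_params can change them freely. A toy sketch of the pattern (not the library's actual class; `effective_alpha_` is a made-up attribute for illustration):

```python
import warnings


class ToyNB:
    _ALPHA_MIN = 1e-10  # assumed floor, mirroring the PR's clipping

    def __init__(self, alpha=1.0):
        # store only; no validation here, per scikit-learn convention
        self.alpha = alpha

    def set_params(self, **params):
        for key, value in params.items():
            setattr(self, key, value)
        return self

    def fit(self, X, y):
        # validation happens here, so it also covers values set via set_params
        alpha = self.alpha
        if alpha < 0:
            raise ValueError('Smoothing parameter alpha = %.1e. '
                             'alpha should be > 0.' % alpha)
        if alpha < self._ALPHA_MIN:
            warnings.warn('alpha too small, clipping to %.1e' % self._ALPHA_MIN)
            alpha = self._ALPHA_MIN
        self.effective_alpha_ = alpha  # hypothetical attribute for illustration
        return self


clf = ToyNB(alpha=0.1).fit(None, None)
clf.set_params(alpha=0.0)
clf.fit(None, None)  # no crash: alpha is re-checked and clipped here
```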

@jnothman jnothman added this to the 0.19 milestone Jun 17, 2017
@jmschrei
Member

This looks good to me. Thanks for the contribution!

@jmschrei jmschrei merged commit b4b5de8 into scikit-learn:master Jun 19, 2017
@herilalaina herilalaina deleted the fix_NB_5814 branch June 19, 2017 19:36
dmohns pushed a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017
Fix MultinomialNB and BernoulliNB alpha=0 bug (continuation) (scikit-learn#9131)

* Fix scikit-learn#5814

* Fix pep8 in naive_bayes.py:716

* Fix sparse matrix incompatibility

* Fix python 2.7 problem in test_naive_bayes

* Make sure the values are probabilities before log transform

* Improve docstring of `_safe_logprob`

* Clip alpha solution

* Clip alpha solution

* Clip alpha in fit and partial_fit

* Add what's new entry

* Add test

* Remove .project

* Replace assert method

* Update what's new

* Format float into %.1e

* Update ValueError msg
NelleV pushed a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017
paulha pushed a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017
AishwaryaRK pushed a commit to AishwaryaRK/scikit-learn that referenced this pull request Aug 29, 2017
maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017

Development

Successfully merging this pull request may close these issues.

Multinomial Bayes issue

4 participants