MNT fix test for binary crossentropy in hist-GBDT #16691
NicolasHug merged 10 commits into scikit-learn:master from
Conversation
```python
# Subtract a constant term such that the binary cross entropy
# has its minimum at zero. This only works if 0 < y_true < 1.
return loss.pointwise_loss(y_true, x) \
    - loss.pointwise_loss(y_true, logit(y_true))
```
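The effect of the subtracted constant can be checked in isolation. A minimal sketch, assuming the usual binary cross entropy on raw scores, `log(1 + exp(x)) - y * x` (a toy `bce` helper for illustration, not the estimator's actual `pointwise_loss`):

```python
import numpy as np
from scipy.special import logit

def bce(y_true, x):
    # binary cross entropy on raw scores: log(1 + exp(x)) - y * x
    return np.logaddexp(0, x) - y_true * x

y_true = 0.3
x = np.linspace(-5, 5, 1001)
# shifting by the loss value at x = logit(y_true) moves the minimum to zero
shifted = bce(y_true, x) - bce(y_true, logit(y_true))
```

The minimum of `shifted` over the grid is (numerically) zero, attained near `x = logit(0.3) ≈ -0.85`; for `y_true` equal to exactly 0 or 1, `logit` is infinite and the trick no longer applies.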
I feel this is a bit complex.
Can we rather write

```python
if isinstance(loss, _LOSSES['binary_crossentropy']):
    assert_allclose(loss.pointwise_loss(y_true, optimum),
                    loss.pointwise_loss(y_true, logit(y_true)))
else:
    ...
```

in the test directly?
Good idea. Unfortunately, the newton root finder is allergic to such changes and complains with a "not converged" message, as it can't ever reach a true zero. I would have to switch from newton to a minimizer. I thought my change was the smaller one, and the advantage of newton is that it also checks, in some way, that the gradient and hessian are correct.
What do you think?
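The gradient-and-hessian check that newton gives for free can be sketched on a toy loss (a hypothetical example, not the PR's test code; `scipy.optimize.newton` uses Halley's method when `fprime2` is supplied):

```python
from scipy.optimize import newton

# Toy loss 0.5 * (x - 3)**2: find its minimum by root-finding on the
# gradient. fprime is the hessian of the loss (derivative of grad) and
# fprime2 its third derivative -- wrong derivatives would break or slow
# convergence, so passing them is an implicit correctness check.
grad = lambda x: x - 3.0
hess = lambda x: 1.0
third = lambda x: 0.0
opt = newton(grad, x0=0.0, fprime=hess, fprime2=third)
```

Here `opt` converges to the minimizer `x = 3`.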
```python
# The argmin of binary_crossentropy for y_true=±1 is ±inf due to logit,
# cf. "complete separation". Therefore we use 0 < y_true < 1.
```
You mean 0 / 1 instead of -1 / 1?
Also, why did it work with y_true == 1 then?
Yes, I meant y = 0 gives -inf, y = 1 gives +inf.
The reason it worked for y_true = 1 so far was ... luck with numerical precision.
I looked at the solution of newton:
- For `y_true = 1` I find `optimum = 35.47761173`. Then `pointwise_loss` calls `np.logaddexp(0, optimum) - optimum`, which gives `0` (true for all values of `optimum > 33.271...`).
- Then, for `y_true = 0`, the loss becomes `np.logaddexp(0, x)`, which equals zero for `x < -745.1...`, which the Newton/Halley root finder does not find.
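The float64 saturation described above is easy to reproduce in isolation (a standalone sketch, not the PR's test code):

```python
import numpy as np

# For large x, log(1 + exp(-x)) is smaller than half an ulp of x,
# so logaddexp(0, x) rounds to exactly x in float64.
assert np.logaddexp(0, 35.47761173) - 35.47761173 == 0.0

# For very negative x, exp(x) underflows to 0, so logaddexp(0, x)
# becomes exactly 0 -- a "false" zero the root finder never reaches.
assert np.logaddexp(0, -746.0) == 0.0
assert np.logaddexp(0, -700.0) > 0.0
```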
I just rebased. Merge conflicts resolved. CI lights are green.
PS: As you might note, this is a very subtle, small step towards thinking of logistic regression as a regression on the domain [0, 1] :smirk:
rth
left a comment
One question otherwise LGTM, thanks!
```python
assert np.allclose(loss.pointwise_loss(y_true, optimum), 0)
assert np.allclose(get_gradients(y_true, optimum), 0, atol=1e-7)
assert_allclose(loss.inverse_link_function(optimum)[0], y_true)
assert_allclose(func(optimum)[0], 0, atol=1e-14)
```
Just a question on a superficial read: doesn't `func` return a scalar (I'm a bit confused by the indexing)? And if it doesn't (and returns an ndarray, like `loss.pointwise_loss`), why does using the newton minimizer with it as an objective function work?
Maybe some light type annotations on `func` could have helped. mypy now runs as part of CI.
🤔 No, `func` returns an array with one element. I tried to return scalars everywhere, which fails badly because the methods of the loss classes often call `reshape(-1)` internally. (And I have not yet completely figured out why, but that is another story.)
I added a comment. I can also add type annotations. Can you provide a short example with ndarrays (of one element)?
Can you provide a short example with ndarrays (of one element)?

The typing for ndarray dimensions is not too standardized yet (https://github.com/numpy/numpy-stubs), but I think even

```python
def func(x: np.ndarray) -> np.ndarray:
```

would have been useful as opposed to

```python
def func(x: np.ndarray) -> float:
```

but anyway, it's just a side comment.
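A minimal, hypothetical illustration of the kind of light annotation being suggested (the body is illustrative, not the PR's code):

```python
import numpy as np

def func(x: np.ndarray) -> np.ndarray:
    # one-element array in, one-element array out; the annotation at
    # least documents that arrays, not scalars, flow through here
    return np.logaddexp(0, x)

out = func(np.array([0.0]))  # a shape-(1,) array, close to log(2)
```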
@rth Thanks for the approval and the annotation hint. I was able to make it work mostly with scalars. I hope it is now easier to read (and that your approval still holds; maybe the type annotations are now too much?)
Thanks! The current version looks good to me.
NicolasHug
left a comment
Thanks @lorentzenchr, a few comments regarding readability, but LGTM
```python
def fprime(x):
    return get_gradients(y_true, x)
```

changed to

```python
def fprime(x: float) -> float:
```
maybe add these as comments for now
I guess we need a more global decision on whether to use type annotations
I guess we need a more global decision on whether to use type annotations
I think the current situation is outlined in https://github.com/scikit-learn/scikit-learn/blame/fb76de72e7560aaa739872e94bee761777e54c0a/doc/developers/contributing.rst#L426
I did mention it in a dev meeting a month or two ago and didn't hear much objections. Again the idea is not to type everything, but now and then when it helps readability. I think we can keep these in, they can't hurt in any case.
Here in particular I find that without type annotations (and docstrings), it's very difficult to know what the expected input and output of

```python
def fprime(x):
    return get_gradients(y_true, x).item(0)
```

are.
Ok.
In this specific case, I think we should actually let numpy deal with arrays in these functions (comment just below), so we'd need to remove the annotations anyway
Also, the input is actually not a float, it's a 2d array (with only one element).
I really think we should just let numpy and scipy properly deal with broadcasting here, and only ravel the arrays at the very end.
The argument `x` of `def func(x)` is the `x0` of `@pytest.mark.parametrize('loss, x0...)` and is actually a float, not an ndarray. Same for `fprime` and `fprime2`.
AFAIU, the newton root finder prefers scalars, not ndarrays, as those have a different meaning.
I'll do whatever you prefer here. It's just not clear to me what that is 😏
AFAIU, the newton root finder does prefer scalars
That's true, but it does fine with arrays with a single element
We need to wrap x0 and y_true into arrays because that's what the loss methods expect. newton is OK with these. If we're going to recast these 1-element arrays into floats, it's cleaner and easier to do it once at the very end, i.e. by calling .ravel just before the assertions.
Alternatively, we can also go back to just using assert np.allclose. I know we sort of prefer assert_allclose, but at the end of the day, that change is unrelated to this PR.
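As a standalone sketch of the point above (a toy objective, not the PR's loss functions), SciPy's `newton` does accept a one-element array `x0` — it vectorizes over it since SciPy 1.2 — and returns an array that can be raveled once at the very end:

```python
import numpy as np
from scipy.optimize import newton

# newton vectorizes over an array x0: a one-element array in,
# a one-element array out (here solving x**2 - 2 = 0)
root = newton(lambda x: x**2 - 2.0, x0=np.array([1.0]))
value = root.ravel()[0]  # recast to a scalar once, at the very end
```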
```python
return loss.pointwise_loss(y_true, x)
```

```python
def fprime(x):
```

changed to

```python
def fprime(x: np.ndarray) -> np.ndarray:
```
I don't feel like we need the annotations anymore. Especially given that these expect arrays with a single element. Newton would break otherwise (I think). Maybe a comment is enough?
NicolasHug
left a comment
thanks @lorentzenchr , one last nit
I still think we don't need the annotations, but I'll merge tomorrow anyway
thanks @lorentzenchr!
What does this implement/fix? Explain your changes.
This fixes a test for the binary cross entropy loss function in the module for HistGradientBoostingClassifier.

Any other comments?
See added comments in commits.