[WIP] Brier score binless decomposition #22233
ColdTeapot273K wants to merge 4 commits into scikit-learn:main from
Conversation
Thanks for the PR. However, before reviewing it we should settle the discussion on the related issue: #21774 (comment)
I just tried using your implementation of
I've checked your example. TL;DR: it is working as intended, since this is an exact, binless implementation; but we can do something about it to give more flexibility without introducing bins. Two points:
e.g. this is how you get the maximum refinement loss:

```python
bs, cl, rl = brier_score_loss_decomposition(
    np.array([1, 0]).reshape(-1), np.array([0.5, 0.5]).reshape(-1)
)
# 0.25, 0.0, 0.25
```

The calibration loss here is 0, and that is an expected result. That is also how you can hack the calibration metric in general, by the way: just predict the class balance value (this is countered by considering a joint likelihood metric instead; see McElreath's "Statistical Rethinking", 2nd ed., paragraph 7.2).

Now, this got me thinking that in practical applications we don't often consider numbers which differ only in very distant decimals (e.g. 0.XXXXXX3 and 0.XXXXXX4) as too different. So we might relax the constraint on the 'exactness' of this implementation by introducing some absolute/relative tolerance.
Tolerance in this case would act as an alternative parameter to binning, controlling the degree of "exactness" of this exact implementation.
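To make the tolerance idea concrete, here is a minimal sketch of a two-component (calibration + refinement) Brier decomposition that groups samples by predicted probability rather than by histogram bins. The function name, the `atol` quantization rule, and the grouping strategy are illustrative assumptions for discussion, not the PR's actual implementation:

```python
import numpy as np

def brier_decomposition_binless(y_true, y_prob, atol=0.0):
    """Sketch of a binless two-component Brier decomposition.

    Samples are grouped by predicted probability: exactly when
    atol == 0, or on a grid of step ``atol`` otherwise (a hypothetical
    grouping rule used here only to illustrate the tolerance idea).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    bs = np.mean((y_prob - y_true) ** 2)

    if atol > 0:
        keys = np.round(y_prob / atol)  # merge predictions closer than atol
    else:
        keys = y_prob                   # exact, binless grouping

    cal = ref = 0.0
    n = y_true.shape[0]
    for k in np.unique(keys):
        mask = keys == k
        w = mask.sum() / n              # group weight
        p = y_prob[mask].mean()         # representative prediction of group
        obs = y_true[mask].mean()       # empirical event frequency in group
        cal += w * (p - obs) ** 2       # calibration (reliability) term
        ref += w * obs * (1.0 - obs)    # refinement term
    return bs, cal, ref

# The degenerate example from above: predicting the class balance.
bs, cl, rl = brier_decomposition_binless([1, 0], [0.5, 0.5])
# bs == 0.25, cl == 0.0, rl == 0.25: zero calibration loss, as discussed
```

With `atol=0` the identity `bs == cal + ref` holds exactly for binary targets, since within each group the predictions are identical; with a positive tolerance the identity holds only up to the within-group spread of the predictions, which is exactly the trade-off the tolerance parameter would introduce.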
I would much prefer a more general solution to score decompositions, as proposed in #23767.
Reference Issues/PRs
Closes #21774. See also #18268 and #21718.
What does this implement/fix? Explain your changes.
Described in #21774
Any other comments?