Skip to content

[MRG+2] make sure that AUC is always a float#6225

Merged
GaelVaroquaux merged 3 commits intoscikit-learn:masterfrom
ogrisel:fix-6147-scoring-memmap
Jan 28, 2016
Merged

[MRG+2] make sure that AUC is always a float#6225
GaelVaroquaux merged 3 commits intoscikit-learn:masterfrom
ogrisel:fix-6147-scoring-memmap

Conversation

@ogrisel
Copy link
Copy Markdown
Member

@ogrisel ogrisel commented Jan 25, 2016

Even when computed on memory mapped data as reported in #6147.

Let's make sure that the new tests pass under AppVeyor before merging as deletion of mmap files tends to be troublesome under Windows.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait, why is this needed here?
Or rather: why is this only needed here, and not in all scorers?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The memmap.sum() used internally in trapz does not convert to scalar float (it returns a 0d memmap singleton) while it does convert to a float scaler for regular numpy arrays. Other scorers do not have this issue as highlighted by the test.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we report this to the NumPy guys?

@ogrisel
Copy link
Copy Markdown
Member Author

ogrisel commented Jan 26, 2016

I have introduced new UndefinedMetricWarning: in those new tests. Let me fix that.

@ogrisel ogrisel force-pushed the fix-6147-scoring-memmap branch from f40e1ce to a9ad70c Compare January 26, 2016 07:46
@ogrisel
Copy link
Copy Markdown
Member Author

ogrisel commented Jan 26, 2016

Ready for merge on my end.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't we always convert to float? Right now we'll have numpy floats or floats depending on the parallel computing setting, which might surprise people (different set of methods on the different objects).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure that's the best way to do it but I found:

area = area.dtype.type(area)

or

area = area[()]

to return a numpy float rather than a Python float.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I could do:

if isinstance(area, np.memmap):
    area = area.dtype.type(area)

This way the behavior stays consistent for ndarray and memmap instances whatever the dtype.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok this is what @lesteve just suggested.

@GaelVaroquaux GaelVaroquaux changed the title [MRG] make sure that AUC is always a float [MRG+1] make sure that AUC is always a float Jan 26, 2016
@GaelVaroquaux
Copy link
Copy Markdown
Member

Aside from the one comment, +1 for merge.

@ogrisel ogrisel force-pushed the fix-6147-scoring-memmap branch from d81dc43 to a13460c Compare January 27, 2016 09:32
@ogrisel
Copy link
Copy Markdown
Member Author

ogrisel commented Jan 27, 2016

Ok I addressed @GaelVaroquaux's comment. @amueller merge?

@ogrisel ogrisel force-pushed the fix-6147-scoring-memmap branch from a13460c to 3427d20 Compare January 28, 2016 13:26
@ogrisel
Copy link
Copy Markdown
Member Author

ogrisel commented Jan 28, 2016

Rebased to resolve changelog conflicts. @MechCoder are you still around for a second review?

@MechCoder
Copy link
Copy Markdown
Member

lgmt as well !

@MechCoder MechCoder changed the title [MRG+1] make sure that AUC is always a float [MRG+2] make sure that AUC is always a float Jan 28, 2016
GaelVaroquaux added a commit that referenced this pull request Jan 28, 2016
[MRG+2] make sure that AUC is always a float
@GaelVaroquaux GaelVaroquaux merged commit 1b4a9f2 into scikit-learn:master Jan 28, 2016
@ogrisel
Copy link
Copy Markdown
Member Author

ogrisel commented Jan 29, 2016

Backported to 0.17.X as 613f1ad 497ddf3 and 3bfbdb0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants