
[MRG] faster, flatter precision_recall_fscore_support#2278

Merged
ogrisel merged 8 commits into scikit-learn:master from jnothman:prf_rewrite3
Sep 11, 2013

Conversation

@jnothman
Member

  • Rewritten for efficiency and neatness. Note that handling the multilabel sequence-of-sequences format by binarizing is in response to benchmarks: under the simple benchmarks at https://gist.github.com/jnothman/5734967, this new version is about 1.3x faster than master for multiclass and label-indicator input, and about 2x faster for sequences of sequences.
  • Also fixed some warning handling in tests.
  • Also reverted the recent reimplementation of the Matthews correlation coefficient, using LabelEncoder to handle non-int labels instead.
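
The revert-plus-LabelEncoder approach can be sketched roughly as follows. This is a sketch, not the PR's actual code: here `np.unique(..., return_inverse=True)` plays the role of LabelEncoder, mapping arbitrary (e.g. string) labels to integer codes before handing them to `np.corrcoef`.

```python
import numpy as np

def matthews_corrcoef_sketch(y_true, y_pred):
    """Sketch of MCC over arbitrary label types (illustrative only)."""
    # Encode labels as integers, mirroring LabelEncoder's fit/transform.
    labels, encoded = np.unique(np.concatenate([y_true, y_pred]),
                                return_inverse=True)
    n = len(y_true)
    y_true_enc, y_pred_enc = encoded[:n], encoded[n:]
    # corrcoef divides by the standard deviations; when either vector is
    # constant this yields nan, which is mapped to the conventional 0.0.
    with np.errstate(divide='ignore', invalid='ignore'):
        mcc = np.corrcoef(y_true_enc, y_pred_enc)[0, 1]
    return 0.0 if np.isnan(mcc) else float(mcc)
```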

@arjoly
Member

arjoly commented Aug 1, 2013

Thanks @jnothman :-)

Can you rebase on top of master?

@jnothman
Member Author

jnothman commented Aug 1, 2013

Sure -- the only conflict was the module's lengthy author list and imports.

Member

Apparently, I left "example-based" instead of "sample-based".

Member Author

Indeed, and I've missed it multiple times since!

Member

thanks !

@arjoly
Member

arjoly commented Aug 1, 2013

Why have you removed the improvements of the mcc metric?

@arjoly
Member

arjoly commented Aug 1, 2013

Awesome refactoring!!!

@jnothman
Member Author

jnothman commented Aug 1, 2013

Why have you removed the improvements of the mcc metric?

Uhh... apparently a git-user fail. I think I rebased an old version of my branch. I might need to go searching for the right one.

@jnothman
Member Author

jnothman commented Aug 1, 2013

Thank you Travis for recording my previous branch head (b81a68f6296753c52f45b0ebbc0b8b466f84b871).

@jnothman
Member Author

jnothman commented Aug 1, 2013

Re-rebased with your comments (except for moving the elif case) addressed.

Member Author

The ignore_warnings decorator I introduce in #2334 would be better applied here, but I don't really have the time to fix up either PR atm.

Member

Fine !

@arjoly
Member

arjoly commented Aug 1, 2013

Travis is not happy :-/

@jnothman
Member Author

jnothman commented Aug 1, 2013

Travis is not happy :-/

Then nor am I. It's actually a worry, because the error is in matthews_corrcoef, unchanged since https://travis-ci.org/scikit-learn/scikit-learn/builds/9551448, which suggests a nasty interaction with the changes since in prf, e.g. use of np.errstate. Any ideas?

@arjoly
Member

arjoly commented Aug 1, 2013

I am able to reproduce the error on my laptop.
By using np.errstate, the error is solved:

    with np.errstate(divide='ignore', invalid='ignore'):
        mcc = np.corrcoef(y_true, y_pred)[0, 1]

Strange, the use of np.seterr and np.geterr seems to have some side effects.

@jnothman
Member Author

jnothman commented Aug 1, 2013

Is that correct behaviour, or is the warning meant to happen on nan?

@arjoly
Member

arjoly commented Aug 1, 2013

Given this test

    def test_matthews_corrcoef_nan():
        with warnings.catch_warnings():
            warnings.simplefilter("always")
            assert_equal(matthews_corrcoef([0], [1]), 0.0)
            warnings.simplefilter("error")
            assert_equal(matthews_corrcoef([0, 0], [0, 1]), 0.0)

I would say that we should raise a warning.

So in the end, this should be,

    with np.errstate(divide='warn', invalid='warn'):
        mcc = np.corrcoef(y_true, y_pred)[0, 1]

Member

You might want to indicate here that 'modifier' and 'average' are strings used for the warning. It is hard to guess from the name of the function and the argument names.

Member Author

I've done so.

@jnothman
Member Author

jnothman commented Aug 1, 2013

Apparently Travis doesn't like that suggestion either. The plot thickens. I'm open to suggestions.

@arjoly
Member

arjoly commented Aug 1, 2013

Strange, I thought this would work. Should we ignore the warning and raise our own when the denominator is 0?

@jnothman
Member Author

jnothman commented Aug 1, 2013

I think you misinterpreted the test code: it means the test should fail if
a warning is produced. I'm not sure why this error appears here and not in
an earlier build, but I think this commit should fix it.


@arjoly
Member

arjoly commented Aug 1, 2013

LGTM 👍, thanks a lot this is a lot better than the previous version!

@arjoly
Member

arjoly commented Aug 13, 2013

A last review?

@arjoly
Member

arjoly commented Aug 21, 2013

(Ping some people randomly: @ogrisel, @glouppe, @amueller)

Member

Could you please add an inline comment to explain when the invalid errstate could be triggered and why it's safe to ignore it in the specific case of matthews_corrcoef computation?

Member Author

I don't know anything about it. This simply reverts the code to an implementation before it was adapted to handle string input, with a LabelEncoder thrown on top.

Member

Alright then.

@ogrisel
Member

ogrisel commented Aug 21, 2013

Apart from my warning class comment, the code looks simple enough and I trust the test suite and your bench results. +1 for merging.

Member

Actually I don't understand how this format string could possibly work. Could you please add a test that catches this specific warning and check the actual message content?

Member Author

Okay. The tricky bit is the {{}}, which format expands to {}, to be substituted with "due to" or "in labels with" or "in samples with".

You're telling me I shouldn't look for jobs in NLG? :p
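
Concretely, the two-stage trick can be illustrated like this (the message text here is made up for illustration, not the PR's exact wording):

```python
# A doubled brace '{{}}' survives the first .format() call as a literal
# '{}', leaving a placeholder for a second, later substitution.
template = '{metric} is ill-defined and being set to 0.0 {{}} no true samples.'
stage_one = template.format(metric='Recall')
# stage_one now contains a plain '{}' where '{{}}' used to be.
message = stage_one.format('in labels with')
```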

Member

Alright I got it after writing the comment and re-reading the end of the function. Still an explicit test that checks the message would be very welcomed.

Member

This comment about adding a check for the message content has not been addressed, or has it?

Member

Sorry it has been in 0ea0aedef72306260097abff6ed786dedae1794b. I wonder how I missed it...

@jnothman
Member Author

Rebased on master, and addressed @arjoly's comment on label_binarize and @ogrisel's comments on the warning class and testing its message.

@ogrisel
Member

ogrisel commented Sep 10, 2013

The travis failure looks real:

======================================================================
ERROR: sklearn.metrics.tests.test_metrics.test_prf_warnings
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/virtualenv/python2.7_with_system_site_packages/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/build/scikit-learn/scikit-learn/sklearn/metrics/tests/test_metrics.py", line 1786, in test_prf_warnings
    assert_equal(str(record.pop().message),
IndexError: pop from empty list
----------------------------------------------------------------------

@ogrisel
Member

ogrisel commented Sep 10, 2013

I cannot reproduce the test failure on my box either. The assert_warns statements from the previous test do not fail though. Maybe assert_warns could be extended to add an optional expected message option.
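
Such an extension might look like the following. This is a hypothetical helper (name and signature invented here), not code from the PR:

```python
import warnings

def assert_warns_message(warning_class, message, func, *args, **kwargs):
    # Like assert_warns, but additionally checks that some warning of the
    # given class contains the expected message substring.
    with warnings.catch_warnings(record=True) as records:
        warnings.simplefilter('always')
        result = func(*args, **kwargs)
    matches = [r for r in records if issubclass(r.category, warning_class)]
    if not matches:
        raise AssertionError('%s not raised' % warning_class.__name__)
    if not any(message in str(r.message) for r in matches):
        raise AssertionError('no %s matched %r'
                             % (warning_class.__name__, message))
    return result
```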

Member

Could we use a numpy or Python warning instead?

Member

Which one? It's better to have specific warning classes to allow the user to silence specific warning categories.

Member

Depending on the situation, we could use either the divide-by-zero warning or the invalid-value warning.

Member Author

Certainly, we could inherit from one of those.

Member

+1 for keeping a specific class, optionally inheriting from the one that makes most sense.

Member Author

I'm happy to consider inheriting from RuntimeWarning rather than UserWarning, since it is intended for "dubious runtime behaviour". Yet we use UserWarnings everywhere else in sklearn, despite the docstring "Base class for warnings generated by user code."

The numpy warnings aren't relevant, IMO. This isn't some arbitrary numerical issue, it's about implementation details of metrics where the input isn't within the metric's natural domain, but the function must return some value, and there are multiple options as to what value.

So, what do you suggest?

Member

I don't have any strong opinion. I want to avoid adding a new warning class if one of the standard library does the job.

I'm happy to consider inheriting from RuntimeWarning rather than UserWarning, since it is intended for "dubious runtime behaviour". Yet we use UserWarnings everywhere else in sklearn, despite the docstring "Base class for warnings generated by user code."

RuntimeWarning looks appropriate (Base class for warnings about dubious runtime behavior.).

Member Author

I want to avoid adding a new warning class if one of the standard library does the job.

In terms of warnings, I understand "does the job" as: where a user might want to silence a particular warning or turn it into an exception, it should be easy, unambiguous and version-invariant.

This means it should have a specific class to distinguish it from other warnings in the module if that seems appropriate. @ogrisel suggested it might be in this case. Backwards-compatibility would have me extend from UserWarning.

Member

Alright. I think we can keep UserWarning for now. If we ever devise an official inheritance tree for sklearn warnings we might want to revise that choice later.
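
The agreed-upon approach can be sketched as follows; the class name here is illustrative, not necessarily what the PR used:

```python
import warnings

# A dedicated warning class inheriting from UserWarning, so users can
# silence or escalate this specific category without affecting other
# UserWarnings.
class UndefinedMetricWarning(UserWarning):
    pass

# Users can then filter just this category:
warnings.filterwarnings('ignore', category=UndefinedMetricWarning)
```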

@jnothman
Member Author

The travis failure must be yet another unexpected warning interaction. I'll try to sleuth it out.

@ogrisel
Member

ogrisel commented Sep 10, 2013

The travis failure must be yet another unexpected warning interaction. I'll try to sleuth it out.

Maybe duplicated warnings inside the same warning.catch block are collected only once under certain circumstances?

@jnothman
Member Author

No, that's not the issue. The issue is that the registry of past warnings that makes the 'once' filter work is buggy, though I've lost the Python bug tracker id for now.

I've traced the present problem down to the test_score_objects module since this causes the failure:

    nosetests sklearn/metrics/tests/test_score_objects.py sklearn.metrics.tests.test_metrics:test_prf_warnings
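
A common workaround for this kind of stale-registry interaction can be sketched like this (an illustrative helper, not necessarily what was committed here):

```python
import sys

def clean_warning_registry():
    # Clear every imported module's __warningregistry__ so that warnings
    # already seen under a "once"/"default" filter are raised again,
    # rather than being silently suppressed by stale registry state.
    for module in list(sys.modules.values()):
        registry = getattr(module, '__warningregistry__', None)
        if registry is not None:
            registry.clear()
```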

@jnothman
Member Author

Found that, and another one in the classification_report doctest. Let's see if Travis prefers this.

@ogrisel
Member

ogrisel commented Sep 10, 2013

Great, well done @jnothman!

@ogrisel
Member

ogrisel commented Sep 10, 2013

Once @arjoly's comment is addressed, +1 for merging.

Member

Can you use the format syntax?

Member Author

if I must!

Member

It's a nitpick. Do as you wish.

@arjoly
Member

arjoly commented Sep 11, 2013

LGTM +1 !

ogrisel added a commit that referenced this pull request Sep 11, 2013
[MRG] faster, flatter precision_recall_fscore_support
@ogrisel ogrisel merged commit 513b0f4 into scikit-learn:master Sep 11, 2013
@ogrisel
Member

ogrisel commented Sep 11, 2013

Merged! Thanks @jnothman

@jnothman
Member Author

Good to have this out the door! Thanks @arjoly and @ogrisel!

@jnothman jnothman deleted the prf_rewrite3 branch September 11, 2013 09:23
@arjoly
Member

arjoly commented Sep 11, 2013

Thanks @jnothman !!! 👍 🍻
