
FIX binary/multiclass jaccard_similarity_score and extend to handle averaging#13092

Closed
jnothman wants to merge 97 commits into scikit-learn:master from jnothman:jaccard-sim

Conversation

@jnothman (Member) commented Feb 4, 2019

Reference Issues/PRs

Fixes #7332. Supersedes #10083

What does this implement/fix? Explain your changes.

The current Jaccard implementation is ridiculous for binary and multiclass problems, returning accuracy. This makes Jaccard comparable to Precision, Recall and F-score, which are also fundamentally set-wise metrics.
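To make the distinction concrete, here is a minimal numpy sketch (not the PR's implementation; `jaccard_macro` is a hypothetical helper) of what a set-wise, macro-averaged Jaccard computes for multiclass input, and how it differs from plain accuracy:

```python
import numpy as np

def jaccard_macro(y_true, y_pred):
    """Macro-averaged Jaccard: per-class intersection-over-union,
    treating each class as the set of samples assigned to it."""
    scores = []
    for c in np.union1d(y_true, y_pred):
        intersection = np.sum((y_true == c) & (y_pred == c))
        union = np.sum((y_true == c) | (y_pred == c))
        scores.append(intersection / union if union else 0.0)
    return float(np.mean(scores))

y_true = np.array([0, 1, 2, 2])
y_pred = np.array([0, 2, 2, 2])
print(np.mean(y_true == y_pred))      # accuracy: 0.75
print(jaccard_macro(y_true, y_pred))  # macro Jaccard: ~0.556, not accuracy
```

Class 1 is never predicted (IoU 0) and class 2 has one false positive (IoU 2/3), so the macro score is (1 + 0 + 2/3)/3 ≈ 0.556, while accuracy ignores the per-class set structure.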

gxyd added 30 commits November 8, 2017 01:25
this deals with both multilabel and multiclass problems
labels and sample_weight seem to be working fine, though I haven't
fully tested them again; will do in the next commit
@jnothman jnothman changed the title FIX and extension of jaccard_similarity_score to handle multiclass/averaging properly FIX multiclass jaccard_similarity_score and extend to handle averaging Feb 5, 2019
@jnothman jnothman changed the title FIX multiclass jaccard_similarity_score and extend to handle averaging FIX binary/multiclass jaccard_similarity_score and extend to handle averaging Feb 5, 2019
@jnothman (Member, Author) commented Feb 5, 2019

We might also consider renaming jaccard_similarity_score to jaccard_score, deprecating the old one, and making average='binary' the default, as in precision/recall/F-score.
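As a rough sketch of what average='binary' semantics would give (a hypothetical standalone `jaccard_binary`, not the proposed implementation), the score is computed for the positive class only, as precision and recall do:

```python
import numpy as np

def jaccard_binary(y_true, y_pred, pos_label=1):
    # Jaccard for the positive class only: TP / (TP + FP + FN)
    t = np.asarray(y_true) == pos_label
    p = np.asarray(y_pred) == pos_label
    union = np.logical_or(t, p).sum()
    return float(np.logical_and(t, p).sum() / union) if union else 0.0

print(jaccard_binary([0, 1, 1, 0], [0, 1, 0, 1]))  # 1 TP, 1 FP, 1 FN -> 1/3
```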


- def jaccard_similarity_score(y_true, y_pred, normalize=True,
+ def jaccard_similarity_score(y_true, y_pred, labels=None, pos_label=1,
+                              average='samples', normalize='true-if-samples',
Member:

just a note that this is not backward compatible with users calling it with positional arguments [sigh]! But I'm not sure what we should do in these cases.

Member Author:

If we deprecate the current function and make jaccard_score that would solve it :)

labels are column indices. By default, all labels in ``y_true`` and
``y_pred`` are used in sorted order.

pos_label : str or int, 1 by default
Member:

-> (default=1)?

... # doctest: +ELLIPSIS
0.33...
>>> jaccard_similarity_score(y_true, y_pred, average='micro')
... # doctest: +ELLIPSIS
Member:

I think this is redundant, it's already set above (and it generates the odd empty ... line in the output).

Member Author:

I think these flags are per-statement, so I don't see how "it's already set above"

Member:

The scope of those flags is at least per-block. Example:

>>> cov.covariance_  # doctest: +ELLIPSIS
array([[0.7569..., 0.2818...],
       [0.2818..., 0.3928...]])
>>> cov.location_
array([0.0622..., 0.0193...])
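For what it's worth, inline `# doctest:` directives formally apply per-example in the doctest module; examples like the one above pass without per-statement directives when ELLIPSIS is enabled at the runner level, which is presumably what scikit-learn's test setup does. A minimal sketch of runner-level flags:

```python
import doctest

# A docstring with no per-statement "# doctest: +ELLIPSIS" directive.
docstring = """
>>> 1 / 3
0.333...
"""

# Enabling ELLIPSIS on the runner applies it to every example in the test.
parser = doctest.DocTestParser()
test = parser.get_doctest(docstring, {}, "ellipsis_demo", None, 0)
runner = doctest.DocTestRunner(optionflags=doctest.ELLIPSIS)
runner.run(test)
print(runner.failures, runner.tries)  # prints "0 1"
```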

def test_jaccard_similarity_score_validation():
    y_true = np.array([0, 1, 0, 1, 1])
    y_pred = np.array([0, 1, 0, 1, 1])
    assert_raise_message(ValueError, "pos_label=2 is not a valid label: "
Member:

pytest.raises?

Member Author:

why pytest.raises? For readability? I don't think the error message is any better with pytest.raises for instance.

Member:

aren't we gradually moving away from assert_raise_message and move to with pytest.raises(...)? At least that was my impression.
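The pytest idiom being suggested looks roughly like this (a sketch with a hypothetical validation function standing in for the code under review, not the PR's actual test); note that `match=` takes a regular expression, so literal messages should be escaped:

```python
import re
import pytest

def check_pos_label(pos_label, labels):
    # Hypothetical stand-in for the validation under discussion.
    if pos_label not in labels:
        raise ValueError("pos_label=%r is not a valid label: %r"
                         % (pos_label, labels))

def test_invalid_pos_label():
    # match= is searched as a regex against str(exception).
    with pytest.raises(ValueError, match=re.escape("pos_label=2")):
        check_pos_label(2, [0, 1])

test_invalid_pos_label()  # passes silently
```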

                         "classification.")
    assert_raise_message(ValueError, msg3, jaccard_similarity_score, y_true,
                         y_pred, average='samples')
    assert_raise_message(ValueError, msg3, jaccard_similarity_score, y_true,
Member:

duplicate of the above? seems like a copy/paste issue.

@jnothman (Member, Author) commented Feb 6, 2019

Thanks for the review @adrinjalali!

@adrinjalali (Member):

It's also probably a good idea to check if there need to be changes to examples/multioutput/plot_classifier_chain_yeast.py.

@qinhanmin2014 (Member):

We might also consider renaming jaccard_similarity_score to jaccard_score, deprecating the old one, and making average='binary' the default as in Precision-Recall-Fscore

I'll vote +1 for this solution.

And I don't understand why we need the normalize parameter.

        Calculate metrics for each instance, and find their average (only
        meaningful for multilabel classification).

    normalize : bool, optional (default=True)
Member:

the default is 'true-if-samples', not True

@adrinjalali (Member):

well, I guess #13151 is a better solution anyway :)

@qinhanmin2014 (Member):

I think there's enough consensus to close this one. Let's try to merge #13151.


Successfully merging this pull request may close these issues.

multiclass jaccard_similarity_score should not be equal to accuracy_score

4 participants