Description
The documentation for sklearn.metrics.jaccard_similarity_score currently (version 0.17.1) states that:
In binary and multiclass classification, this function is equivalent to the accuracy_score. It differs in the multilabel classification problem.
However, I do not think that this is the right thing to do for multiclass problems. As far as I can tell, within the machine learning community a more common usage of the Jaccard index for multiclass is the mean Jaccard index calculated for each class individually, i.e., first calculate the Jaccard index for class 0, class 1 and class 2, and then average them. This is what is very commonly done in the image segmentation community, where it is referred to as the "mean Intersection over Union" score (see e.g. [1]). As far as I can tell by skimming it, this is also what the original publication of the Jaccard index did in multiclass scenarios [2]. Note that this is NOT the same as the accuracy_score. Consider this example:
y_true = [0, 1, 2]
y_pred = [0, 0, 0]
The accuracy is clearly 1/3, and this is also what jaccard_similarity_score in sklearn currently returns. The class-specific Jaccard scores, however, would be:
J0 = 1/3
J1 = 0/1
J2 = 0/1
Thus IMO jaccard_similarity_score should return (J0 + J1 + J2) / 3 = 1/9 in this case.
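To make the proposed behaviour concrete, here is a minimal sketch (not the sklearn implementation; the helper name and signature are my own) that computes the per-class Jaccard indices and their mean with plain NumPy:

```python
import numpy as np

def mean_jaccard(y_true, y_pred, classes):
    """Per-class Jaccard index (|intersection| / |union| of the sets of
    samples labelled with each class) and their unweighted mean."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    scores = []
    for c in classes:
        true_c = y_true == c
        pred_c = y_pred == c
        intersection = np.logical_and(true_c, pred_c).sum()
        union = np.logical_or(true_c, pred_c).sum()
        # Convention choice: an empty union (class absent from both) scores 1.0
        scores.append(intersection / union if union else 1.0)
    return scores, float(np.mean(scores))

scores, mean = mean_jaccard([0, 1, 2], [0, 0, 0], classes=[0, 1, 2])
# scores -> [1/3, 0.0, 0.0], mean -> 1/9
```

This reproduces the hand calculation above: class 0 has intersection 1 and union 3, classes 1 and 2 each have intersection 0 and union 1, giving a mean of 1/9 rather than the accuracy of 1/3.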
[1] e.g. Long et al, "The Pascal Visual Object Classes Challenge – a Retrospective", https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf , but see any other paper on Semantic Segmentation
[2] Jaccard, "THE DISTRIBUTION OF THE FLORA IN THE ALPINE ZONE", http://onlinelibrary.wiley.com/doi/10.1111/j.1469-8137.1912.tb05611.x/abstract (Note that I have only skimmed the paper, but it seems to me that the author always reports the average of the "coefficient of community" calculated over pairs whenever more than just two groups are compared.)