Explanation of nu parameter in One-class SVM #3466
Description
The latest documentation for Outlier Detection (http://scikit-learn.org/stable/modules/outlier_detection.html) draws an important distinction between novelty detection and outlier detection: in novelty detection, "the training data is not polluted by outliers", while in outlier detection, "the training data contains outliers". In the example, One-class SVM is used to demonstrate novelty detection.
However, One-class SVM can still tolerate outliers in the training data. In particular, the parameter nu tunes an upper bound on the fraction of outliers in the training set (as explained in Proposition 4 of the original paper - Estimating the support of a high-dimensional distribution, by B. Schölkopf et al.).
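To illustrate the point, here is a small sketch (my own, not from the documentation) that fits `OneClassSVM` on training data and checks empirically that the fraction of training points flagged as outliers stays below nu, consistent with the bound from the paper. The data and kernel settings are arbitrary choices for illustration:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)
X = rng.randn(500, 2)  # synthetic training data for illustration

for nu in (0.05, 0.2, 0.5):
    clf = OneClassSVM(nu=nu, kernel="rbf", gamma=0.1).fit(X)
    pred = clf.predict(X)  # +1 for inliers, -1 for outliers
    frac_out = np.mean(pred == -1)
    # Per Proposition 4 of Schölkopf et al., the fraction of training
    # points treated as outliers is upper-bounded by nu (and nu is a
    # lower bound on the fraction of support vectors).
    print(f"nu={nu:.2f}  fraction flagged as outliers={frac_out:.3f}")
```

So even when the model is presented as a novelty detector, its training objective already budgets for a nu-fraction of polluted training points.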
So I think the distinction between outlier detection and novelty detection is not well illustrated in the current documentation (by the use of One-class SVM). In fact, I would argue we should not differentiate between the two cases at all.
Besides, the current explanation of the nu parameter (i.e., "The \nu parameter, also known as the margin of the One-Class SVM, corresponds to the probability of finding a new, but regular, observation outside the frontier.") should be rewritten based on the explanation from the original paper to make things clearer.