[MRG+2] clean outlier_detection.py#9018
Conversation
|
Open questions:
|
422ce84 to
4d7e6eb
Compare
|
added WIP in name :) |
|
yes thanks :) |
albertcthomas
left a comment
There was a problem hiding this comment.
A few corrections/suggestions.
I also did a bit of research to find out why the cubit root of the mahalanobis distance is returned when raw_values=False. Except this example stating visualization purpose I do not see why we need to return the cubic root.
In the end, I don't think we need this raw_values parameter and we should always return the mahalanobis distance without the cubic root. This would however require a deprecation warning... I will check that this does not break the cited example (if that's the case, maybe use the cubic root only for the example).
| @@ -55,7 +112,7 @@ def decision_function(self, X, raw_values=False): | |||
| decision : array-like, shape (n_samples, ) | |||
| The values of the decision function for each observations. | |||
|
|
||
| Parameters | ||
| ---------- | ||
| store_precision : bool |
There was a problem hiding this comment.
boolean, optional (default=True)
| store_precision : bool | ||
| Specify if the estimated precision is stored. | ||
|
|
||
| assume_centered : Boolean |
There was a problem hiding this comment.
boolean, optional (default=False)
| If False, the robust location and covariance are directly computed | ||
| with the FastMCD algorithm without additional treatment. | ||
|
|
||
| support_fraction : float, 0 < support_fraction < 1 |
There was a problem hiding this comment.
float, optional (default=None)
|
|
||
| support_fraction : float, 0 < support_fraction < 1 | ||
| The proportion of points to be included in the support of the raw | ||
| MCD estimate. Default is ``None``, which implies that the minimum |
There was a problem hiding this comment.
Should be in the interval (0,1).
| value of support_fraction will be used within the algorithm: | ||
| `[n_sample + n_features + 1] / 2`. | ||
|
|
||
| contamination : float, 0. < contamination < 0.5 |
There was a problem hiding this comment.
float, optional (default=0.1)
|
|
||
| contamination : float, 0. < contamination < 0.5 | ||
| The amount of contamination of the data set, i.e. the proportion | ||
| of outliers in the data set. |
There was a problem hiding this comment.
Should be in the interval (0, 0.5).
| self.contamination = contamination | ||
|
|
||
| def fit(self, X, y=None): | ||
| MinCovDet.fit(self, X) |
There was a problem hiding this comment.
super(EllipticEnveloppe, self).fit(X)
| def __init__(self, store_precision=True, assume_centered=False, | ||
| support_fraction=None, contamination=0.1, | ||
| random_state=None): | ||
| MinCovDet.__init__(self, store_precision=store_precision, |
There was a problem hiding this comment.
super(EllipticEnveloppe, self).fit(X)
| ------- | ||
| decision : array-like, shape (n_samples, ) | ||
| The values of the decision function for each observations. | ||
| The values of the decision function for each observation. |
There was a problem hiding this comment.
Decision function of the samples.
|
I also checked that this doesn't break the examples in plot_outlier_detection.py and plot_outlier_detection_housing.py Do we want to try to clarify our position on the |
|
See Comment in issue #4168 for explanations about the cubic root. See also Cross validated. |
|
Let's stick to cleaning the code in this PR, and think about API consistency in #9015 |
|
ping @vene or @agramfort for a review? |
|
re-ping @agramfort (small PR!) |
|
LGTM |
|
LGTM, merging, thanks a lot! |
Remove OutlierDetectionMixin, which was only used by by EllipticEnvelope
Remove OutlierDetectionMixin, which was only used by by EllipticEnvelope
Remove OutlierDetectionMixin, which was only used by by EllipticEnvelope
Remove OutlierDetectionMixin, which was only used by by EllipticEnvelope
Remove OutlierDetectionMixin, which was only used by by EllipticEnvelope
Remove OutlierDetectionMixin, which was only used by by EllipticEnvelope
Remove OutlierDetectionMixin, which was only used by by EllipticEnvelope
Remove OutlierDetectionMixin, which was only used by by EllipticEnvelope
Following up discussion in issue #8693
-remove OutlierDetectionMixin