DOC Update plot_mahalanobis_distances to notebook style #17089
glemaitre merged 11 commits into scikit-learn:master from
Conversation
thomasjpfan left a comment
Thank you @lucyleeow
Since we broke up the example images into two images, there are two places in the user guide that will now have a different image: doc/modules/outlier_detection.rst (Fitting an elliptic envelope) and doc/modules/covariance.rst (Minimum Covariance Determinant). The images of the contour plot still seem to work in the context of the user guide. Do you think so as well?
    plt.show()

    # %%
    # [1] P. J. Rousseeuw. Least median of squares regression. J. Am
Could we make these references link correctly? I feel they are better at the top of the page under the initial description. Specifically under the paragraph:
Associated applications include outlier detection, observation ranking and clustering.
    .. math::

        d_{(\mu,\Sigma)}(x_i)^2 = (x_i - \mu)'\Sigma^{-1}(x_i - \mu)
Is using T for transpose clearer for you?
Suggested change:
-    d_{(\mu,\Sigma)}(x_i)^2 = (x_i - \mu)'\Sigma^{-1}(x_i - \mu)
+    d_{(\mu,\Sigma)}(x_i)^2 = (x_i - \mu)^T\Sigma^{-1}(x_i - \mu)
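As a side illustration (not part of the PR's diff), the squared Mahalanobis distance in the formula above can be computed directly with NumPy. The toy data and the plain sample estimates of `mu` and `Sigma` here are assumptions made for this sketch, not the example's actual code:

```python
import numpy as np

# Toy data (an assumption for this sketch): 100 samples, 2 features.
rng = np.random.RandomState(0)
X = rng.randn(100, 2)

# Estimate the location mu and covariance Sigma from the data.
mu = X.mean(axis=0)
Sigma = np.cov(X, rowvar=False)
Sigma_inv = np.linalg.inv(Sigma)

def sq_mahalanobis(x, mu, Sigma_inv):
    # (x - mu)^T Sigma^{-1} (x - mu), matching the formula in the diff.
    diff = x - mu
    return float(diff @ Sigma_inv @ diff)

d2 = np.array([sq_mahalanobis(x, mu, Sigma_inv) for x in X])
```

Since the sample covariance is positive definite here, every squared distance is non-negative.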
    # deviation = 2 and feature 2 has a standard deviation = 1. Next, 25 samples
    # are replaced with Gaussian outlier samples where feature 1 has standard
    # devation = 1 and feature 2 has standard deviation = 7.
Move the `import numpy as np` here?
    # that of the MCD robust estimator (1.2). This shows that the MCD based
    # robust estimator is much more resistant to the outlier samples, which were
    # designed to have a much larger variance in feature 2.
Move the `from sklearn.covariance import EmpiricalCovariance, MinCovDet` and matplotlib import here?
Thanks for the review @thomasjpfan
    # Generate data
    # --------------
    #
    # First we generate a dataset of 125 samples and 2 features. Both features
Suggested change:
-    # First we generate a dataset of 125 samples and 2 features. Both features
+    # First, we generate a dataset of 125 samples and 2 features. Both features
    #
    # First we generate a dataset of 125 samples and 2 features. Both features
    # are Gaussian distributed with mean of 0 but feature 1 has a standard
    # deviation = 2 and feature 2 has a standard deviation = 1. Next, 25 samples
Suggested change:
-    # deviation = 2 and feature 2 has a standard deviation = 1. Next, 25 samples
+    # deviation equal to 2 and feature 2 has a standard deviation equal to 1. Next, 25 samples
    # are Gaussian distributed with mean of 0 but feature 1 has a standard
    # deviation = 2 and feature 2 has a standard deviation = 1. Next, 25 samples
    # are replaced with Gaussian outlier samples where feature 1 has standard
    # devation = 1 and feature 2 has standard deviation = 7.
Suggested change:
-    # devation = 1 and feature 2 has standard deviation = 7.
+    # deviation equal to 1 and feature 2 has a standard deviation equal to 7.
    # First we generate a dataset of 125 samples and 2 features. Both features
    # are Gaussian distributed with mean of 0 but feature 1 has a standard
    # deviation = 2 and feature 2 has a standard deviation = 1. Next, 25 samples
    # are replaced with Gaussian outlier samples where feature 1 has standard
Suggested change:
-    # are replaced with Gaussian outlier samples where feature 1 has standard
+    # are replaced with Gaussian outlier samples where feature 1 has a standard
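The data-generation step discussed in this thread can be sketched roughly as follows; the seed and the exact sampling calls are assumptions for the sketch, not the example's actual code:

```python
import numpy as np

rng = np.random.RandomState(7)  # seed is an arbitrary choice for this sketch
n_samples, n_outliers, n_features = 125, 25, 2

# Both features are Gaussian with mean 0; feature 1 has a standard
# deviation equal to 2 and feature 2 has a standard deviation equal to 1.
X = rng.randn(n_samples, n_features) * np.array([2.0, 1.0])

# Replace 25 samples with Gaussian outliers where feature 1 has a standard
# deviation equal to 1 and feature 2 has a standard deviation equal to 7.
X[-n_outliers:] = rng.randn(n_outliers, n_features) * np.array([1.0, 7.0])
```

Scaling standard-normal draws column-wise gives each feature the stated standard deviation; the outliers deliberately inflate the spread of feature 2.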
    # Comparison of results
    # ---------------------
    #
    # Below we fit MCD and MLE based covariance estimators to our data and print
Suggested change:
-    # Below we fit MCD and MLE based covariance estimators to our data and print
+    # Below, we fit MCD and MLE based covariance estimators to our data and print
|
Thanks @glemaitre, suggestions added.
|
Thanks @lucyleeow |
Reference Issues/PRs
None
What does this implement/fix? Explain your changes.
plot_mahalanobis_distances.py to notebook style with alternating code and text
Any other comments?