DOC Add links to preprocessing examples in docstrings and userguide#26877
Conversation
ArturoAmorQ
left a comment
Thanks for the PR @StefanieSenger. Here is a first batch of comments.
ArturoAmorQ
left a comment
Apart from a bit of wording, LGTM :) Thanks again @StefanieSenger!
Co-authored-by: Arturo Amor <86408019+ArturoAmorQ@users.noreply.github.com>
adrinjalali
left a comment
Left a few comments. It's hard to review since this PR is rather large and touches many examples; PRs with a smaller scope are easier to review.
Resolved the CI issues, thank you @adrinjalali
ogrisel
left a comment
Overall, LGTM, thanks for the PR. In addition to @adrinjalali's remarks, here are a few more.
sklearn/preprocessing/_data.py
Outdated
unit variance scaling.

MinMaxScaler doesn't reduce the effect of outliers; it only linearily
scales them down. For an example visualization, refer to :ref:`Compare
This statement holds for all scalers (StandardScaler, RobustScaler, MaxAbsScaler and MinMaxScaler). What is different is that the scale value found by RobustScaler is not sensitive to the presence of a few large marginal outliers while it is for StandardScaler and even more so for MinMaxScaler and MaxAbsScaler.
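To make the point above concrete, here is a minimal sketch, not part of the PR, that compares how much each scaler's fitted divisor grows when a single large outlier is added. The toy data and the helper `divisor` are assumptions made for illustration; `data_range_` is used for `MinMaxScaler` because its `scale_` attribute stores the reciprocal of the range rather than the range itself.

```python
import numpy as np
from sklearn.preprocessing import (
    MaxAbsScaler,
    MinMaxScaler,
    RobustScaler,
    StandardScaler,
)

# Toy feature (assumed for illustration): 99 inliers evenly spread
# over [0, 1], then the same data with one large outlier appended.
inliers = np.linspace(0.0, 1.0, 99).reshape(-1, 1)
with_outlier = np.vstack([inliers, [[100.0]]])


def divisor(scaler, X):
    """Hypothetical helper: the value the fitted scaler divides the feature by."""
    scaler.fit(X)
    if isinstance(scaler, MinMaxScaler):
        # MinMaxScaler.scale_ is 1 / range, so read the range directly.
        return scaler.data_range_[0]
    return scaler.scale_[0]


# Ratio of the divisor with the outlier to the divisor without it.
growth = {}
for cls in (RobustScaler, StandardScaler, MinMaxScaler, MaxAbsScaler):
    growth[cls.__name__] = divisor(cls(), with_outlier) / divisor(cls(), inliers)
    print(f"{cls.__name__:>14}: divisor grows by a factor of {growth[cls.__name__]:.2f}")
```

On this data the interquartile range used by `RobustScaler` barely moves, the standard deviation used by `StandardScaler` grows substantially, and the range and maximum absolute value used by `MinMaxScaler` and `MaxAbsScaler` grow the most.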
Yes, I see. To express how the MinMaxScaler differs from the other scalers concerning outliers, I have tried to come up with a new wording:
`MinMaxScaler` doesn't reduce the effect of outliers, but it linearly
scales them down into a fixed range, where the largest occurring data point
corresponds to the maximum value and the smallest one corresponds to the
minimum value.
What do you think?
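As a rough illustration of the proposed wording (a sketch on made-up data, not taken from the PR), the snippet below fits `MinMaxScaler` with its default `feature_range=(0, 1)` on four inliers and one outlier. The outlier lands on the maximum of the range and the smallest point on the minimum, while the inliers end up squeezed close together.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up data for illustration: four inliers and one large outlier.
X = np.array([[1.0], [2.0], [3.0], [4.0], [1000.0]])

# Default feature_range is (0, 1).
scaled = MinMaxScaler().fit_transform(X)

# Smallest point maps to the range minimum, the outlier to the maximum.
print(scaled.ravel())

# Largest scaled inlier: (4 - 1) / (1000 - 1), so all inliers sit near 0.
inlier_max = scaled[:4].max()
print(inlier_max)
```

The transformation is linear, so the relative distances are preserved; the outlier is not damped, it simply determines the scale.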
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
…er/scikit-learn into link_examples_preprocessing
…cikit-learn#26877) Co-authored-by: Arturo Amor <86408019+ArturoAmorQ@users.noreply.github.com> Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com> Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
…26877) Co-authored-by: Arturo Amor <86408019+ArturoAmorQ@users.noreply.github.com> Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com> Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
This PR adds links to the examples from the Preprocessing section in the docstrings of the respective classes and functions.