[SPRINT] Add warning notes in preprocessing functions #17402


The goal here is to add a warning note to the docstring of each preprocessing function listed below (follow-up to #17387). The note should warn about the risk of data leakage when using these functions and recommend using a pipeline instead:

- [x] maxabs_scale
- [x] minmax_scale
- [ ] ~normalize~
- [x] quantile_transform
- [x] robust_scale
- [x] scale
- [x] power_transform

All of these are in `sklearn/preprocessing/_data.py`. Here is a warning template:

```
    .. warning:: Risk of data leak

        Do not use :func:`~sklearn.preprocessing.scale` unless you know what
        you are doing. A common mistake is to apply it to the entire data
        *before* splitting into training and test sets. This will bias the
        model evaluation because information would have leaked from the test
        set to the training set.
        In general, we recommend using
        :class:`~sklearn.preprocessing.StandardScaler` within a
        :ref:`Pipeline <pipeline>` in order to prevent most risks of data
        leaking: `pipe = make_pipeline(StandardScaler(), LogisticRegression())`.
```

You should of course adapt `scale` and `StandardScaler` to the function you are working on and its corresponding transformer class.
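For reference, here is a small sketch of which transformer class pairs with each function in the checklist. The dictionary name `FUNCTION_TO_TRANSFORMER` is just an illustration and is not part of scikit-learn; the function and class names themselves are the public ones in `sklearn.preprocessing`.

```python
from sklearn import preprocessing

# Hypothetical helper mapping (not part of scikit-learn): each function from
# the checklist and the transformer class to recommend in its warning note.
FUNCTION_TO_TRANSFORMER = {
    "maxabs_scale": "MaxAbsScaler",
    "minmax_scale": "MinMaxScaler",
    "quantile_transform": "QuantileTransformer",
    "robust_scale": "RobustScaler",
    "scale": "StandardScaler",
    "power_transform": "PowerTransformer",
}

for func_name, cls_name in FUNCTION_TO_TRANSFORMER.items():
    # Check that both names really exist in sklearn.preprocessing.
    assert callable(getattr(preprocessing, func_name))
    assert isinstance(getattr(preprocessing, cls_name), type)
    print(f":func:`~sklearn.preprocessing.{func_name}` -> "
          f":class:`~sklearn.preprocessing.{cls_name}`")
```

The printed cross-references can be pasted into the template in place of `scale` and `StandardScaler`.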

Please indicate below which function(s) you want to work on, e.g. "I'm working on `scale` and `robust_scale`", so that others don't pick the same ones.

@scikit-learn/core-devs feel free to directly edit the warning message
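To make the leak described in the warning concrete, here is a minimal sketch on synthetic data. The dataset, the variable names and the choice of `LogisticRegression` are assumptions made for illustration only. It contrasts calling `scale` on the full data before splitting with fitting a `StandardScaler` inside a pipeline on the training set only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler, scale

# Synthetic data, purely for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Leaky pattern: scale() computes the mean and standard deviation over ALL
# rows, so statistics of the future test rows influence the training data.
X_leaky = scale(X)

# Recommended pattern: the scaler lives inside the pipeline and is fit on
# the training rows only; the test rows are merely transformed.
pipe = make_pipeline(StandardScaler(), LogisticRegression())
pipe.fit(X_train, y_train)
score = pipe.score(X_test, y_test)

# The fitted scaler only knows about the training set.
scaler = pipe.named_steps["standardscaler"]
print("scaler mean matches training mean:",
      np.allclose(scaler.mean_, X_train.mean(axis=0)))
print("scaler mean matches full-data mean:",
      np.allclose(scaler.mean_, X.mean(axis=0)))
```

The same pipeline can be passed directly to `cross_val_score`, in which case the scaler is refit on the training part of each fold.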
