Feature Request: Gaussian adjust option in RobustScaler #10139

@rdturnermtl

It would be nice if sklearn.preprocessing.RobustScaler transformed data to a standard Gaussian when the input data is Gaussian. Currently, with the default quantile_range=(25.0, 75.0), Gaussian data passed to RobustScaler comes out with:
mean ~= 0 and std ~= 1 / (norm.ppf(0.75) - norm.ppf(0.25)) = 0.741 (norm here is scipy.stats.norm)
It would be nice if these were 0 and 1, so that the resulting scale is consistent with StandardScaler.
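The 0.741 figure can be checked numerically. The snippet below is a minimal sketch that uses only the Python standard library as a stand-in for RobustScaler: it assumes that the default scaling amounts to subtracting the median and dividing by the interquartile range, and it uses statistics.NormalDist().inv_cdf in place of norm.ppf. The sample size, seed, and variable names are illustrative choices, not part of the proposal.

```python
import random
import statistics
from statistics import NormalDist

# Width of the interquartile range of a standard Gaussian, in units of sigma.
std_normal = NormalDist()
iqr_std_normal = std_normal.inv_cdf(0.75) - std_normal.inv_cdf(0.25)
expected_std = 1.0 / iqr_std_normal  # about 0.741

# Gaussian sample with an arbitrary mean and standard deviation.
rng = random.Random(0)
data = [rng.gauss(5.0, 3.0) for _ in range(100_000)]

# Stand-in for the default robust scaling: subtract the median, divide by the IQR.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
scaled = [(x - median) / (q3 - q1) for x in data]

observed_mean = statistics.fmean(scaled)
observed_std = statistics.pstdev(scaled)
print(f"expected std {expected_std:.3f}, observed mean {observed_mean:.3f}, "
      f"observed std {observed_std:.3f}")
```

The observed standard deviation should land close to the expected value regardless of the mean and standard deviation chosen for the input, since both are removed by the centering and scaling.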

I propose adding an extra constructor option, gauss_adjust, which, if True, adjusts the scale accordingly. Concretely, the assignment

self.scale_ = (q[1] - q[0])

on line 1057 of data.py would become

self.scale_ = (q[1] - q[0]) / self.adjust

where, in the constructor (with q_min, q_max taken from quantile_range),

self.adjust = norm.ppf(q_max / 100.0) - norm.ppf(q_min / 100.0) if gauss_adjust else 1.0
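To illustrate the effect of the proposed option without touching scikit-learn, here is a rough standalone sketch. The function gauss_adjusted_scale is hypothetical and is not part of any library; it assumes median centering and quantile-range scaling, uses statistics.NormalDist().inv_cdf in place of norm.ppf, and, because it relies on statistics.quantiles, only handles integer percentiles strictly between 0 and 100.

```python
import random
import statistics
from statistics import NormalDist


def gauss_adjusted_scale(values, quantile_range=(25.0, 75.0), gauss_adjust=True):
    """Hypothetical helper: centre on the median and scale by the quantile
    range, optionally divided by the same range of a standard Gaussian."""
    q_min, q_max = quantile_range
    # n=100 yields the 1st..99th percentiles; index k - 1 is the k-th percentile.
    percentiles = statistics.quantiles(values, n=100, method="inclusive")
    low, high = percentiles[int(q_min) - 1], percentiles[int(q_max) - 1]
    center = statistics.median(values)
    std_normal = NormalDist()
    # Same expression as the proposed self.adjust, with inv_cdf standing in for ppf.
    adjust = (
        std_normal.inv_cdf(q_max / 100.0) - std_normal.inv_cdf(q_min / 100.0)
        if gauss_adjust
        else 1.0
    )
    scale = (high - low) / adjust
    return [(v - center) / scale for v in values]


rng = random.Random(0)
data = [rng.gauss(-2.0, 4.0) for _ in range(50_000)]

adjusted_std = statistics.pstdev(gauss_adjusted_scale(data))
plain_std = statistics.pstdev(gauss_adjusted_scale(data, gauss_adjust=False))
wide_std = statistics.pstdev(gauss_adjusted_scale(data, quantile_range=(10.0, 90.0)))
print(f"adjusted {adjusted_std:.3f}, unadjusted {plain_std:.3f}, "
      f"adjusted with (10, 90) {wide_std:.3f}")
```

With the adjustment enabled, the scaled Gaussian sample should have a standard deviation near 1 for either quantile range, while the unadjusted default stays near 0.741.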

Labels: Easy (well-defined and straightforward way to resolve)
