linear_model with normalize and StandardScaler lead to faulty results with weighted constant features

When testing the linear models with `normalize=True` and `sample_weight` in PR [#19426](https://github.com/scikit-learn/scikit-learn/pull/19426#issuecomment-777822035), we noted that for sparse data the result is incorrect when there is a constant non-zero feature, for example:

```
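import numpy as np
rng = np.random.RandomState(0)  # seed chosen arbitrarily
n_samples, n_features = 10, 5  # illustrative sizes
X = rng.rand(n_samples, n_features)
X[X < 0.5] = 0.  # zero out about half of the entries to make X sparse-like
X[:, 2] = 1.  # constant non-zero feature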
```

The computed normalization factor for the constant feature is close to 0 but never exactly 0 because of round-off errors, so it is not replaced with 1.

Therefore, when we divide the uncentered X (mean still included) by this normalization factor, we get very large numbers.

The same happens when calling `StandardScaler` on the same data with `sample_weight`.

This is possibly because both rely on `mean_variance_axis()`.
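
To make the mechanism concrete, here is a minimal NumPy-only sketch of the effect described above. It does not call scikit-learn. The two helpers (`scale_exact_zero_check`, `scale_tolerant_check`), the tolerance `10 * eps`, and the variance values are assumptions made up for illustration. They are not the library's internals or its actual output.

```python
import numpy as np


def scale_exact_zero_check(var):
    """Hypothetical helper: replace a scale by 1 only when it is exactly 0."""
    scale = np.sqrt(var)
    scale[scale == 0.0] = 1.0
    return scale


def scale_tolerant_check(var, tol=10 * np.finfo(np.float64).eps):
    """Hypothetical helper: also treat scales below `tol` as constant features."""
    scale = np.sqrt(var)
    scale[scale < tol] = 1.0
    return scale


# Illustrative per-feature variances (made-up values, not library output):
# a regular feature, a constant feature whose variance came out as tiny
# round-off noise, and a constant feature whose variance is exactly 0.
var = np.array([0.25, 1e-32, 0.0])
row = np.array([0.5, 1.0, 1.0])  # one uncentered sample

print(row / scale_exact_zero_check(var))  # second entry blows up
print(row / scale_tolerant_check(var))    # all entries stay moderate
```

With the exact-zero check, only the third feature is protected and the second one is divided by a scale of about `1e-16`. Comparing against a small tolerance instead is one possible guard, shown here only as a sketch.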

cc @ogrisel 
