Better sample_weight support in Ridge

Currently only the `dense_cholesky` solver in `Ridge` supports `sample_weight`. To support it consistently in all solvers, one can use the following trick (excerpt from my post on the mailing list):

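We want to minimize `\sum_i mu_i (w^T x_i - y_i)^2`, where `mu_i >= 0` is the weight of sample `i`. Because `mu_i = sqrt(mu_i)^2`, this is equal to `\sum_i (sqrt(mu_i) w^T x_i - sqrt(mu_i) y_i)^2`. So we obtain the same result by multiplying each `y_i` and `x_i` by `sqrt(mu_i)` and then calling an unweighted solver; the penalty on `w` does not depend on the samples and is unchanged.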
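The equivalence can be checked numerically. The following is a minimal sketch, not scikit-learn code: the two helper functions are hypothetical names, there is no intercept, and the ridge problem is solved through its normal equations with plain NumPy.

```python
import numpy as np


def ridge_weighted_direct(X, y, sample_weight, alpha):
    # Hypothetical helper: solve (X^T M X + alpha I) w = X^T M y,
    # where M = diag(sample_weight). No intercept is fitted.
    XtM = X.T * sample_weight  # scales column i of X.T by mu_i
    A = XtM @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(A, XtM @ y)


def ridge_weighted_rescaled(X, y, sample_weight, alpha):
    # Hypothetical helper: multiply each x_i and y_i by sqrt(mu_i),
    # then solve the ordinary (unweighted) ridge normal equations.
    s = np.sqrt(sample_weight)
    Xs = X * s[:, np.newaxis]
    ys = y * s
    A = Xs.T @ Xs + alpha * np.eye(X.shape[1])
    return np.linalg.solve(A, Xs.T @ ys)


rng = np.random.RandomState(0)
X = rng.randn(20, 3)
y = rng.randn(20)
mu = rng.uniform(0.1, 2.0, size=20)  # non-negative sample weights

w_direct = ridge_weighted_direct(X, y, mu, alpha=1.0)
w_rescaled = ridge_weighted_rescaled(X, y, mu, alpha=1.0)
```

Both calls should return the same coefficient vector up to floating-point error, which is why the rescaling can be applied once, before dispatching to any solver.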

In the dense case this is trivial to implement, but in the sparse case there is a bit of work to do, as scipy sparse matrices do not support element-by-element multiplication with a vector (here the vector size is equal to `n_samples`, so each row is scaled by one entry of the vector). One should add an `inplace_csr_row_scale` utility to `sparsefuncs.pyx`.
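As a rough illustration of what such a utility has to do, here is a pure-NumPy sketch working directly on the CSR arrays. The function name `csr_row_scale` is hypothetical and this is not the proposed Cython implementation; it assumes `data` is a floating-point array and `indptr` is the usual CSR row-pointer array.

```python
import numpy as np


def csr_row_scale(data, indptr, scale):
    # In CSR format, the stored values of row i are
    # data[indptr[i]:indptr[i + 1]], so row i holds
    # indptr[i + 1] - indptr[i] values.
    nnz_per_row = np.diff(indptr)
    # Repeat each row's factor once per stored value and scale in place.
    data *= np.repeat(scale, nnz_per_row)
    return data


# CSR arrays for the 3x3 matrix
# [[1, 0, 2],
#  [0, 0, 3],
#  [4, 5, 0]]
data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
indices = np.array([0, 2, 2, 0, 1])  # column indices, untouched by scaling
indptr = np.array([0, 2, 3, 5])

row_scale = np.array([2.0, 10.0, 0.5])  # e.g. sqrt(sample_weight)
csr_row_scale(data, indptr, row_scale)
```

A `scipy.sparse.csr_matrix` exposes these arrays as its `.data` and `.indptr` attributes, so the same idea applies to `X.data` and `X.indptr`. The sparsity pattern is unchanged, which is what makes an in-place version possible.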

The test coverage of `sample_weight` needs to be greatly improved too.


Better sample_weight support in Ridge #1190
