ENH add sample_weight to sparse coordinate descent #22808
jeremiedbb merged 11 commits into scikit-learn:main
Conversation
@agramfort @TomDLT @rth You might be interested.
agramfort
left a comment
maybe @mathurinm has time to look
agramfort
left a comment
LGTM! Thx @lorentzenchr.
Just to check: did you observe any slowdown in the no-sample-weight case due to the extra branching now present in the new code?
from sklearn.linear_model import ElasticNet
from sklearn.linear_model.tests.test_sparse_coordinate_descent import make_sparse_data

X, y = make_sparse_data(n_samples=1000, n_features=10_000)
%timeit ElasticNet().fit(X, y)

Results with

Note that I noticed quite some variation in those timings.
@lorentzenchr beware that with such data, computations like the Lipschitz constants (norm(X, axis=0) ** 2) have a heavy weight in the total time.
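For reference, those per-feature Lipschitz constants are just the squared column norms of X. A minimal sketch of computing them for a sparse matrix, assuming scipy is available (the data here is made up for illustration):

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
# A mostly-zero matrix so the sparse representation is meaningful.
X_dense = rng.random((100, 50)) * (rng.random((100, 50)) < 0.1)
X = sparse.csc_matrix(X_dense)

# Squared column norms, computed without densifying X:
# elementwise square, then sum over rows.
lipschitz = np.asarray(X.multiply(X).sum(axis=0)).ravel()

# Dense equivalent for comparison.
lipschitz_dense = np.linalg.norm(X_dense, axis=0) ** 2
```

This is exactly the `norm(X, axis=0) ** 2` quantity mentioned above; for wide problems (many features) it can dominate the total fit time.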
@mathurinm good point.

from sklearn.linear_model import ElasticNet
from sklearn.linear_model.tests.test_sparse_coordinate_descent import make_sparse_data

X, y = make_sparse_data(n_samples=1000, n_features=10_000, n_informative=1000)
%timeit ElasticNet(alpha=0.01).fit(X, y)

Gives
Perfect! Thx @lorentzenchr. Just need another +1 to MRG this one. 🙏
jeremiedbb
left a comment
Overall looks good. I can't really review the `sparse_enet_coordinate_descent` details (besides trying to check that the code does what you wrote as comments), but I trust the tests and @agramfort :). And it does not modify the no-sample-weight case, so there is no big risk.
Issue #3702 was created in 2014. We are on schedule 😄
@lorentzenchr after some thought, I don't understand why in the sparse case. Is the current design adopted in order to have the same code to compute the gradient, with or without sample weights?
Reference Issues/PRs

Fixes #3702 by implementing the final bits.

What does this implement/fix? Explain your changes.

This PR implements `sample_weight` support with sparse `X` for `ElasticNet`, `ElasticNetCV`, `Lasso` and `LassoCV`.

Details

The objective with sample weights `sw` is given by

    1 / (2 * sum(sw)) * sum(sw * (y - w0 - X @ w)**2)
    + alpha * l1_ratio * ||w||_1
    + 0.5 * alpha * (1 - l1_ratio) * ||w||_2**2

Solving for the intercept `w0` gives

    w0 = mean(y) - mean(X) @ w

where the mean is a weighted average, weighted by `sw`.

Dense solvers go on and rescale `y1` and `X1` by `sqrt(sw)`, but sparse solvers cannot set `X1 = X - X_mean` as this destroys the sparsity of `X`. Therefore, `X_mean` is passed to the coordinate descent solver. This PR goes on and also passes the sample weights to the cd solver. The alternative would be to provide `sw * X_mean`, which is a dense matrix of the same dimensions as `X`, not a good idea.
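The weighted-intercept relation described above (the optimal intercept is the weighted mean of `y` minus the weighted mean of the rows of `X` applied to `w`) can be checked numerically. This is a minimal sketch with made-up data, assuming only numpy:

```python
import numpy as np

rng = np.random.default_rng(42)
n, p = 200, 5
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)
w = rng.standard_normal(p)   # arbitrary fixed coefficients
sw = rng.random(n) + 0.1     # positive sample weights

# Weighted means of y and of each column of X, weighted by sw.
y_mean = np.average(y, weights=sw)
X_mean = np.average(X, axis=0, weights=sw)

# Closed-form intercept: w0 = mean(y) - mean(X) @ w.
w0 = y_mean - X_mean @ w

# w0 should minimize the weighted squared error over the intercept.
def obj(b):
    r = y - X @ w - b
    return np.sum(sw * r**2)

eps = 1e-6
assert obj(w0) <= obj(w0 + eps) and obj(w0) <= obj(w0 - eps)
```

This is why the dense path can simply center `y` and `X` by their weighted means, while the sparse path must instead carry `X_mean` into the solver.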