fit_intercept in RidgeCV with sparse design matrix and gcv_mode='svd' #13325
Description
Currently, ridge regression with generalized cross-validation (GCV) computed from
an SVD of the design matrix (the best option when there are more samples than
features) does not support sparse design matrices.
This results in silently using an eigendecomposition of the Gram matrix when
gcv_mode is 'auto'
scikit-learn/sklearn/linear_model/ridge.py, line 1031 in 7389dba:

    if sparse.issparse(X) or n_features > n_samples or with_sw:

or raising an error when the user explicitly asks for 'svd'
scikit-learn/sklearn/linear_model/ridge.py, line 965 in 7389dba:

    raise TypeError("SVD not supported for sparse matrices")

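To make the two behaviours concrete, here is a minimal sketch of the mode resolution as described by the two quoted lines. The function name `resolve_gcv_mode` and its arguments are hypothetical and only mirror the quoted conditions; this is not the actual code in ridge.py.

```python
def resolve_gcv_mode(gcv_mode, is_sparse, n_samples, n_features, with_sw=False):
    """Hypothetical helper that mirrors the two conditions quoted above."""
    if gcv_mode == "auto":
        # Mirrors the quoted line 1031: sparse input (or more features than
        # samples, or sample weights) selects 'eigen' without any notice.
        if is_sparse or n_features > n_samples or with_sw:
            return "eigen"
        return "svd"
    if gcv_mode == "svd" and is_sparse:
        # Mirrors the quoted line 965: an explicit 'svd' request on sparse
        # input is rejected.
        raise TypeError("SVD not supported for sparse matrices")
    return gcv_mode


# Tall sparse design matrix: 'auto' quietly resolves to 'eigen'.
mode = resolve_gcv_mode("auto", is_sparse=True, n_samples=100000, n_features=20)
```

With a tall sparse matrix such as the one in the usage line, 'eigen' is chosen even though n_samples is much larger than n_features, which is the case this issue is about.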
Would it be better to:
- allow sparse design matrices and compute the SVD using scipy.sparse.linalg.LinearOperator, or
- warn the user that 'eigen' is being used, which can be very inefficient when
  n_samples is much larger than n_features, so that they can decide whether to
  keep 'eigen', convert X to a dense matrix, or switch to a form of
  cross-validation other than generalized CV?
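
As an illustration of the second option, the following is a rough sketch of what a warning on the 'auto' fallback could look like. The function name `auto_gcv_mode_with_warning`, the message text, and the choice of `UserWarning` are assumptions made for this example only, not an existing scikit-learn API.

```python
import warnings


def auto_gcv_mode_with_warning(is_sparse, n_samples, n_features):
    """Hypothetical resolution of gcv_mode='auto' that warns on the fallback."""
    if is_sparse:
        if n_samples > n_features:
            # 'eigen' works on an n_samples x n_samples Gram matrix, so it
            # is the costly choice for tall matrices; tell the user.
            warnings.warn(
                "gcv_mode='auto' selected 'eigen' because X is sparse; this "
                "can be very inefficient when n_samples is much larger than "
                "n_features. Consider a dense X or another CV strategy.",
                UserWarning,
            )
        return "eigen"
    # Dense input: keep the existing rule based on the shape of X.
    return "eigen" if n_features > n_samples else "svd"
```

The warning is only emitted when the fallback is likely to hurt (sparse X with more samples than features), so users with wide sparse matrices, where 'eigen' is the appropriate choice anyway, would not see it.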