Closed
Describe the bug
In ElasticNetCV, the first and largest value of alpha, call it alpha_max, should be just large enough to force all of the coefficients to become zero. The existing code works correctly when sample_weight is not specified. However, the computation of alpha_max does not take into account sample_weight.
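For reference, here is a sketch of where a weight-aware alpha_max comes from. It assumes the weighted objective is the documented ElasticNet objective with each squared residual multiplied by its sample weight, and with the weights normalised to sum to 1. The symbols \hat w_i (normalised weights), \rho (l1_ratio) and b (intercept) are notation introduced here for illustration, not names from the scikit-learn source.

```latex
% Assumed weighted objective, with \hat w_i = w_i / \sum_k w_k and \rho = l1_ratio
F(\beta, b) = \frac{1}{2} \sum_i \hat w_i \,(y_i - b - x_i^\top \beta)^2
            + \alpha \rho \,\lVert \beta \rVert_1
            + \frac{\alpha (1 - \rho)}{2} \,\lVert \beta \rVert_2^2

% At \beta = 0 the optimal intercept is the weighted mean of y
b = \bar y_w = \sum_i \hat w_i \, y_i

% Subgradient condition: \beta = 0 is optimal iff, for every feature j,
\Bigl| \sum_i \hat w_i \,(x_{ij} - \bar x_{w,j})\,(y_i - \bar y_w) \Bigr| \le \alpha \rho

% Hence the smallest alpha that zeroes all coefficients is
\alpha_{\max} = \frac{1}{\rho} \max_j
    \Bigl| \sum_i \hat w_i \,(x_{ij} - \bar x_{w,j})\,(y_i - \bar y_w) \Bigr|
```

With equal weights, \hat w_i = 1/n and this reduces to the usual unweighted expression, which is consistent with the existing code behaving correctly when sample_weight is not given.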
Steps/Code to Reproduce
import numpy as np
from sklearn.linear_model import ElasticNet, ElasticNetCV
X = np.array([[3, 1], [2, 5], [5, 3], [1, 4]])
beta = np.array([1, 1])
y = X @ beta
w = np.array([10, 1, 10, 1])
# Fit ElasticNetCV just to get the .alphas_ attribute
enetCV = ElasticNetCV(cv=2)
enetCV.fit(X, y, sample_weight=w)
# The coefficient of ElasticNet fitted at alpha_max should be [0. 0.].
alpha_max = enetCV.alphas_[0]
enet = ElasticNet(alpha=alpha_max)
enet.fit(X, y, sample_weight=w)
print(enet.coef_)  # [0.1970807 0.19708023]
Expected Results
If the correct value of alpha_max is computed, then enet.coef_ should be right at the cusp of zero, such that any smaller value of alpha makes it nonzero:
def get_alpha_max(X, y, w, l1_ratio=0.5):
    wn = w / w.sum()
    Xn = X - np.dot(wn, X)
    yn = (y - np.dot(wn, y)) * wn
    return np.max(np.abs(yn @ Xn)) / l1_ratio
enet = ElasticNet(alpha=get_alpha_max(X, y, w))
enet.fit(X, y, sample_weight=w)
print(enet.coef_)  # [6.70427878e-17 6.70427878e-17]
Actual Results
enet.coef_ is [0.1970807 0.19708023].
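To make the size of the discrepancy concrete, the following NumPy-only sketch compares an alpha_max that ignores the weights with the weight-aware one for the data above. The helper names alpha_max_unweighted and alpha_max_weighted are hypothetical and written for this illustration. The unweighted helper is only an assumption about what a weight-blind computation would look like, not a copy of scikit-learn's internal code.

```python
import numpy as np


def alpha_max_unweighted(X, y, l1_ratio=0.5):
    # Hypothetical helper: centre with plain means and ignore sample weights.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    return np.max(np.abs(Xc.T @ yc)) / (len(y) * l1_ratio)


def alpha_max_weighted(X, y, w, l1_ratio=0.5):
    # Same formula as get_alpha_max in this report: weights normalised to sum to 1,
    # X and y centred with weighted means.
    wn = w / w.sum()
    Xc = X - wn @ X
    yc = y - wn @ y
    return np.max(np.abs(Xc.T @ (wn * yc))) / l1_ratio


X = np.array([[3, 1], [2, 5], [5, 3], [1, 4]], dtype=float)
y = X @ np.array([1.0, 1.0])
w = np.array([10.0, 1.0, 10.0, 1.0])

a_unweighted = alpha_max_unweighted(X, y)
a_weighted = alpha_max_weighted(X, y, w)
# Sanity check: with uniform weights both formulas should agree.
a_uniform = alpha_max_weighted(X, y, np.ones(len(y)))

print(a_unweighted)  # approximately 2.5
print(a_weighted)    # approximately 3.727 (41/11)
print(a_uniform)     # approximately 2.5
```

For these weights the weight-aware threshold is noticeably larger than the weight-blind one, so an alpha grid that starts at the smaller value would leave nonzero coefficients at its first alpha, which matches the behaviour reported here.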
Versions
System:
python: 3.9.7 (default, Sep 16 2021, 13:09:58) [GCC 7.5.0]
executable: /home/jhopfens/.conda/envs/jhop39/bin/python
machine: Linux-3.10.0-1160.53.1.el7.x86_64-x86_64-with-glibc2.17
Python dependencies:
pip: 21.2.4
setuptools: 58.0.4
sklearn: 1.0.2
numpy: 1.21.5
scipy: 1.7.2
Cython: 0.29.24
pandas: 1.3.5
matplotlib: 3.5.1
joblib: 1.1.0
threadpoolctl: 3.0.0
Built with OpenMP: True