-
-
Notifications
You must be signed in to change notification settings - Fork 26.9k
Precision errors in LDA.explained_variance_ratio_ #6031
Copy link
Copy link
Closed
Labels
BugEasyWell-defined and straightforward way to resolveWell-defined and straightforward way to resolve
Milestone
Description
This surfaces the discussion in #5216. Here are some precision differences than @JPFrancoia and I are looking at:
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.utils.testing import assert_array_almost_equal
state = np.random.RandomState(0)
X = state.normal(loc=0, scale=100, size=(40, 20))
y = state.randint(0, 3, size=(40, 1))
# Train the LDA classifier. Use the eigen solver
lda_eigen = LDA(solver='eigen', n_components=5)
lda_eigen.fit(X, y)
lda_svd = LDA(n_components=5, solver='svd')
lda_svd.fit(X, y)
assert_array_almost_equal(lda_eigen.explained_variance_ratio_, lda_svd.explained_variance_ratio_)AssertionError:
Arrays are not almost equal to 6 decimals
(shapes (20,), (3,) mismatch)
x: array([ 6.03795532e-01, 3.96204468e-01, 5.85621882e-16,
3.18609950e-16, 2.08378911e-16, 1.21510637e-16,
7.83079028e-17, 7.58612317e-17, 3.89040436e-17,...
y: array([ 5.52469269e-01, 4.47530731e-01, 7.08925911e-17])
Besides the fact that lda_eigen has the wrong number of components for the explained_variance_ratio_ (according to the docs, it should have only n_components=5), there are numeric differences as well.
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
BugEasyWell-defined and straightforward way to resolveWell-defined and straightforward way to resolve