check_decision_proba_consistency fails with LinearDiscriminantAnalysis #19224
The following common check recently started to fail on a Windows CI job:
```
name = 'LinearDiscriminantAnalysis'
estimator_orig = LinearDiscriminantAnalysis()

    @ignore_warnings(category=FutureWarning)
    def check_decision_proba_consistency(name, estimator_orig):
        # Check whether an estimator having both decision_function and
        # predict_proba methods has outputs with perfect rank correlation.

        centers = [(2, 2), (4, 4)]
        X, y = make_blobs(n_samples=100, random_state=0, n_features=4,
                          centers=centers, cluster_std=1.0, shuffle=True)
        X_test = np.random.randn(20, 2) + 4
        estimator = clone(estimator_orig)

        if (hasattr(estimator, "decision_function") and
                hasattr(estimator, "predict_proba")):

            estimator.fit(X, y)
            # Since the link function from decision_function() to predict_proba()
            # is sometimes not precise enough (typically expit), we round to the
            # 10th decimal to avoid numerical issues.
            a = estimator.predict_proba(X_test)[:, 1].round(decimals=10)
            b = estimator.decision_function(X_test).round(decimals=10)
>           assert_array_equal(rankdata(a), rankdata(b))
E           AssertionError:
E           Arrays are not equal
E
E           Mismatched elements: 2 / 20 (10%)
E           Max absolute difference: 0.5
E           Max relative difference: 0.02631579
E           x: array([ 7. ,  8. , 11. ,  9. , 17. , 10. ,  5. , 14. ,  6. ,  1. , 19.5,
E                      4. ,  2. , 16. , 12. , 13. ,  3. , 15. , 19.5, 18. ])
E           y: array([ 7.,  8., 11.,  9., 17., 10.,  5., 14.,  6.,  1., 20.,  4.,  2.,
E                     16., 12., 13.,  3., 15., 19., 18.])
```

This happened on a PR which should not have any impact on the behavior of `LinearDiscriminantAnalysis.predict_proba`: #17743.
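For context, here is a minimal illustration of how this kind of mismatch can arise (the values below are hypothetical, not the ones from the CI run): when two `decision_function` scores are both large, their `expit`-mapped probabilities can round to exactly `1.0` at 10 decimals, and `scipy.stats.rankdata` then assigns the tied entries their average rank, while the decision scores themselves stay distinct.

```python
import numpy as np
from scipy.stats import rankdata

# Hypothetical scores: the two largest ones saturate the sigmoid so that
# their probabilities become indistinguishable after rounding.
decision = np.array([0.5, 12.0, 25.0, 30.0])
proba = 1.0 / (1.0 + np.exp(-decision))  # expit link, as in the check

a = proba.round(decimals=10)     # 25.0 and 30.0 both collapse to 1.0
b = decision.round(decimals=10)  # still four distinct values

print(rankdata(a))  # ties get the average rank: [1.  2.  3.5 3.5]
print(rankdata(b))  # [1. 2. 3. 4.]
```

This matches the failure above, where two entries received rank `19.5` on one side but `19.` and `20.` on the other.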
I suspect the test is too brittle. Maybe using a test set drawn from the original distribution (the blobs) would avoid the ties caused by arbitrary rounding?
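A rough sketch of that suggestion (an assumption about a possible fix, not a merged change): sample `X_test` from the same blobs instead of `np.random.randn(20, 2) + 4`, so the test points stay near the training data, `predict_proba` stays away from saturation, and rounding is much less likely to create ties.

```python
import numpy as np
from scipy.stats import rankdata
from sklearn.base import clone
from sklearn.datasets import make_blobs
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

centers = [(2, 2), (4, 4)]
X, y = make_blobs(n_samples=100, random_state=0, n_features=4,
                  centers=centers, cluster_std=1.0, shuffle=True)
# Draw the test set from the same blobs distribution (different seed)
# instead of np.random.randn(20, 2) + 4.
X_test, _ = make_blobs(n_samples=20, random_state=1, n_features=4,
                       centers=centers, cluster_std=1.0, shuffle=True)

estimator = clone(LinearDiscriminantAnalysis()).fit(X, y)
a = estimator.predict_proba(X_test)[:, 1].round(decimals=10)
b = estimator.decision_function(X_test).round(decimals=10)
assert np.array_equal(rankdata(a), rankdata(b))
```

Since `expit` is strictly monotone, the ranks can only disagree when rounding collapses two probabilities; keeping the test points in-distribution keeps the probabilities comfortably inside (0, 1).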