Array API test failure for probabilistic metrics with scipy==1.15.0 #32552

@StefanieSenger

Describe the bug

In #32422 we introduced array API support for probabilistic metrics and added a test (test_probabilistic_metrics_multilabel_array_api).

This test fails locally for me with scipy==1.15.0 for array-api-strict arrays on log_loss and d2_log_loss_score, because that version of scipy's xlogy (used in _log_loss) cannot handle array-api-strict arrays of mixed dtypes (int64 and float64).
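To illustrate why the mixed dtypes matter, here is a toy model of strict type promotion in plain Python. It mirrors the shape of the `_result_type` lookup shown in the traceback, but the table contents and the names `PROMOTION_TABLE` and `result_type` are made up for this sketch and are not array-api-strict's actual implementation. The Array API standard does not define promotion between integer and floating-point dtypes, so a strict implementation has no table entry for that pair.

```python
# Toy promotion table: only same-kind pairs are defined.
# Illustrative only; this is NOT array-api-strict's real table.
PROMOTION_TABLE = {
    ("int32", "int32"): "int32",
    ("int32", "int64"): "int64",
    ("int64", "int32"): "int64",
    ("int64", "int64"): "int64",
    ("float32", "float32"): "float32",
    ("float32", "float64"): "float64",
    ("float64", "float32"): "float64",
    ("float64", "float64"): "float64",
}


def result_type(type1, type2):
    """Return the promoted dtype name, or raise if the pair is undefined."""
    if (type1, type2) in PROMOTION_TABLE:
        return PROMOTION_TABLE[type1, type2]
    raise TypeError(f"{type1} and {type2} cannot be type promoted together")


# `x * xp.log(y)` with int64 labels and a float64 log term:
try:
    result_type("int64", "float64")
    message = None
except TypeError as exc:
    message = str(exc)

# If both operands already share a floating-point dtype, the lookup succeeds.
same_kind = result_type("float64", "float64")

print(message)
print(same_kind)
```

In this model the integer/float pair raises, while two float64 operands promote cleanly, which matches the error reported in the traceback.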

According to sklearn/utils/_array_api.py, the minimum scipy version for array API support is "1.14.0", so in theory "1.15.0" should not raise an error. I am not sure what we can do about this, or whether we need to, since all other array libraries pass this test and only array-api-strict fails. We could bump the internal min_scipy_version, but apart from that I cannot see any fix we could make internally.
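A minimal sketch of a version gate, to show why scipy 1.15.0 is not filtered out by a "1.14.0" minimum. The helper names `parse_release` and `meets_minimum` are hypothetical and are not scikit-learn functions; only the "1.14.0" value comes from the text above.

```python
import re

# Minimum scipy version quoted in this issue for array API support.
MIN_SCIPY_VERSION = "1.14.0"


def parse_release(version_string):
    """Extract the leading numeric release (e.g. '1.15.0rc1' -> (1, 15, 0))."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"Cannot parse version: {version_string!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    # Pad to three components so '1.14' compares equal to '1.14.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum=MIN_SCIPY_VERSION):
    """Return True if `installed` is at least `minimum`."""
    return parse_release(installed) >= parse_release(minimum)


# scipy 1.15.0 passes the gate, so the failing code path is still exercised.
print(meets_minimum("1.15.0"))
```

With such a gate, raising the minimum is the only way to exclude an affected scipy release, which is the option mentioned above.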

I talked with @lesteve, who advised me to open an issue, if only to let others who run into the same problem know that upgrading scipy is the solution. @lesteve Thanks for your help in fixing this.

Traceback

prob_metric = <function log_loss at 0x7f97095f5580>, str_y_true = False, use_sample_weight = False, array_namespace = 'array_api_strict'
device_ = array_api_strict.Device('CPU_DEVICE'), dtype_name = 'float64'
    @pytest.mark.parametrize(
        "prob_metric", [brier_score_loss, log_loss, d2_brier_score, d2_log_loss_score]
    )
    @pytest.mark.parametrize("str_y_true", [False, True])
    @pytest.mark.parametrize("use_sample_weight", [False, True])
    @pytest.mark.parametrize(
        "array_namespace, device_, dtype_name", yield_namespace_device_dtype_combinations()
    )
    def test_probabilistic_metrics_array_api(
        prob_metric, str_y_true, use_sample_weight, array_namespace, device_, dtype_name
    ):
        """Test that :func:`brier_score_loss`, :func:`log_loss`, func:`d2_brier_score`
        and :func:`d2_log_loss_score` work correctly with the array API for binary
        and mutli-class inputs.
        """
        xp = _array_api_for_tests(array_namespace, device_)
        sample_weight = np.array([1, 2, 3, 1]) if use_sample_weight else None
    
        # binary case
        extra_kwargs = {}
        if str_y_true:
            y_true_np = np.array(["yes", "no", "yes", "no"])
            y_true_xp_or_np = np.asarray(y_true_np)
            if "brier" in prob_metric.__name__:
                # `brier_score_loss` and `d2_brier_score` require specifying the
                # `pos_label`
                extra_kwargs["pos_label"] = "yes"
        else:
            y_true_np = np.array([1, 0, 1, 0])
            y_true_xp_or_np = xp.asarray(y_true_np, device=device_)
    
        y_prob_np = np.array([0.5, 0.2, 0.7, 0.6], dtype=dtype_name)
        y_prob_xp = xp.asarray(y_prob_np, device=device_)
        metric_score_np = prob_metric(
            y_true_np, y_prob_np, sample_weight=sample_weight, **extra_kwargs
        )
        with config_context(array_api_dispatch=True):
>           metric_score_xp = prob_metric(
                y_true_xp_or_np, y_prob_xp, sample_weight=sample_weight, **extra_kwargs
            )

sklearn/metrics/tests/test_classification.py:3712: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
sklearn/utils/_param_validation.py:218: in wrapper
    return func(*args, **kwargs)
sklearn/metrics/_classification.py:3363: in log_loss
    return _log_loss(
sklearn/metrics/_classification.py:3378: in _log_loss
    loss = -xp.sum(xlogy(transformed_labels, y_pred), axis=1)
../../../.pyenv/versions/scikit-learn_dev/lib/python3.12/site-packages/scipy/special/_support_alternative_backends.py:167: in wrapped
    return f(*args, **kwargs)
../../../.pyenv/versions/scikit-learn_dev/lib/python3.12/site-packages/scipy/special/_support_alternative_backends.py:76: in __xlogy
    temp = x * xp.log(y)
../../../.pyenv/versions/scikit-learn_dev/lib/python3.12/site-packages/array_api_strict/_array_object.py:858: in __mul__
    other = self._check_allowed_dtypes(other, "numeric", "__mul__")
../../../.pyenv/versions/scikit-learn_dev/lib/python3.12/site-packages/array_api_strict/_array_object.py:215: in _check_allowed_dtypes
    res_dtype = _result_type(self.dtype, other.dtype)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

type1 = array_api_strict.int64, type2 = array_api_strict.float64

    def _result_type(type1: DType, type2: DType) -> DType:
        if (type1, type2) in _promotion_table:
            return _promotion_table[type1, type2]
>       raise TypeError(f"{type1} and {type2} cannot be type promoted together")
E       TypeError: array_api_strict.int64 and array_api_strict.float64 cannot be type promoted together

../../../.pyenv/versions/scikit-learn_dev/lib/python3.12/site-packages/array_api_strict/_dtypes.py:229: TypeError

Versions

scipy==1.15.0
