Enabling array API dispatch causes pipelines to reject dataframe inputs #32836

@ogrisel

Describe the bug

Enabling array API dispatch breaks code that normally accepts non-array inputs such as pandas dataframes:

Steps/Code to Reproduce

# %%
import os
# SCIPY_ARRAY_API must be set before SciPy is first imported.
os.environ["SCIPY_ARRAY_API"] = "1"

# %%
import sklearn
from sklearn.datasets import fetch_openml
from sklearn.preprocessing import OneHotEncoder


adult = fetch_openml(name="adult", version=2, as_frame=True)
X, y = adult.data, adult.target

with sklearn.config_context(array_api_dispatch=True):
    OneHotEncoder().fit(
        X.select_dtypes(include=["category", "object"])
    )

Expected Results

Enabling array API dispatch should not change estimator behavior on dataframe inputs.

Actual Results

Traceback (most recent call last):

  Cell In[11], line 11
    OneHotEncoder().fit(

  File ~/code/scikit-learn/sklearn/base.py:1336 in wrapper
    return fit_method(estimator, *args, **kwargs)

  File ~/code/scikit-learn/sklearn/preprocessing/_encoders.py:999 in fit
    self._fit(

  File ~/code/scikit-learn/sklearn/preprocessing/_encoders.py:86 in _fit
    X_list, n_samples, n_features = self._check_X(

  File ~/code/scikit-learn/sklearn/preprocessing/_encoders.py:69 in _check_X
    Xi = check_array(

  File ~/code/scikit-learn/sklearn/utils/validation.py:858 in check_array
    xp, is_array_api_compliant = get_namespace(array)

  File ~/code/scikit-learn/sklearn/utils/_array_api.py:404 in get_namespace
    namespace, is_array_api_compliant = array_api_compat.get_namespace(*arrays), True

  File ~/code/scikit-learn/sklearn/externals/array_api_compat/common/_helpers.py:643 in array_namespace
    raise TypeError(f"{type(x).__name__} is not a supported array type")

TypeError: Series is not a supported array type
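
To make the failure mode in the traceback easier to discuss, here is a toy model of it in plain Python. It is only a sketch: `FakeArray`, `FakeSeries`, `strict_namespace` and `lenient_namespace` are hypothetical names, not scikit-learn or array_api_compat APIs, and the fallback shown is one possible guard, not necessarily how this should be fixed.

```python
# Toy model of the call chain in the traceback above.
# All names here are hypothetical stand-ins, not real library APIs.


class FakeArray:
    """Stand-in for an array type that exposes an array API namespace."""

    def __array_namespace__(self, api_version=None):
        return "fake_array_namespace"


class FakeSeries:
    """Stand-in for a dataframe column: it has no array API namespace."""


def strict_namespace(x):
    # Mimics a strict namespace lookup: unsupported types raise TypeError.
    if hasattr(x, "__array_namespace__"):
        return x.__array_namespace__()
    raise TypeError(f"{type(x).__name__} is not a supported array type")


def lenient_namespace(x, fallback="numpy_namespace"):
    # Hypothetical guard: fall back to a default namespace for inputs the
    # strict lookup rejects, and report that they are not array API compliant.
    try:
        return strict_namespace(x), True
    except TypeError:
        return fallback, False


print(lenient_namespace(FakeArray()))
print(lenient_namespace(FakeSeries()))
```

With the strict lookup alone, the `FakeSeries` input raises the same kind of `TypeError` as above; with the guard, it is routed to the fallback namespace and flagged as non-compliant.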

Versions

System:
    python: 3.13.7 | packaged by conda-forge | (main, Sep  3 2025, 14:24:46) [Clang 19.1.7 ]
executable: /Users/ogrisel/miniforge3/envs/dev/bin/python
   machine: macOS-15.6.1-arm64-arm-64bit-Mach-O

Python dependencies:
      sklearn: 1.9.dev0
          pip: 25.2
   setuptools: 80.9.0
        numpy: 2.3.3
        scipy: 1.16.3
       Cython: 3.1.4
       pandas: 3.0.0.dev0+2566.g2bb3fef887
   matplotlib: 3.10.6
       joblib: 1.5.2
threadpoolctl: 3.6.0

Built with OpenMP: True

threadpoolctl info:
       user_api: blas
   internal_api: openblas
    num_threads: 10
         prefix: libopenblas
       filepath: /Users/ogrisel/miniforge3/envs/dev/lib/libopenblas.0.dylib
        version: 0.3.30
threading_layer: openmp
   architecture: VORTEX

       user_api: openmp
   internal_api: openmp
    num_threads: 10
         prefix: libomp
       filepath: /Users/ogrisel/miniforge3/envs/dev/lib/libomp.dylib
        version: None
