BUG: Trees: minor bug in NaN detection leading to incorrect split. #33113

@cakedev0

Description

Describe the bug

The break here is wrong: in some edge cases it prevents _any_isnan_axis0 from detecting NaNs for some features. As a result, the splitter won't consider splits that send NaNs to the left, because it is not aware that the feature has NaNs.
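
To illustrate how a misplaced break can hide NaNs, here is a minimal pure-Python sketch. It is not the actual Cython code: the function names are hypothetical, and the loop structure (rows in the outer loop, features in the inner loop) is an assumption made for illustration. Under that assumption, breaking after the first NaN in a row skips the remaining features of that row.

```python
import numpy as np


def any_isnan_axis0_with_break(X):
    """Hypothetical sketch of the faulty pattern (assumed loop structure)."""
    n_samples, n_features = X.shape
    out = np.zeros(n_features, dtype=bool)
    for i in range(n_samples):
        for j in range(n_features):
            if out[j]:
                continue  # feature j is already known to contain a NaN
            if np.isnan(X[i, j]):
                out[j] = True
                break  # leaves the feature loop: later features of row i are never checked
    return out


def any_isnan_axis0_without_break(X):
    """Same sketch with the break removed, so every feature of every row is checked."""
    n_samples, n_features = X.shape
    out = np.zeros(n_features, dtype=bool)
    for i in range(n_samples):
        for j in range(n_features):
            if not out[j] and np.isnan(X[i, j]):
                out[j] = True
    return out


# Same data as in the reproducer: the only NaNs of feature 1 share a row with a NaN of feature 0.
X_demo = np.array([[np.nan, np.nan], [1.0, 0.0], [1.0, 1.0]])
with_break = any_isnan_axis0_with_break(X_demo)
without_break = any_isnan_axis0_without_break(X_demo)
reference = np.isnan(X_demo).any(axis=0)
```

On this data the variant with the break misses feature 1, while the variant without it agrees with np.isnan(X_demo).any(axis=0).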

Steps/Code to Reproduce

import numpy as np
from numpy import nan
from sklearn.tree import DecisionTreeRegressor

X = np.array([
    [nan, nan],
    [1, 0],
    [1, 1]
])

y = np.array([0, 0, 1])

tree = DecisionTreeRegressor(max_depth=1).fit(X, y)
# With the best split, both child nodes of the root are pure (impurity 0).
assert (tree.tree_.impurity[1:] == 0).all()

Expected Results

The best split is found, so no assertion error is raised.

The best split is X[:, 1] < 0.5, with missing values sent to the left node.
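
The expected split can be checked by hand with NumPy alone. The sketch below does not call the tree code. It only computes the variance of y in each child for the split on feature 1 at 0.5, once with missing values sent left and once with them sent right. The variable names are illustrative.

```python
import numpy as np

X_check = np.array([[np.nan, np.nan], [1.0, 0.0], [1.0, 1.0]])
y_check = np.array([0.0, 0.0, 1.0])

feature = X_check[:, 1]
missing = np.isnan(feature)
below = feature < 0.5  # comparisons with NaN evaluate to False

# Missing values sent to the left child.
left_mask = below | missing
var_left_nan_left = y_check[left_mask].var()
var_right_nan_left = y_check[~left_mask].var()

# Missing values sent to the right child, for comparison.
left_mask_alt = below & ~missing
var_left_nan_right = y_check[left_mask_alt].var()
var_right_nan_right = y_check[~left_mask_alt].var()
```

With missing values on the left, both children have zero variance. With missing values on the right, the right child mixes targets 0 and 1 and has variance 0.25, so only the first option gives two pure children.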

Actual Results

An AssertionError is raised.

Versions

System:
    python: 3.12.11 (main, Aug 18 2025, 19:19:11) [Clang 20.1.4 ]
executable: /home/arthur/dev-perso/scikit-learn/sklearn-env/bin/python
   machine: Linux-6.14.0-37-generic-x86_64-with-glibc2.39

Python dependencies:
      sklearn: 1.9.dev0
          pip: None
   setuptools: 80.9.0
        numpy: 2.3.5
        scipy: 1.16.3
       Cython: 3.2.1
       pandas: 2.3.3
   matplotlib: 3.10.7
       joblib: 1.5.2
threadpoolctl: 3.6.0

Built with OpenMP: True

threadpoolctl info:
       user_api: blas
   internal_api: openblas
    num_threads: 16
         prefix: libscipy_openblas
       filepath: /home/arthur/dev-perso/scikit-learn/sklearn-env/lib/python3.12/site-packages/numpy.libs/libscipy_openblas64_-fdde5778.so
        version: 0.3.30
threading_layer: pthreads
   architecture: Haswell

       user_api: blas
   internal_api: openblas
    num_threads: 16
         prefix: libscipy_openblas
       filepath: /home/arthur/dev-perso/scikit-learn/sklearn-env/lib/python3.12/site-packages/scipy.libs/libscipy_openblas-b75cc656.so
        version: 0.3.29.dev
threading_layer: pthreads
   architecture: Haswell

       user_api: openmp
   internal_api: openmp
    num_threads: 16
         prefix: libgomp
       filepath: /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0
        version: None
