Skip to content

Using NearestNeighbors with p < 1 and floats raises an error  #26548

@catanzaromj

Description

@catanzaromj

Describe the bug

Using NearestNeighbors with p < 1 raises an error if the array X contains floats. It does not seem to raise errors if X consists of integers.

This was originally discussed in #26536

For example, this is fine:

from sklearn.neighbors import NearestNeighbors
import numpy as np
X = np.array([[1,0], [0,0], [0,1]])
neigh = NearestNeighbors(algorithm='brute',metric_params={'p':0.5})
neigh.fit(X)
neigh.radius_neighbors(X[0].reshape(1,-1), radius=4, return_distance=False)

I would expect this behavior whether X consists of floats or integers.

Steps/Code to Reproduce

from sklearn.neighbors import NearestNeighbors
import numpy as np
X = np.array([[1.0,0.0], [0.0,0.0], [0.0,1.0]])
neigh = NearestNeighbors(algorithm='brute',metric_params={'p':0.5})
neigh.fit(X)
neigh.radius_neighbors(X[0].reshape(1,-1), radius=4, return_distance=False)

Expected Results

array([array([0, 1, 2])], dtype=object)

Actual Results

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[40], line 6
      4 neigh = NearestNeighbors(algorithm='brute',metric_params={'p':0.5})
      5 neigh.fit(X)
----> 6 neigh.radius_neighbors(X[0].reshape(1,-1), radius=4, return_distance=False)

File ~/miniconda3/envs/dimmer_env/lib/python3.9/site-packages/sklearn/neighbors/_base.py:1161, in RadiusNeighborsMixin.radius_neighbors(self, X, radius, return_distance, sort_results)
   1153 use_pairwise_distances_reductions = (
   1154     self._fit_method == "brute"
   1155     and RadiusNeighbors.is_usable_for(
   1156         X if X is not None else self._fit_X, self._fit_X, self.effective_metric_
   1157     )
   1158 )
   1160 if use_pairwise_distances_reductions:
-> 1161     results = RadiusNeighbors.compute(
   1162         X=X,
   1163         Y=self._fit_X,
   1164         radius=radius,
   1165         metric=self.effective_metric_,
   1166         metric_kwargs=self.effective_metric_params_,
   1167         strategy="auto",
   1168         return_distance=return_distance,
   1169         sort_results=sort_results,
   1170     )
   1172 elif (
   1173     self._fit_method == "brute" and self.metric == "precomputed" and issparse(X)
   1174 ):
   1175     results = _radius_neighbors_from_graph(
   1176         X, radius=radius, return_distance=return_distance
   1177     )

File ~/miniconda3/envs/dimmer_env/lib/python3.9/site-packages/sklearn/metrics/_pairwise_distances_reduction/_dispatcher.py:421, in RadiusNeighbors.compute(cls, X, Y, radius, metric, chunk_size, metric_kwargs, strategy, return_distance, sort_results)
    335 """Return the results of the reduction for the given arguments.
    336 
    337 Parameters
   (...)
    418 returns.
    419 """
    420 if X.dtype == Y.dtype == np.float64:
--> 421     return RadiusNeighbors64.compute(
    422         X=X,
    423         Y=Y,
    424         radius=radius,
    425         metric=metric,
    426         chunk_size=chunk_size,
    427         metric_kwargs=metric_kwargs,
    428         strategy=strategy,
    429         sort_results=sort_results,
    430         return_distance=return_distance,
    431     )
    433 if X.dtype == Y.dtype == np.float32:
    434     return RadiusNeighbors32.compute(
    435         X=X,
    436         Y=Y,
   (...)
    443         return_distance=return_distance,
    444     )

File sklearn/metrics/_pairwise_distances_reduction/_radius_neighbors.pyx:110, in sklearn.metrics._pairwise_distances_reduction._radius_neighbors.RadiusNeighbors64.compute()

File sklearn/metrics/_pairwise_distances_reduction/_datasets_pair.pyx:87, in sklearn.metrics._pairwise_distances_reduction._datasets_pair.DatasetsPair64.get_for()

File sklearn/metrics/_dist_metrics.pyx:285, in sklearn.metrics._dist_metrics.DistanceMetric.get_metric()

File sklearn/metrics/_dist_metrics.pyx:1252, in sklearn.metrics._dist_metrics.MinkowskiDistance.__init__()

ValueError: p must be greater than 1

Versions

System:
    python: 3.9.16 (main, Mar  8 2023, 14:00:05)  [GCC 11.2.0]
executable: /home/XYX/miniconda3/envs/dimmer_env/bin/python
   machine: Linux-5.15.0-69-generic-x86_64-with-glibc2.31

Python dependencies:
      sklearn: 1.2.2
          pip: 23.0.1
   setuptools: 67.8.0
        numpy: 1.24.3
        scipy: 1.10.1
       Cython: None
       pandas: 2.0.1
   matplotlib: 3.7.1
       joblib: 1.2.0
threadpoolctl: 3.1.0

Built with OpenMP: True

threadpoolctl info:
       user_api: blas
   internal_api: openblas
         prefix: libopenblas
       filepath: /home/XYX/miniconda3/envs/dimmer_env/lib/python3.9/site-packages/numpy.libs/libopenblas64_p-r0-15028c96.3.21.so
        version: 0.3.21
threading_layer: pthreads
   architecture: Zen
    num_threads: 48

       user_api: blas
   internal_api: openblas
         prefix: libopenblas
       filepath: /home/XYX/miniconda3/envs/dimmer_env/lib/python3.9/site-packages/scipy.libs/libopenblasp-r0-41284840.3.18.so
        version: 0.3.18
threading_layer: pthreads
   architecture: Zen
    num_threads: 48

       user_api: openmp
   internal_api: openmp
         prefix: libgomp
       filepath: /home/XYX/miniconda3/envs/dimmer_env/lib/python3.9/site-packages/scikit_learn.libs/libgomp-a34b3233.so.1.0.0
        version: None
    num_threads: 48

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions