dbscan uses large amount of ram #26726
Closed
Description
Describe the bug
I'm using sklearn version 1.1.2. In the following code, dbscan uses about 15GB of memory, while xy itself is only 2.88MB. This can't be right.
Steps/Code to Reproduce
from sklearn.cluster import dbscan
import numpy as np

nclust = 12
cluster_size = 15000

# Build 12 Gaussian blobs of 15000 2-D points each (180000 points in total).
xy = []
for i in range(nclust):
    centre = np.random.uniform(0, 20000, (1, 2))
    cluster = np.random.randn(cluster_size, 2) * 15 + centre
    xy.append(cluster)
xy = np.vstack(xy)

# This call is where memory usage climbs to about 15GB.
dbscan(xy, eps=40, min_samples=10, algorithm='kd_tree', leaf_size=500)
Expected Results
Not sure
Actual Results
15GB of RAM usage by dbscan execution
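A rough way to check where the memory might be going: as far as I understand the Notes in the DBSCAN documentation, the implementation bulk-computes all eps-neighbourhood queries, so memory grows with the number of points times the average number of neighbours per point, not with the size of xy. The sketch below is not the library's internal code. It rebuilds the same kind of data with a fixed seed, counts eps-neighbours for a small random sample using NearestNeighbors.radius_neighbors, and extrapolates to all points. The sample size of 200 and the assumption of one platform-sized integer index per stored neighbour are mine.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Same layout as the report: 12 Gaussian blobs of 15000 points, std 15.
nclust, cluster_size, eps = 12, 15000, 40
centres = rng.uniform(0, 20000, (nclust, 1, 2))
xy = (rng.standard_normal((nclust, cluster_size, 2)) * 15 + centres).reshape(-1, 2)

# Size of the input array itself: 180000 points * 2 coordinates * 8 bytes.
print(f"input: {xy.nbytes / 1e6:.2f} MB")

# Query eps-neighbourhoods for a small random sample only, so this stays cheap.
nn = NearestNeighbors(radius=eps, algorithm="kd_tree", leaf_size=500).fit(xy)
sample = xy[rng.choice(len(xy), size=200, replace=False)]
neighbours = nn.radius_neighbors(sample, return_distance=False)
mean_count = np.mean([len(idx) for idx in neighbours])

# Extrapolate to all points, ASSUMING one integer index per stored neighbour.
est_bytes = mean_count * len(xy) * np.dtype(np.intp).itemsize
print(f"mean neighbours per point: {mean_count:.0f}")
print(f"estimated neighbourhood storage: {est_bytes / 1e9:.1f} GB")
```

Since eps=40 is large compared with the blob standard deviation of 15, most of each 15000-point blob falls inside every point's neighbourhood, so the per-point count should come out on the order of 10^4. That would put the estimate in the same range as the 15GB observed, which suggests the memory use follows from eps and the data density and not from the kd_tree itself.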
Versions
System:
    python: 3.8.10 (default, May 26 2023, 14:05:08) [GCC 9.4.0]
    executable: /home/vcn81216/fileserver_home/python/python3/bin/python
    machine: Linux-5.15.0-75-generic-x86_64-with-glibc2.29

Python dependencies:
    sklearn: 1.1.2
    pip: 23.1.2
    setuptools: 59.1.0
    numpy: 1.23.2
    scipy: 1.9.1
    Cython: 0.29.23
    pandas: 1.4.4
    matplotlib: 3.5.3
    joblib: 1.1.0
    threadpoolctl: 3.1.0

Built with OpenMP: True

threadpoolctl info:
    user_api: blas
    internal_api: openblas
    prefix: libopenblas
    filepath: /mnt/rclsfserv005/users/vcn81216/python/python3/lib/python3.8/site-packages/numpy.libs/libopenblas64_p-r0-742d56dc.3.20.so
    version: 0.3.20
    threading_layer: pthreads
    architecture: SkylakeX
    num_threads: 12

    user_api: openmp
    internal_api: openmp
    prefix: libgomp
    filepath: /mnt/rclsfserv005/users/vcn81216/python/python3/lib/python3.8/site-packages/scikit_learn.libs/libgomp-a34b3233.so.1.0.0
    version: None
    num_threads: 12

    user_api: blas
    internal_api: openblas
    prefix: libopenblas
    filepath: /mnt/rclsfserv005/users/vcn81216/python/python3/lib/python3.8/site-packages/scipy.libs/libopenblasp-r0-9f9f5dbc.3.18.so
    version: 0.3.18
    threading_layer: pthreads
    architecture: SkylakeX
num_threads: 12