Segmentation fault when calculating euclidean_distances for large numbers of rows

With scikit-learn 0.15.2, NumPy 1.9.1, and Python 2.7.8 (on OS X), the following code segfaults:

```
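import numpy
import sklearn.cluster
import sklearn.metrics  # imported explicitly; silhouette_score lives here

numpy.random.seed(1)
X = numpy.random.random((50000, 100))
model = sklearn.cluster.KMeans(n_clusters=3, random_state=1)
model.fit_predict(X)
# The segfault occurs in this call, not in the KMeans fit above
print(sklearn.metrics.silhouette_score(X, model.labels_, metric='euclidean'))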
```

Results in:

```
Segmentation fault: 11
```

With the row count reduced to 30000, the script completes fine. With 40000 rows, it takes a very long time but does not appear to segfault.
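One possible explanation for these thresholds, offered as an assumption rather than a confirmed diagnosis, is that the silhouette computation builds a dense n x n matrix of float64 pairwise distances. The sketch below only estimates how large such a matrix would be for each row count; the helper name `pairwise_matrix_gib` is hypothetical and not part of scikit-learn.

```python
def pairwise_matrix_gib(n_rows, bytes_per_value=8):
    """Size in GiB of a dense n_rows x n_rows matrix (float64 by default)."""
    return n_rows * n_rows * bytes_per_value / float(2 ** 30)


# Row counts taken from the report above
for n in (30000, 40000, 50000):
    print("%d rows -> %.1f GiB" % (n, pairwise_matrix_gib(n)))
```

If that assumption holds, the 50000-row case would need roughly 18.6 GiB for the distance matrix alone, which may exceed available memory. Passing `sample_size` to `silhouette_score` to score a random subset of rows might be a workaround worth trying.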


Segmentation fault when calculating euclidean_distances for large numbers of rows #4197
