Skip to content

[MRG] TST Fix slow test in pairwise.#13433

Merged
qinhanmin2014 merged 1 commit intoscikit-learn:masterfrom
jeremiedbb:fix-slow-pdist-test
Mar 15, 2019
Merged

[MRG] TST Fix slow test in pairwise.#13433
qinhanmin2014 merged 1 commit intoscikit-learn:masterfrom
jeremiedbb:fix-slow-pdist-test

Conversation

@jeremiedbb
Copy link
Copy Markdown
Member

Fixes #13208

I've been able to reproduce locally so it was not a matter of distribution.

The issue was that with the previous working memory, pairwise_distances_chunked made 77 chunks, leading to 77 calls to joblib parallel, re-creating the thread-pool each time. (I guess this is a situation where backend='loky' would be better but should not happen in practice)

Copy link
Copy Markdown
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will limiting memory to 1mb force this to be chunked?

@jnothman
Copy link
Copy Markdown
Member

jnothman commented Mar 12, 2019 via email

@jeremiedbb
Copy link
Copy Markdown
Member Author

Will limiting memory to 1mb force this to be chunked?

Yes, there was 77 chunks before, and 7 or 8 now. Timings for this test are now reduced to a few 100s ms.

@jeremiedbb
Copy link
Copy Markdown
Member Author

I think we should be constructing the Parallel instance in pairwise_distances_chunked

It could be but it wouldn't be that simple. pairwise_distances_chunked calls pairwise_distances, so we would need to make parallel calls of pairwise_distances with n_jobs=1 for each instance of it. It would not be very clean.

However I don't think the purpose of pairwise_distances_chunked is to make lots of small chunks. It's main purpose is to limit memory usage, but I see no use case where you want to limit it to a few Mo. And as soon as your chunks are big enough, the overhead of thread creation/destruction becomes negligible. Also maybe some day loky will have a threading backend with reusable threads :)

@jnothman
Copy link
Copy Markdown
Member

jnothman commented Mar 12, 2019 via email

@qinhanmin2014 qinhanmin2014 merged commit a8d62b1 into scikit-learn:master Mar 15, 2019
xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019
xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019
xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019
koenvandevelde pushed a commit to koenvandevelde/scikit-learn that referenced this pull request Jul 12, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants