Skip to content

Conversation

@SimonHeybrock
Copy link
Member

@SimonHeybrock SimonHeybrock commented Jan 28, 2025

After #3638 reduced some baseline cost, increasing thread counts earlier pays off according to my local benchmarks (group two columns into 2M bins).

Results (blue: before #3638, orange: #3638, green: this)
image
I included numpy.bincount for comparison. It does something much simpler, but it nicely illustrates that our constant (in the number of rows) baseline cost is still quite substantial, significantly affecting timings up to 10M rows or so. I will have another look at this bit as a follow-up task.

After #3637 reduced some baseline cost, increasing thread counts earlier
pays off according to my local benchmarks (group two columns into 2M
bins).
const auto size = std::max(scipp::index(1), da.dims()[dim]);
const auto nthread = size > 10000000 ? 24
: size > 1000000 ? 4
const auto nthread = size > 8000000 ? 24
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these numbers really be hard-coded or should they depend on how many threads are available on the machine?

It could be that the benchmarks don't look as good on a laptop or a visa instance with less than 24 cores.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will not actually make that many threads. Instead, it cuts the input into chunks, which will be processed independently. Scipp's transform automatically uses multi-threading across these, so if you have fewer threads each thread will process multiple chunks. If chunks are too small there is extra cost to this, but I hope at this level it is not relevant.

Regarding thread counts in general, also see #3565 which would probably lead to more gains.

Base automatically changed from reduce-bin-overhead to main January 28, 2025 08:08
@SimonHeybrock SimonHeybrock merged commit 65da844 into main Jan 28, 2025
4 checks passed
@SimonHeybrock SimonHeybrock deleted the tune-group-bin-thread-count branch January 28, 2025 10:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants