@kszucs kszucs commented Jun 12, 2020

I wanted to quickly add a benchmark for the Add function to verify that #7341 introduced no significant regressions.

Before:

---------------------------------------------------------------------------------------
Benchmark                                Time           CPU Iterations UserCounters...
---------------------------------------------------------------------------------------
AddArrayArrayKernel/32768/10000         18 us         18 us      35892 null_percent=0.01 size=32.768k   1.67854GB/s
AddArrayArrayKernel/32768/100           19 us         19 us      37540 null_percent=1 size=32.768k   1.61941GB/s
AddArrayArrayKernel/32768/10            20 us         20 us      37049 null_percent=10 size=32.768k   1.55599GB/s
AddArrayArrayKernel/32768/2             20 us         20 us      35394 null_percent=50 size=32.768k   1.54512GB/s
AddArrayArrayKernel/32768/1             19 us         19 us      37901 null_percent=100 size=32.768k   1.63153GB/s

After:

---------------------------------------------------------------------------------------
Benchmark                                Time           CPU Iterations UserCounters...
---------------------------------------------------------------------------------------
AddArrayArrayKernel/32768/10000         19 us         19 us      36704 null_percent=0.01 size=32.768k   1.64619GB/s
AddArrayArrayKernel/32768/100           18 us         18 us      37194 null_percent=1 size=32.768k   1.67588GB/s
AddArrayArrayKernel/32768/10            18 us         18 us      36341 null_percent=10 size=32.768k   1.65205GB/s
AddArrayArrayKernel/32768/2             18 us         18 us      37502 null_percent=50 size=32.768k     1.662GB/s
AddArrayArrayKernel/32768/1             18 us         18 us      38622 null_percent=100 size=32.768k   1.66593GB/s

cc @wesm

wesm commented Jun 12, 2020

Thanks for working on this. I'll also check the benchmarks on MSVC.

wesm commented Jun 12, 2020

There don't seem to be any issues on MSVC:

https://gist.github.com/wesm/45be57393b2d9186f87faae228f12380/revisions

wesm commented Jun 13, 2020

+1. I fixed a few lingering issues that jumped out at me and will merge this once the build passes.

  bench->Unit(benchmark::kMicrosecond);

- for (const auto size : kMemorySizes) {
+ for (const auto size : {kL1Size, kL2Size}) {

I have a processor with a 22 MB L3 cache, so generating that much random data is quite expensive. If we want to benchmark arrays that big, we should generate a smaller sample of random data and repeat/tile it to make the bigger array.
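
For reference, the tiling idea could be sketched roughly as below. This is only an illustration on a plain `std::vector`, not Arrow's random array generation code. The helper name `MakeTiledRandom`, the value range, and the seed handling are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper (not an Arrow API): produce `total` values while only
// generating `sample_size` random ones, then repeating that sample.
std::vector<int64_t> MakeTiledRandom(std::size_t total, std::size_t sample_size,
                                     uint32_t seed) {
  std::vector<int64_t> out;
  if (total == 0) return out;

  // Clamp the sample so it is non-empty and never larger than the output.
  sample_size = std::max<std::size_t>(1, std::min(sample_size, total));

  // Only this loop pays for random number generation.
  std::mt19937_64 rng(seed);
  std::uniform_int_distribution<int64_t> dist(-1000, 1000);  // arbitrary range
  std::vector<int64_t> sample(sample_size);
  for (auto& v : sample) v = dist(rng);

  // Tile the sample into the output; the final copy may be partial.
  out.reserve(total);
  while (out.size() < total) {
    const std::size_t n = std::min(sample_size, total - out.size());
    out.insert(out.end(), sample.begin(), sample.begin() + n);
  }
  assert(out.size() == total);
  return out;
}
```

One caveat with this approach: the tiled data is periodic, so the sample probably should not be too small, otherwise the benchmark may measure an unrealistically predictable input.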

@wesm wesm closed this in 96279ee Jun 15, 2020

kszucs commented Jun 15, 2020

Thanks!
