Describe the bug
We have a simple benchmark where we add informations about bytes/second and items/second at the end:
https://github.com/apache/arrow/blob/fc0c1a595f08aacaeec0eadd30782ffb4788e9ea/cpp/src/parquet/column_reader_benchmark.cc#L280-L281
Excerpt:
state.SetBytesProcessed(state.iterations() * num_levels * sizeof(int16_t));
state.SetItemsProcessed(state.iterations() * num_levels);
It seems clear from this that the reported bytes_per_second should be exactly 2 * items_per_second.
However, when actually running the benchmark, the reported results do not exactly satisfy this expectation, for example:
ReadLevels_BitPack/MaxLevel:3/NumLevels:8096/BatchSize:1024/LevelRepeatCount:7 1299 ns 1301 ns 538976 bytes_per_second=11.5877G/s items_per_second=6.22113G/s
This led me to add some logging to find out what happens:
ARROW_LOG(INFO) << "iterations: " << state.iterations() << ", expected b/s " << state.iterations() * num_levels * sizeof(int16_t);
state.SetBytesProcessed(state.iterations() * num_levels * sizeof(int16_t));
ARROW_LOG(INFO) << "iterations: " << state.iterations() << ", expected i/s " << state.iterations() * num_levels;
state.SetItemsProcessed(state.iterations() * num_levels);
and the printout is as follows:
/home/antoine/arrow/dev/cpp/src/parquet/column_reader_benchmark.cc:283: iterations: 1, expected b/s 16192
/home/antoine/arrow/dev/cpp/src/parquet/column_reader_benchmark.cc:285: iterations: 1, expected i/s 8096
/home/antoine/arrow/dev/cpp/src/parquet/column_reader_benchmark.cc:283: iterations: 10, expected b/s 161920
/home/antoine/arrow/dev/cpp/src/parquet/column_reader_benchmark.cc:285: iterations: 10, expected i/s 80960
/home/antoine/arrow/dev/cpp/src/parquet/column_reader_benchmark.cc:283: iterations: 100, expected b/s 1619200
/home/antoine/arrow/dev/cpp/src/parquet/column_reader_benchmark.cc:285: iterations: 100, expected i/s 809600
/home/antoine/arrow/dev/cpp/src/parquet/column_reader_benchmark.cc:283: iterations: 1000, expected b/s 16192000
/home/antoine/arrow/dev/cpp/src/parquet/column_reader_benchmark.cc:285: iterations: 1000, expected i/s 8096000
/home/antoine/arrow/dev/cpp/src/parquet/column_reader_benchmark.cc:283: iterations: 10000, expected b/s 161920000
/home/antoine/arrow/dev/cpp/src/parquet/column_reader_benchmark.cc:285: iterations: 10000, expected i/s 80960000
/home/antoine/arrow/dev/cpp/src/parquet/column_reader_benchmark.cc:283: iterations: 100000, expected b/s 1619200000
/home/antoine/arrow/dev/cpp/src/parquet/column_reader_benchmark.cc:285: iterations: 100000, expected i/s 809600000
/home/antoine/arrow/dev/cpp/src/parquet/column_reader_benchmark.cc:283: iterations: 538976, expected b/s 8727099392
/home/antoine/arrow/dev/cpp/src/parquet/column_reader_benchmark.cc:285: iterations: 538976, expected i/s 4363549696
ReadLevels_BitPack/MaxLevel:3/NumLevels:8096/BatchSize:1024/LevelRepeatCount:7 1299 ns 1301 ns 538976 bytes_per_second=11.5877G/s items_per_second=6.22113G/s
It is not obvious how the reported numbers are computed, because bytes_per_second does not match a simple computation from the last iterations run:
>>> 8727099392 / 538976 / 1299e-9
12464973056.197075 # does not match 11.5877G/s
>>> 8727099392 / 538976/ 1301e-9
12445810914.681015 # does not match 11.5877G/s either
A similarly computed items_per_second seems closer but does not match the reported number either:
>>> 4363549696 / 538976 / 1299e-9
6232486528.098537 # close to 6.22113G/s, but not quite
>>> 4363549696 / 538976 / 1301e-9
6222905457.3405075 # ditto
System
Which OS, compiler, and compiler version are you using:
- OS: Ubuntu 22.04
- Compiler and version: gcc 12.3.0
Expected behavior
I expect at least bytes_per_second and items_per_second to respect the linear relationship following from the values given to SetBytesProcessed and SetItemsProcessed. The lack of consistency between reported numbers makes it more complicated to interpret benchmark results as well.
Describe the bug
We have a simple benchmark where we add informations about bytes/second and items/second at the end:
https://github.com/apache/arrow/blob/fc0c1a595f08aacaeec0eadd30782ffb4788e9ea/cpp/src/parquet/column_reader_benchmark.cc#L280-L281
Excerpt:
It seems clear from this that the reported
bytes_per_secondshould be exactly2 * items_per_second.However, when actually running the benchmark, the reported results do not exactly satisfy this expectation, for example:
This led me to add some logging to find out what happens:
and the printout is as follows:
It is not obvious how the reported numbers are computed, because
bytes_per_seconddoes not match a simple computation from the last iterations run:A similarly computed
items_per_secondseems closer but does not match the reported number either:System
Which OS, compiler, and compiler version are you using:
Expected behavior
I expect at least
bytes_per_secondanditems_per_secondto respect the linear relationship following from the values given toSetBytesProcessedandSetItemsProcessed. The lack of consistency between reported numbers makes it more complicated to interpret benchmark results as well.