Skip to content

Conversation

@wesm
Copy link
Member

@wesm wesm commented Nov 28, 2017

I also fixed a bug this surfaced in the hash table resize (unit test coverage was not adequate)

Now we have

$ ./release/compute-benchmark 
Run on (8 X 4200.16 MHz CPU s)
2017-11-28 18:33:53
Benchmark                                                           Time           CPU Iterations
-------------------------------------------------------------------------------------------------
BM_BuildDictionary/min_time:1.000                                1352 us       1352 us       1038   2.88639GB/s
BM_BuildStringDictionary/min_time:1.000                          3994 us       3994 us        351   75.5809MB/s
BM_UniqueInt64NoNulls/16M/50/min_time:1.000/real_time           35814 us      35816 us         39   3.49023GB/s
BM_UniqueInt64NoNulls/16M/1024/min_time:1.000/real_time        119656 us     119660 us         12   1069.73MB/s
BM_UniqueInt64NoNulls/16M/10k/min_time:1.000/real_time         174924 us     174930 us          8   731.747MB/s
BM_UniqueInt64NoNulls/16M/1024k/min_time:1.000/real_time       448425 us     448440 us          3   285.443MB/s
BM_UniqueInt64WithNulls/16M/50/min_time:1.000/real_time         49511 us      49513 us         29   2.52468GB/s
BM_UniqueInt64WithNulls/16M/1024/min_time:1.000/real_time      134519 us     134523 us         10   951.541MB/s
BM_UniqueInt64WithNulls/16M/10k/min_time:1.000/real_time       191331 us     191336 us          7   668.999MB/s
BM_UniqueInt64WithNulls/16M/1024k/min_time:1.000/real_time     533597 us     533613 us          3   239.882MB/s
BM_UniqueString10bytes/16M/50/min_time:1.000/real_time         150731 us     150736 us          9    1061.5MB/s
BM_UniqueString10bytes/16M/1024/min_time:1.000/real_time       256929 us     256938 us          5   622.739MB/s
BM_UniqueString10bytes/16M/10k/min_time:1.000/real_time        414412 us     414426 us          3    386.09MB/s
BM_UniqueString10bytes/16M/1024k/min_time:1.000/real_time     1744253 us    1744308 us          1   91.7298MB/s
BM_UniqueString100bytes/16M/50/min_time:1.000/real_time        563890 us     563909 us          2   2.77093GB/s
BM_UniqueString100bytes/16M/1024/min_time:1.000/real_time      704695 us     704720 us          2   2.21727GB/s
BM_UniqueString100bytes/16M/10k/min_time:1.000/real_time       995685 us     995721 us          2   1.56927GB/s
BM_UniqueString100bytes/16M/1024k/min_time:1.000/real_time    3584108 us    3584230 us          1   446.415MB/s

We can also refactor the hash table implementations without worrying too much about whether we're making things slower

wesm added 3 commits November 28, 2017 18:22
Change-Id: I0c2eb14f1cd8c63a79fe2a3da308c76ac19a7384
Change-Id: I0ff68d4f28fc8ccf7b2f4f5a4b8cc7e5c03aa717
Change-Id: I3e32cf5542c1eb6173eb47d624d43893008ee0ee
@wesm
Copy link
Member Author

wesm commented Nov 28, 2017

Using 0.5 for the load factor threshold for resizing makes things a lot faster for smaller cardinality tables, with minimal impact on large cardinality tables:

$ ./release/compute-benchmark 
Run on (8 X 4185.31 MHz CPU s)
2017-11-28 18:38:38
Benchmark                                                           Time           CPU Iterations
-------------------------------------------------------------------------------------------------
BM_BuildDictionary/min_time:1.000                                1328 us       1328 us       1083   2.93863GB/s
BM_BuildStringDictionary/min_time:1.000                          3143 us       3143 us        446   96.0677MB/s
BM_UniqueInt64NoNulls/16M/50/min_time:1.000/real_time           35761 us      35762 us         39   3.49545GB/s
BM_UniqueInt64NoNulls/16M/1024/min_time:1.000/real_time         69412 us      69414 us         20   1.80085GB/s
BM_UniqueInt64NoNulls/16M/10k/min_time:1.000/real_time          97227 us      97231 us         14   1.28565GB/s
BM_UniqueInt64NoNulls/16M/1024k/min_time:1.000/real_time       457800 us     457806 us          3   279.598MB/s
BM_UniqueInt64WithNulls/16M/50/min_time:1.000/real_time         48785 us      48786 us         29   2.56228GB/s
BM_UniqueInt64WithNulls/16M/1024/min_time:1.000/real_time       82978 us      82981 us         17   1.50642GB/s
BM_UniqueInt64WithNulls/16M/10k/min_time:1.000/real_time       111961 us     111965 us         13   1.11646GB/s
BM_UniqueInt64WithNulls/16M/1024k/min_time:1.000/real_time     531226 us     531244 us          3   240.952MB/s
BM_UniqueString10bytes/16M/50/min_time:1.000/real_time         150719 us     150724 us          9   1061.58MB/s
BM_UniqueString10bytes/16M/1024/min_time:1.000/real_time       193408 us     193413 us          7   827.266MB/s
BM_UniqueString10bytes/16M/10k/min_time:1.000/real_time        278841 us     278851 us          5   573.803MB/s
BM_UniqueString10bytes/16M/1024k/min_time:1.000/real_time     1700923 us    1700978 us          1   94.0666MB/s
BM_UniqueString100bytes/16M/50/min_time:1.000/real_time        563184 us     563204 us          2    2.7744GB/s
BM_UniqueString100bytes/16M/1024/min_time:1.000/real_time      636416 us     636436 us          2   2.45515GB/s
BM_UniqueString100bytes/16M/10k/min_time:1.000/real_time       861909 us     861941 us          2   1.81284GB/s
BM_UniqueString100bytes/16M/1024k/min_time:1.000/real_time    3651238 us    3651359 us          1   438.208MB/s

@wesm
Copy link
Member Author

wesm commented Nov 29, 2017

+1, we can try different hash functions in a follow-up PR

@wesm wesm closed this in aa3328d Nov 29, 2017
wesm added a commit that referenced this pull request Dec 1, 2017
…ength strings

I also fixed a bug this surfaced in the hash table resize (unit test coverage was not adequate)

Now we have

```
$ ./release/compute-benchmark
Run on (8 X 4200.16 MHz CPU s)
2017-11-28 18:33:53
Benchmark                                                           Time           CPU Iterations
-------------------------------------------------------------------------------------------------
BM_BuildDictionary/min_time:1.000                                1352 us       1352 us       1038   2.88639GB/s
BM_BuildStringDictionary/min_time:1.000                          3994 us       3994 us        351   75.5809MB/s
BM_UniqueInt64NoNulls/16M/50/min_time:1.000/real_time           35814 us      35816 us         39   3.49023GB/s
BM_UniqueInt64NoNulls/16M/1024/min_time:1.000/real_time        119656 us     119660 us         12   1069.73MB/s
BM_UniqueInt64NoNulls/16M/10k/min_time:1.000/real_time         174924 us     174930 us          8   731.747MB/s
BM_UniqueInt64NoNulls/16M/1024k/min_time:1.000/real_time       448425 us     448440 us          3   285.443MB/s
BM_UniqueInt64WithNulls/16M/50/min_time:1.000/real_time         49511 us      49513 us         29   2.52468GB/s
BM_UniqueInt64WithNulls/16M/1024/min_time:1.000/real_time      134519 us     134523 us         10   951.541MB/s
BM_UniqueInt64WithNulls/16M/10k/min_time:1.000/real_time       191331 us     191336 us          7   668.999MB/s
BM_UniqueInt64WithNulls/16M/1024k/min_time:1.000/real_time     533597 us     533613 us          3   239.882MB/s
BM_UniqueString10bytes/16M/50/min_time:1.000/real_time         150731 us     150736 us          9    1061.5MB/s
BM_UniqueString10bytes/16M/1024/min_time:1.000/real_time       256929 us     256938 us          5   622.739MB/s
BM_UniqueString10bytes/16M/10k/min_time:1.000/real_time        414412 us     414426 us          3    386.09MB/s
BM_UniqueString10bytes/16M/1024k/min_time:1.000/real_time     1744253 us    1744308 us          1   91.7298MB/s
BM_UniqueString100bytes/16M/50/min_time:1.000/real_time        563890 us     563909 us          2   2.77093GB/s
BM_UniqueString100bytes/16M/1024/min_time:1.000/real_time      704695 us     704720 us          2   2.21727GB/s
BM_UniqueString100bytes/16M/10k/min_time:1.000/real_time       995685 us     995721 us          2   1.56927GB/s
BM_UniqueString100bytes/16M/1024k/min_time:1.000/real_time    3584108 us    3584230 us          1   446.415MB/s
```

We can also refactor the hash table implementations without worrying too much about whether we're making things slower

Author: Wes McKinney <wes.mckinney@twosigma.com>

Closes #1370 from wesm/ARROW-1844 and squashes the following commits:

638f1a1 [Wes McKinney] Decrease resize load factor to 0.5
2885c64 [Wes McKinney] Multiply bytes processed by state.iterations()
f7b3619 [Wes McKinney] Add initial Unique benchmarks for int64, strings
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant