colexec: add distinct mode to hashTable#45704
Conversation
c6942b9 to
edc6951
Compare
Azhng
left a comment
There was a problem hiding this comment.
Benchmark:
benchstat ~/.local/tmp/master.c10.benchmark ~/.local/tmp/dht.c10.benchmark
name old time/op new time/op delta
UnorderedDistinct/numCols=1/nulls=false/numBatches=1-8 56.0µs ±15% 54.7µs ± 3% ~ (p=0.143 n=10+10)
UnorderedDistinct/numCols=1/nulls=false/numBatches=16-8 951µs ± 2% 1238µs ± 2% +30.17% (p=0.000 n=10+9)
UnorderedDistinct/numCols=1/nulls=false/numBatches=256-8 24.9ms ± 2% 34.1ms ± 2% +37.02% (p=0.000 n=9+9)
UnorderedDistinct/numCols=1/nulls=true/numBatches=1-8 90.2µs ± 1% 73.5µs ± 1% -18.51% (p=0.000 n=7+9)
UnorderedDistinct/numCols=1/nulls=true/numBatches=16-8 1.38ms ± 1% 1.29ms ± 3% -6.51% (p=0.000 n=10+9)
UnorderedDistinct/numCols=1/nulls=true/numBatches=256-8 31.1ms ± 3% 27.4ms ± 3% -11.84% (p=0.000 n=9+9)
UnorderedDistinct/numCols=2/nulls=false/numBatches=1-8 80.8µs ± 7% 74.2µs ±10% -8.10% (p=0.006 n=8+10)
UnorderedDistinct/numCols=2/nulls=false/numBatches=16-8 1.32ms ± 2% 1.77ms ±14% +34.44% (p=0.000 n=8+10)
UnorderedDistinct/numCols=2/nulls=false/numBatches=256-8 41.2ms ± 2% 54.8ms ± 1% +33.06% (p=0.000 n=10+9)
UnorderedDistinct/numCols=2/nulls=true/numBatches=1-8 144µs ± 2% 128µs ± 2% -11.44% (p=0.000 n=9+9)
UnorderedDistinct/numCols=2/nulls=true/numBatches=16-8 2.33ms ± 3% 2.34ms ± 4% ~ (p=0.842 n=9+10)
UnorderedDistinct/numCols=2/nulls=true/numBatches=256-8 61.3ms ± 6% 57.6ms ± 4% -6.02% (p=0.000 n=10+9)
name old speed new speed delta
UnorderedDistinct/numCols=1/nulls=false/numBatches=1-8 148MB/s ±16% 150MB/s ± 3% ~ (p=0.143 n=10+10)
UnorderedDistinct/numCols=1/nulls=false/numBatches=16-8 138MB/s ± 2% 106MB/s ± 2% -23.18% (p=0.000 n=10+9)
UnorderedDistinct/numCols=1/nulls=false/numBatches=256-8 84.4MB/s ± 2% 61.6MB/s ± 1% -27.02% (p=0.000 n=9+9)
UnorderedDistinct/numCols=1/nulls=true/numBatches=1-8 90.9MB/s ± 1% 111.5MB/s ± 1% +22.72% (p=0.000 n=7+9)
UnorderedDistinct/numCols=1/nulls=true/numBatches=16-8 94.6MB/s ± 1% 101.2MB/s ± 3% +6.98% (p=0.000 n=10+9)
UnorderedDistinct/numCols=1/nulls=true/numBatches=256-8 67.5MB/s ± 3% 76.6MB/s ± 3% +13.42% (p=0.000 n=9+9)
UnorderedDistinct/numCols=2/nulls=false/numBatches=1-8 203MB/s ± 6% 221MB/s ±10% +9.03% (p=0.006 n=8+10)
UnorderedDistinct/numCols=2/nulls=false/numBatches=16-8 199MB/s ± 2% 149MB/s ±13% -25.27% (p=0.000 n=8+10)
UnorderedDistinct/numCols=2/nulls=false/numBatches=256-8 102MB/s ± 2% 77MB/s ± 1% -24.85% (p=0.000 n=10+9)
UnorderedDistinct/numCols=2/nulls=true/numBatches=1-8 114MB/s ± 2% 128MB/s ± 2% +12.92% (p=0.000 n=9+9)
UnorderedDistinct/numCols=2/nulls=true/numBatches=16-8 112MB/s ± 3% 112MB/s ± 4% ~ (p=0.842 n=9+10)
UnorderedDistinct/numCols=2/nulls=true/numBatches=256-8 68.5MB/s ± 6% 72.8MB/s ± 4% +6.34% (p=0.000 n=10+9)
Reviewable status:
complete! 0 of 0 LGTMs obtained
a1d78a5 to
84d2387
Compare
Azhng
left a comment
There was a problem hiding this comment.
Rebased to master to pick up new benchmark, the number looks quite good
benchstat ~/.local/tmp/master.new.c10.benchmark ~/.local/tmp/dht.new.c10.benchmark
name old time/op new time/op delta
Distinct/Unordered/newTupleProbability=0.001/rows=4096/cols=2/ordCols=0-8 290µs ± 1% 291µs ± 2% ~ (p=0.780 n=9+10)
Distinct/Unordered/newTupleProbability=0.001/rows=4096/cols=4/ordCols=0-8 429µs ± 2% 359µs ± 1% -16.22% (p=0.000 n=10+9)
Distinct/Unordered/newTupleProbability=0.001/rows=65536/cols=2/ordCols=0-8 3.46ms ± 1% 1.90ms ± 1% -45.00% (p=0.000 n=10+10)
Distinct/Unordered/newTupleProbability=0.001/rows=65536/cols=4/ordCols=0-8 5.54ms ± 1% 2.78ms ± 2% -49.90% (p=0.000 n=8+10)
Distinct/Unordered/newTupleProbability=0.010/rows=4096/cols=2/ordCols=0-8 291µs ± 1% 291µs ± 1% ~ (p=0.739 n=10+10)
Distinct/Unordered/newTupleProbability=0.010/rows=4096/cols=4/ordCols=0-8 417µs ± 1% 363µs ± 2% -13.10% (p=0.000 n=9+9)
Distinct/Unordered/newTupleProbability=0.010/rows=65536/cols=2/ordCols=0-8 3.82ms ± 3% 1.95ms ± 2% -48.88% (p=0.000 n=10+9)
Distinct/Unordered/newTupleProbability=0.010/rows=65536/cols=4/ordCols=0-8 6.75ms ± 1% 2.86ms ± 1% -57.57% (p=0.000 n=10+8)
Distinct/Unordered/newTupleProbability=0.100/rows=4096/cols=2/ordCols=0-8 300µs ± 1% 311µs ± 2% +3.73% (p=0.000 n=10+10)
Distinct/Unordered/newTupleProbability=0.100/rows=4096/cols=4/ordCols=0-8 439µs ± 9% 395µs ± 1% -10.13% (p=0.000 n=10+9)
Distinct/Unordered/newTupleProbability=0.100/rows=65536/cols=2/ordCols=0-8 4.27ms ± 4% 2.37ms ± 1% -44.54% (p=0.000 n=10+9)
Distinct/Unordered/newTupleProbability=0.100/rows=65536/cols=4/ordCols=0-8 6.81ms ± 5% 3.42ms ± 1% -49.86% (p=0.000 n=10+10)
name old speed new speed delta
Distinct/Unordered/newTupleProbability=0.001/rows=4096/cols=2/ordCols=0-8 226MB/s ± 1% 226MB/s ± 2% ~ (p=0.780 n=9+10)
Distinct/Unordered/newTupleProbability=0.001/rows=4096/cols=4/ordCols=0-8 306MB/s ± 2% 365MB/s ± 1% +19.35% (p=0.000 n=10+9)
Distinct/Unordered/newTupleProbability=0.001/rows=65536/cols=2/ordCols=0-8 303MB/s ± 1% 552MB/s ± 1% +81.82% (p=0.000 n=10+10)
Distinct/Unordered/newTupleProbability=0.001/rows=65536/cols=4/ordCols=0-8 378MB/s ± 1% 755MB/s ± 2% +99.62% (p=0.000 n=8+10)
Distinct/Unordered/newTupleProbability=0.010/rows=4096/cols=2/ordCols=0-8 225MB/s ± 1% 225MB/s ± 1% ~ (p=0.739 n=10+10)
Distinct/Unordered/newTupleProbability=0.010/rows=4096/cols=4/ordCols=0-8 314MB/s ± 1% 362MB/s ± 2% +15.07% (p=0.000 n=9+9)
Distinct/Unordered/newTupleProbability=0.010/rows=65536/cols=2/ordCols=0-8 275MB/s ± 3% 537MB/s ± 2% +95.58% (p=0.000 n=10+9)
Distinct/Unordered/newTupleProbability=0.010/rows=65536/cols=4/ordCols=0-8 311MB/s ± 1% 732MB/s ± 1% +135.67% (p=0.000 n=10+8)
Distinct/Unordered/newTupleProbability=0.100/rows=4096/cols=2/ordCols=0-8 219MB/s ± 1% 211MB/s ± 2% -3.59% (p=0.000 n=10+10)
Distinct/Unordered/newTupleProbability=0.100/rows=4096/cols=4/ordCols=0-8 299MB/s ± 9% 331MB/s ± 3% +10.79% (p=0.000 n=10+10)
Distinct/Unordered/newTupleProbability=0.100/rows=65536/cols=2/ordCols=0-8 245MB/s ± 4% 442MB/s ± 1% +80.23% (p=0.000 n=10+9)
Distinct/Unordered/newTupleProbability=0.100/rows=65536/cols=4/ordCols=0-8 308MB/s ± 4% 614MB/s ± 1% +99.36% (p=0.000 n=10+10)
name old alloc/op new alloc/op delta
Distinct/Unordered/newTupleProbability=0.001/rows=4096/cols=2/ordCols=0-8 932kB ± 0% 1164kB ± 0% +24.97% (p=0.000 n=9+10)
Distinct/Unordered/newTupleProbability=0.001/rows=4096/cols=4/ordCols=0-8 1.16MB ± 0% 1.20MB ± 0% +3.01% (p=0.000 n=9+8)
Distinct/Unordered/newTupleProbability=0.001/rows=65536/cols=2/ordCols=0-8 8.17MB ± 0% 1.17MB ± 0% -85.68% (p=0.000 n=10+8)
Distinct/Unordered/newTupleProbability=0.001/rows=65536/cols=4/ordCols=0-8 14.0MB ± 0% 1.2MB ± 0% -91.34% (p=0.000 n=8+9)
Distinct/Unordered/newTupleProbability=0.010/rows=4096/cols=2/ordCols=0-8 932kB ± 0% 1166kB ± 0% +25.14% (p=0.000 n=8+8)
Distinct/Unordered/newTupleProbability=0.010/rows=4096/cols=4/ordCols=0-8 1.16MB ± 0% 1.20MB ± 0% +3.34% (p=0.000 n=6+9)
Distinct/Unordered/newTupleProbability=0.010/rows=65536/cols=2/ordCols=0-8 8.17MB ± 0% 1.20MB ± 0% -85.29% (p=0.000 n=8+9)
Distinct/Unordered/newTupleProbability=0.010/rows=65536/cols=4/ordCols=0-8 14.0MB ± 0% 1.3MB ± 0% -90.92% (p=0.000 n=8+8)
Distinct/Unordered/newTupleProbability=0.100/rows=4096/cols=2/ordCols=0-8 932kB ± 0% 1181kB ± 0% +26.70% (p=0.000 n=8+9)
Distinct/Unordered/newTupleProbability=0.100/rows=4096/cols=4/ordCols=0-8 1.16MB ± 0% 1.23MB ± 0% +5.84% (p=0.000 n=7+8)
Distinct/Unordered/newTupleProbability=0.100/rows=65536/cols=2/ordCols=0-8 8.17MB ± 0% 1.77MB ± 0% -78.27% (p=0.000 n=8+10)
Distinct/Unordered/newTupleProbability=0.100/rows=65536/cols=4/ordCols=0-8 14.0MB ± 0% 2.2MB ± 0% -83.99% (p=0.000 n=9+10)
name old allocs/op new allocs/op delta
Distinct/Unordered/newTupleProbability=0.001/rows=4096/cols=2/ordCols=0-8 102 ± 0% 76 ± 0% -25.49% (p=0.000 n=10+10)
Distinct/Unordered/newTupleProbability=0.001/rows=4096/cols=4/ordCols=0-8 168 ± 0% 108 ± 0% -35.71% (p=0.000 n=10+10)
Distinct/Unordered/newTupleProbability=0.001/rows=65536/cols=2/ordCols=0-8 622 ± 0% 200 ± 0% -67.85% (p=0.000 n=10+10)
Distinct/Unordered/newTupleProbability=0.001/rows=65536/cols=4/ordCols=0-8 1.21k ± 0% 0.37k ± 0% -69.21% (p=0.000 n=10+10)
Distinct/Unordered/newTupleProbability=0.010/rows=4096/cols=2/ordCols=0-8 102 ± 0% 80 ± 0% -21.57% (p=0.000 n=10+10)
Distinct/Unordered/newTupleProbability=0.010/rows=4096/cols=4/ordCols=0-8 168 ± 0% 124 ± 0% -26.19% (p=0.000 n=10+10)
Distinct/Unordered/newTupleProbability=0.010/rows=65536/cols=2/ordCols=0-8 622 ± 0% 216 ± 0% -65.27% (p=0.000 n=10+10)
Distinct/Unordered/newTupleProbability=0.010/rows=65536/cols=4/ordCols=0-8 1.21k ± 0% 0.40k ± 0% -67.22% (p=0.000 n=10+10)
Distinct/Unordered/newTupleProbability=0.100/rows=4096/cols=2/ordCols=0-8 102 ± 0% 90 ± 0% -11.76% (p=0.000 n=10+10)
Distinct/Unordered/newTupleProbability=0.100/rows=4096/cols=4/ordCols=0-8 168 ± 0% 144 ± 0% -14.29% (p=0.000 n=10+10)
Distinct/Unordered/newTupleProbability=0.100/rows=65536/cols=2/ordCols=0-8 622 ± 0% 239 ± 0% -61.58% (p=0.000 n=10+10)
Distinct/Unordered/newTupleProbability=0.100/rows=65536/cols=4/ordCols=0-8 1.21k ± 0% 0.44k ± 0% -63.82% (p=0.000 n=10+10)
Reviewable status:
complete! 0 of 0 LGTMs obtained
b309846 to
277f414
Compare
asubiotto
left a comment
There was a problem hiding this comment.
Great to see this! Do those benchmark numbers include nulls? It would also be nice to see the benchmarks of the HashJoiner, just to make sure there are no regressions.
Reviewable status:
complete! 0 of 0 LGTMs obtained
Azhng
left a comment
There was a problem hiding this comment.
I don't think the new benchmark include nulls. But I double checked the benchmark from before, it was actually slower without the nulls. (not exactly sure why). I will modify the test to include nulls
HashJoiner benchmark:
Looks like there's some accident minor performance boosts:
benchstat ~/.local/tmp/master.hj.c10.benchmark ~/.local/tmp/dht.hj.c10.benchmark
name old time/op new time/op delta
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=2048-8 262µs ± 3% 264µs ±13% ~ (p=0.353 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=262144-8 18.8ms ± 1% 18.3ms ± 1% -2.86% (p=0.000 n=10+8)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=4194304-8 259ms ± 3% 254ms ± 1% -1.98% (p=0.000 n=9+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=2048-8 282µs ± 1% 280µs ± 5% ~ (p=0.515 n=8+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=262144-8 25.5ms ± 2% 26.1ms ± 5% ~ (p=0.211 n=9+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=4194304-8 1.07s ± 8% 1.03s ± 3% -4.01% (p=0.004 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=2048-8 273µs ± 2% 268µs ± 3% -1.86% (p=0.046 n=8+9)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=262144-8 20.9ms ±10% 20.3ms ± 6% ~ (p=0.211 n=9+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=4194304-8 285ms ± 5% 280ms ± 2% ~ (p=0.156 n=9+10)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=2048-8 292µs ± 1% 284µs ± 1% -2.71% (p=0.000 n=9+10)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=262144-8 27.2ms ± 2% 27.0ms ± 1% -0.99% (p=0.010 n=10+9)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=4194304-8 1.08s ± 1% 1.09s ± 2% ~ (p=0.905 n=9+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=2048-8 339µs ± 1% 335µs ± 2% -1.07% (p=0.027 n=9+8)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=262144-8 28.9ms ± 3% 28.6ms ± 1% -1.10% (p=0.043 n=10+8)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=4194304-8 422ms ± 1% 414ms ± 1% -1.96% (p=0.000 n=8+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=2048-8 357µs ± 3% 344µs ± 2% -3.63% (p=0.000 n=8+9)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=262144-8 35.5ms ± 6% 34.9ms ± 1% -1.52% (p=0.035 n=9+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=4194304-8 1.24s ± 5% 1.22s ± 1% -1.68% (p=0.008 n=9+8)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=2048-8 350µs ± 1% 347µs ± 1% -0.88% (p=0.000 n=8+8)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=262144-8 30.0ms ± 1% 30.9ms ± 3% +2.85% (p=0.000 n=9+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=4194304-8 447ms ± 2% 442ms ± 2% ~ (p=0.089 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=2048-8 366µs ± 1% 360µs ± 3% -1.60% (p=0.006 n=10+9)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=262144-8 37.0ms ± 2% 37.1ms ± 2% ~ (p=0.549 n=9+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=4194304-8 1.30s ± 1% 1.29s ± 3% ~ (p=0.393 n=10+10)
name old speed new speed delta
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=2048-8 500MB/s ± 3% 498MB/s ±12% ~ (p=0.353 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=262144-8 892MB/s ± 1% 919MB/s ± 1% +2.94% (p=0.000 n=10+8)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=4194304-8 1.04GB/s ± 3% 1.06GB/s ± 1% +2.01% (p=0.000 n=9+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=2048-8 464MB/s ± 1% 469MB/s ± 5% ~ (p=0.515 n=8+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=262144-8 658MB/s ± 2% 644MB/s ± 5% ~ (p=0.211 n=9+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=4194304-8 251MB/s ± 8% 261MB/s ± 3% +4.04% (p=0.004 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=2048-8 480MB/s ± 2% 489MB/s ± 3% +1.90% (p=0.046 n=8+9)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=262144-8 803MB/s ± 9% 826MB/s ± 6% ~ (p=0.211 n=9+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=4194304-8 943MB/s ± 5% 957MB/s ± 2% ~ (p=0.156 n=9+10)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=2048-8 449MB/s ± 1% 462MB/s ± 1% +2.79% (p=0.000 n=9+10)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=262144-8 616MB/s ± 2% 622MB/s ± 1% +1.00% (p=0.010 n=10+9)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=4194304-8 248MB/s ± 1% 247MB/s ± 2% ~ (p=0.905 n=9+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=2048-8 387MB/s ± 1% 391MB/s ± 2% +1.09% (p=0.027 n=9+8)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=262144-8 580MB/s ± 2% 586MB/s ± 1% +1.10% (p=0.043 n=10+8)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=4194304-8 636MB/s ± 1% 648MB/s ± 1% +2.00% (p=0.000 n=8+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=2048-8 367MB/s ± 3% 381MB/s ± 2% +3.75% (p=0.000 n=8+9)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=262144-8 473MB/s ± 6% 480MB/s ± 1% +1.49% (p=0.035 n=9+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=4194304-8 217MB/s ± 5% 221MB/s ± 1% +1.67% (p=0.008 n=9+8)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=2048-8 375MB/s ± 1% 378MB/s ± 1% +0.89% (p=0.000 n=8+8)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=262144-8 558MB/s ± 1% 543MB/s ± 2% -2.76% (p=0.000 n=9+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=4194304-8 600MB/s ± 2% 608MB/s ± 2% ~ (p=0.089 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=2048-8 358MB/s ± 1% 364MB/s ± 3% +1.64% (p=0.006 n=10+9)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=262144-8 454MB/s ± 2% 453MB/s ± 2% ~ (p=0.549 n=9+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=4194304-8 207MB/s ± 1% 208MB/s ± 3% ~ (p=0.393 n=10+10)
name old alloc/op new alloc/op delta
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=2048-8 872kB ± 0% 872kB ± 0% +0.01% (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=262144-8 49.9MB ± 0% 49.9MB ± 0% ~ (p=0.183 n=9+9)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=4194304-8 739MB ± 0% 739MB ± 0% +0.00% (p=0.009 n=9+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=2048-8 893kB ± 0% 893kB ± 0% +0.01% (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=262144-8 52.3MB ± 0% 52.3MB ± 0% ~ (p=0.165 n=10+8)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=4194304-8 777MB ± 0% 777MB ± 0% ~ (p=0.106 n=9+8)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=2048-8 875kB ± 0% 875kB ± 0% +0.01% (p=0.000 n=9+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=262144-8 50.1MB ± 0% 50.1MB ± 0% +0.00% (p=0.000 n=8+9)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=4194304-8 743MB ± 0% 743MB ± 0% ~ (p=0.652 n=9+8)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=2048-8 896kB ± 0% 896kB ± 0% +0.01% (p=0.000 n=7+9)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=262144-8 52.5MB ± 0% 52.5MB ± 0% +0.00% (p=0.000 n=8+8)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=4194304-8 781MB ± 0% 781MB ± 0% +0.00% (p=0.021 n=9+9)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=2048-8 872kB ± 0% 872kB ± 0% +0.01% (p=0.000 n=8+9)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=262144-8 49.9MB ± 0% 49.9MB ± 0% ~ (p=0.164 n=10+8)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=4194304-8 739MB ± 0% 739MB ± 0% +0.00% (p=0.000 n=9+8)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=2048-8 893kB ± 0% 893kB ± 0% +0.01% (p=0.000 n=10+8)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=262144-8 52.3MB ± 0% 52.3MB ± 0% +0.00% (p=0.000 n=9+9)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=4194304-8 777MB ± 0% 777MB ± 0% +0.00% (p=0.013 n=8+8)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=2048-8 875kB ± 0% 875kB ± 0% +0.01% (p=0.000 n=9+8)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=262144-8 50.1MB ± 0% 50.1MB ± 0% +0.00% (p=0.000 n=10+8)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=4194304-8 743MB ± 0% 743MB ± 0% ~ (p=0.098 n=10+9)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=2048-8 896kB ± 0% 896kB ± 0% +0.01% (p=0.001 n=8+6)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=262144-8 52.5MB ± 0% 52.5MB ± 0% +0.00% (p=0.000 n=9+9)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=4194304-8 781MB ± 0% 781MB ± 0% +0.00% (p=0.025 n=9+9)
name old allocs/op new allocs/op delta
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=2048-8 156 ± 0% 157 ± 0% +0.64% (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=262144-8 1.31k ± 0% 1.31k ± 0% +0.08% (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=4194304-8 16.8k ± 0% 16.8k ± 0% +0.01% (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=2048-8 158 ± 0% 159 ± 0% +0.63% (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=262144-8 1.31k ± 0% 1.31k ± 0% +0.08% (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=4194304-8 16.8k ± 0% 16.8k ± 0% ~ (p=0.137 n=10+8)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=2048-8 158 ± 0% 159 ± 0% +0.63% (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=262144-8 1.31k ± 0% 1.31k ± 0% +0.08% (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=4194304-8 16.8k ± 0% 16.8k ± 0% +0.00% (p=0.015 n=10+9)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=2048-8 160 ± 0% 161 ± 0% +0.63% (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=262144-8 1.31k ± 0% 1.31k ± 0% +0.08% (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=4194304-8 16.8k ± 0% 16.8k ± 0% +0.01% (p=0.003 n=10+9)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=2048-8 156 ± 0% 157 ± 0% +0.64% (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=262144-8 1.31k ± 0% 1.31k ± 0% +0.08% (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=4194304-8 16.8k ± 0% 16.8k ± 0% +0.01% (p=0.000 n=9+8)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=2048-8 158 ± 0% 159 ± 0% +0.63% (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=262144-8 1.31k ± 0% 1.31k ± 0% +0.08% (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=4194304-8 16.8k ± 0% 16.8k ± 0% +0.01% (p=0.001 n=9+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=2048-8 158 ± 0% 159 ± 0% +0.63% (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=262144-8 1.31k ± 0% 1.31k ± 0% +0.08% (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=4194304-8 16.8k ± 0% 16.8k ± 0% +0.00% (p=0.023 n=10+8)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=2048-8 160 ± 0% 161 ± 0% +0.63% (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=262144-8 1.31k ± 0% 1.31k ± 0% +0.08% (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=4194304-8 16.8k ± 0% 16.8k ± 0% +0.00% (p=0.007 n=10+9)
Reviewable status:
complete! 0 of 0 LGTMs obtained
Azhng
left a comment
There was a problem hiding this comment.
Added nulls into benchmark,
New Benchmark
benchstat ~/.local/tmp/master.nulls.c10.benchmark ~/.local/tmp/dht.nulls.c10.benchmark
name old time/op new time/op delta
Distinct/Unordered/hasNulls=false/newTupleProbability=0.001/rows=4096/cols=2/ordCols=0-8 294µs ± 1% 286µs ± 1% -2.92% (p=0.000 n=9+9)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.001/rows=4096/cols=4/ordCols=0-8 419µs ± 1% 362µs ± 1% -13.70% (p=0.000 n=9+9)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.001/rows=65536/cols=2/ordCols=0-8 3.50ms ± 2% 1.91ms ± 1% -45.28% (p=0.000 n=9+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.001/rows=65536/cols=4/ordCols=0-8 5.62ms ± 2% 2.84ms ± 3% -49.44% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.010/rows=4096/cols=2/ordCols=0-8 291µs ± 1% 290µs ± 1% ~ (p=0.387 n=9+9)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.010/rows=4096/cols=4/ordCols=0-8 419µs ± 1% 364µs ± 0% -13.11% (p=0.000 n=10+8)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.010/rows=65536/cols=2/ordCols=0-8 4.11ms ± 2% 1.96ms ± 1% -52.44% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.010/rows=65536/cols=4/ordCols=0-8 6.58ms ± 3% 2.84ms ± 4% -56.90% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.100/rows=4096/cols=2/ordCols=0-8 309µs ± 4% 313µs ± 0% +0.99% (p=0.021 n=10+8)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.100/rows=4096/cols=4/ordCols=0-8 442µs ± 4% 393µs ± 1% -11.18% (p=0.000 n=10+9)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.100/rows=65536/cols=2/ordCols=0-8 4.34ms ± 2% 2.34ms ± 0% -46.06% (p=0.000 n=10+9)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.100/rows=65536/cols=4/ordCols=0-8 6.94ms ± 3% 3.42ms ± 1% -50.65% (p=0.000 n=9+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.001/rows=4096/cols=2/ordCols=0-8 403µs ± 2% 379µs ± 2% -5.79% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.001/rows=4096/cols=4/ordCols=0-8 628µs ± 1% 528µs ± 4% -15.91% (p=0.000 n=9+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.001/rows=65536/cols=2/ordCols=0-8 5.11ms ± 0% 3.06ms ± 1% -40.07% (p=0.000 n=8+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.001/rows=65536/cols=4/ordCols=0-8 9.16ms ± 2% 5.04ms ± 1% -44.94% (p=0.000 n=9+9)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.010/rows=4096/cols=2/ordCols=0-8 407µs ± 2% 380µs ± 1% -6.56% (p=0.000 n=10+9)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.010/rows=4096/cols=4/ordCols=0-8 643µs ± 2% 520µs ± 1% -19.06% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.010/rows=65536/cols=2/ordCols=0-8 5.93ms ± 0% 3.09ms ± 2% -47.86% (p=0.000 n=9+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.010/rows=65536/cols=4/ordCols=0-8 12.3ms ± 1% 5.1ms ± 1% -58.26% (p=0.000 n=9+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.100/rows=4096/cols=2/ordCols=0-8 417µs ± 4% 402µs ± 2% -3.54% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.100/rows=4096/cols=4/ordCols=0-8 647µs ± 3% 572µs ± 4% -11.56% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.100/rows=65536/cols=2/ordCols=0-8 6.36ms ± 4% 3.75ms ± 4% -40.99% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.100/rows=65536/cols=4/ordCols=0-8 11.2ms ± 3% 5.8ms ± 1% -48.67% (p=0.000 n=10+9)
name old speed new speed delta
Distinct/Unordered/hasNulls=false/newTupleProbability=0.001/rows=4096/cols=2/ordCols=0-8 222MB/s ± 3% 229MB/s ± 1% +3.40% (p=0.000 n=10+9)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.001/rows=4096/cols=4/ordCols=0-8 313MB/s ± 1% 362MB/s ± 1% +15.88% (p=0.000 n=9+9)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.001/rows=65536/cols=2/ordCols=0-8 300MB/s ± 2% 548MB/s ± 1% +82.74% (p=0.000 n=9+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.001/rows=65536/cols=4/ordCols=0-8 373MB/s ± 2% 738MB/s ± 3% +97.79% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.010/rows=4096/cols=2/ordCols=0-8 225MB/s ± 1% 226MB/s ± 1% ~ (p=0.374 n=9+9)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.010/rows=4096/cols=4/ordCols=0-8 313MB/s ± 1% 360MB/s ± 0% +15.09% (p=0.000 n=10+8)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.010/rows=65536/cols=2/ordCols=0-8 255MB/s ± 2% 536MB/s ± 1% +110.25% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.010/rows=65536/cols=4/ordCols=0-8 319MB/s ± 3% 740MB/s ± 4% +132.02% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.100/rows=4096/cols=2/ordCols=0-8 212MB/s ± 4% 210MB/s ± 0% -1.00% (p=0.019 n=10+8)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.100/rows=4096/cols=4/ordCols=0-8 296MB/s ± 4% 334MB/s ± 1% +12.55% (p=0.000 n=10+9)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.100/rows=65536/cols=2/ordCols=0-8 242MB/s ± 2% 448MB/s ± 0% +85.36% (p=0.000 n=10+9)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.100/rows=65536/cols=4/ordCols=0-8 302MB/s ± 3% 613MB/s ± 1% +102.62% (p=0.000 n=9+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.001/rows=4096/cols=2/ordCols=0-8 163MB/s ± 2% 173MB/s ± 2% +6.15% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.001/rows=4096/cols=4/ordCols=0-8 209MB/s ± 1% 248MB/s ± 4% +18.97% (p=0.000 n=9+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.001/rows=65536/cols=2/ordCols=0-8 205MB/s ± 0% 343MB/s ± 1% +66.85% (p=0.000 n=8+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.001/rows=65536/cols=4/ordCols=0-8 229MB/s ± 2% 416MB/s ± 1% +81.61% (p=0.000 n=9+9)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.010/rows=4096/cols=2/ordCols=0-8 161MB/s ± 2% 172MB/s ± 1% +7.00% (p=0.000 n=10+9)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.010/rows=4096/cols=4/ordCols=0-8 204MB/s ± 2% 252MB/s ± 1% +23.53% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.010/rows=65536/cols=2/ordCols=0-8 177MB/s ± 0% 339MB/s ± 2% +91.80% (p=0.000 n=9+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.010/rows=65536/cols=4/ordCols=0-8 171MB/s ± 1% 410MB/s ± 1% +139.56% (p=0.000 n=9+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.100/rows=4096/cols=2/ordCols=0-8 157MB/s ± 4% 163MB/s ± 2% +3.64% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.100/rows=4096/cols=4/ordCols=0-8 203MB/s ± 3% 229MB/s ± 4% +13.08% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.100/rows=65536/cols=2/ordCols=0-8 165MB/s ± 4% 279MB/s ± 4% +69.46% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.100/rows=65536/cols=4/ordCols=0-8 187MB/s ± 3% 364MB/s ± 1% +94.75% (p=0.000 n=10+9)
name old alloc/op new alloc/op delta
Distinct/Unordered/hasNulls=false/newTupleProbability=0.001/rows=4096/cols=2/ordCols=0-8 932kB ± 0% 1164kB ± 0% +24.92% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.001/rows=4096/cols=4/ordCols=0-8 1.16MB ± 0% 1.20MB ± 0% +3.00% (p=0.000 n=10+8)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.001/rows=65536/cols=2/ordCols=0-8 8.17MB ± 0% 1.18MB ± 0% -85.60% (p=0.000 n=8+9)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.001/rows=65536/cols=4/ordCols=0-8 14.0MB ± 0% 1.2MB ± 0% -91.34% (p=0.000 n=9+8)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.010/rows=4096/cols=2/ordCols=0-8 932kB ± 0% 1166kB ± 0% +25.13% (p=0.001 n=8+6)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.010/rows=4096/cols=4/ordCols=0-8 1.16MB ± 0% 1.20MB ± 0% +3.33% (p=0.000 n=8+9)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.010/rows=65536/cols=2/ordCols=0-8 8.17MB ± 0% 1.20MB ± 0% -85.29% (p=0.000 n=10+9)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.010/rows=65536/cols=4/ordCols=0-8 14.0MB ± 0% 1.2MB ± 0% -91.16% (p=0.000 n=9+9)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.100/rows=4096/cols=2/ordCols=0-8 932kB ± 0% 1180kB ± 0% +26.69% (p=0.000 n=8+8)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.100/rows=4096/cols=4/ordCols=0-8 1.16MB ± 0% 1.23MB ± 0% +5.83% (p=0.000 n=6+9)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.100/rows=65536/cols=2/ordCols=0-8 8.17MB ± 0% 1.77MB ± 0% -78.27% (p=0.000 n=8+9)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.100/rows=65536/cols=4/ordCols=0-8 14.0MB ± 0% 2.2MB ± 0% -83.99% (p=0.000 n=10+8)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.001/rows=4096/cols=2/ordCols=0-8 932kB ± 0% 1164kB ± 0% +24.96% (p=0.000 n=8+8)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.001/rows=4096/cols=4/ordCols=0-8 1.16MB ± 0% 1.20MB ± 0% +3.16% (p=0.000 n=8+8)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.001/rows=65536/cols=2/ordCols=0-8 8.17MB ± 0% 1.18MB ± 0% -85.60% (p=0.000 n=8+9)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.001/rows=65536/cols=4/ordCols=0-8 14.0MB ± 0% 1.2MB ± 0% -91.34% (p=0.000 n=8+9)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.010/rows=4096/cols=2/ordCols=0-8 932kB ± 0% 1166kB ± 0% +25.13% (p=0.000 n=9+8)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.010/rows=4096/cols=4/ordCols=0-8 1.16MB ± 0% 1.20MB ± 0% +3.33% (p=0.000 n=9+8)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.010/rows=65536/cols=2/ordCols=0-8 8.17MB ± 0% 1.20MB ± 0% -85.29% (p=0.000 n=8+8)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.010/rows=65536/cols=4/ordCols=0-8 14.0MB ± 0% 1.3MB ± 0% -90.93% (p=0.000 n=9+9)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.100/rows=4096/cols=2/ordCols=0-8 932kB ± 0% 1180kB ± 0% +26.69% (p=0.000 n=8+8)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.100/rows=4096/cols=4/ordCols=0-8 1.16MB ± 0% 1.23MB ± 0% +5.83% (p=0.000 n=10+9)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.100/rows=65536/cols=2/ordCols=0-8 8.17MB ± 0% 2.00MB ± 0% -75.56% (p=0.000 n=8+8)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.100/rows=65536/cols=4/ordCols=0-8 14.0MB ± 0% 2.2MB ± 0% -83.99% (p=0.000 n=10+9)
name old allocs/op new allocs/op delta
Distinct/Unordered/hasNulls=false/newTupleProbability=0.001/rows=4096/cols=2/ordCols=0-8 102 ± 0% 72 ± 0% -29.41% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.001/rows=4096/cols=4/ordCols=0-8 168 ± 0% 108 ± 0% -35.71% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.001/rows=65536/cols=2/ordCols=0-8 622 ± 0% 208 ± 0% -66.56% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.001/rows=65536/cols=4/ordCols=0-8 1.21k ± 0% 0.37k ± 0% -69.21% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.010/rows=4096/cols=2/ordCols=0-8 102 ± 0% 80 ± 0% -21.57% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.010/rows=4096/cols=4/ordCols=0-8 168 ± 0% 124 ± 0% -26.19% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.010/rows=65536/cols=2/ordCols=0-8 622 ± 0% 216 ± 0% -65.27% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.010/rows=65536/cols=4/ordCols=0-8 1.21k ± 0% 0.39k ± 0% -67.88% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.100/rows=4096/cols=2/ordCols=0-8 102 ± 0% 90 ± 0% -11.76% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.100/rows=4096/cols=4/ordCols=0-8 168 ± 0% 144 ± 0% -14.29% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.100/rows=65536/cols=2/ordCols=0-8 622 ± 0% 239 ± 0% -61.58% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.100/rows=65536/cols=4/ordCols=0-8 1.21k ± 0% 0.44k ± 0% -63.82% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.001/rows=4096/cols=2/ordCols=0-8 102 ± 0% 76 ± 0% -25.49% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.001/rows=4096/cols=4/ordCols=0-8 168 ± 0% 120 ± 0% -28.57% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.001/rows=65536/cols=2/ordCols=0-8 622 ± 0% 208 ± 0% -66.56% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.001/rows=65536/cols=4/ordCols=0-8 1.21k ± 0% 0.37k ± 0% -69.21% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.010/rows=4096/cols=2/ordCols=0-8 102 ± 0% 80 ± 0% -21.57% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.010/rows=4096/cols=4/ordCols=0-8 168 ± 0% 124 ± 0% -26.19% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.010/rows=65536/cols=2/ordCols=0-8 622 ± 0% 216 ± 0% -65.27% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.010/rows=65536/cols=4/ordCols=0-8 1.21k ± 0% 0.40k ± 0% -67.22% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.100/rows=4096/cols=2/ordCols=0-8 102 ± 0% 90 ± 0% -11.76% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.100/rows=4096/cols=4/ordCols=0-8 168 ± 0% 144 ± 0% -14.29% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.100/rows=65536/cols=2/ordCols=0-8 622 ± 0% 242 ± 0% -61.09% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.100/rows=65536/cols=4/ordCols=0-8 1.21k ± 0% 0.44k ± 0% -63.82% (p=0.000 n=10+10)
Reviewable status:
complete! 0 of 0 LGTMs obtained
yuzefovich
left a comment
There was a problem hiding this comment.
Very nice work! I have a bunch of nits and some comments, but nothing major.
nit: for the benchmarks, you could probably choose only 1 out of 4 "sheets" of information (I prefer looking at the second one, about the speed in MB/s), this will make it easier to digest. I'd also be interested to see the number for partiallyOrderedDistinct since that operator reuses this unorderedDistinct.
Reviewed 7 of 7 files at r1, 1 of 1 files at r2.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @Azhng)
pkg/sql/colexec/hashtable.go, line 29 at r1 (raw file):
const ( // hashTableFullMode is the mode where hashTable buffer all input tuples and
nit: s/buffer all/buffers all/g and s/populate/populates/g.
pkg/sql/colexec/hashtable.go, line 33 at r1 (raw file):
hashTableFullMode hashTableMode = iota // hashTableDistinctMode is the mode where hashTable only buffer distinct
nit: s/buffer distinct/buffers only distinct/g. Also, I'd omit the second sentence of the comment since it seems too specific to the use case of unorderedDistinct and probably adds the confusion.
pkg/sql/colexec/hashtable.go, line 90 at r1 (raw file):
toCheck []uint64 // hashBuffer stores the hash values of each tuple in the probe table.
nit: you could extend this comment to highlight that hashBuffer will be dynamically updated when we're looking for distinct tuples on the probe side in "distinct" mode.
pkg/sql/colexec/hashtable.go, line 94 at r1 (raw file):
// keyCols stores the indices of the input batch which are key columns. keyCols []uint32
This slice seems to always be {0, 1, 2, ..., nKeys - 1}, can we eliminate it?
pkg/sql/colexec/hashtable.go, line 258 at r1 (raw file):
ht.probeScratch.keyCols = probeKeyCols ht.probeScratch.first = make([]uint64, numBuckets) ht.probeScratch.next = make([]uint64, coldata.BatchSize()+1)
Why do we need plus ones here? I feel like for most of these slices we don't need it, and, instead, in the code we could have keyID - 1 or whatever when accessing them. It'd be nice to unify it with non-distinct mode.
pkg/sql/colexec/hashtable.go, line 289 at r1 (raw file):
// ht.next is used to store the computed hash value of each key. ht.buildScratch.next = make([]uint64, ht.vals.Length()+1)
We should keep maybeAllocateUint64Array call here instead of using make directly.
pkg/sql/colexec/hashtable.go, line 341 at r1 (raw file):
// and length. // NOTE: the hashTable *must* have been already built. func (ht *hashTable) removeDuplicatesInBufferedBatch(batch coldata.Batch) {
It seems like these two removeDuplicates* functions are quite similar, could we refactor the code to support both with a single function?
pkg/sql/colexec/hashtable.go, line 358 at r1 (raw file):
// Continue searching for the build table matching keys while the toCheck // array is non-empty. nToCheck =
nit: seems like an unnecessary line break.
pkg/sql/colexec/hashtable_tmpl.go, line 86 at r1 (raw file):
// found. // {{if .UseProbeSel}}
nit: UseProbeSel and UseBuildSel probably should be added as arguments to the template function (similar to _SELECT_DISTINCT) to stay consistent.
pkg/sql/colexec/hashtable_tmpl.go, line 135 at r1 (raw file):
// {{if .SelectDistinct}} if keyID == 0 {
nit: you could rearrange the code using if statement:
if keyID != 0 {
...
}
// {{if .SelectDistinct}}
else {
ht.probeScratch.distinct[toCheck] = true
}
// {{end}}
pkg/sql/colexec/hashtable_tmpl.go, line 205 at r1 (raw file):
if probeSel != nil { if buildSel != nil { _CHECK_COL_WITH_NULLS(
nit: I'd keep these templated function calls as one long line, similar to all others like it in this file.
pkg/sql/colexec/hashtable_tmpl.go, line 284 at r1 (raw file):
switch probeType { // {{range $lTyp, $rTypToOverload := .}} // {{with $overload := index $rTypToOverload $lTyp}}
Oh, this looks fancy. What does it do?
pkg/sql/colexec/hashtable_tmpl.go, line 349 at r1 (raw file):
} // */}} // findDistinctTuples find all tuples in probeVecs that are not present in
nit: s/find all/finds all/g.
pkg/sql/colexec/hashtable_tmpl.go, line 353 at r1 (raw file):
// keyIDs in headID buffer. // NOTE: It assumes that probeVecs does not contain any duplicates itself. // NOTE: It assumes that probSel has already being populated and it is not
We should probably enforce this assumption as the first line in the function.
pkg/sql/colexec/hashtable_tmpl.go, line 428 at r1 (raw file):
visited[ht.probeScratch.headID[i]] = true // Compacting and dedup hash buffer.
nit: s/dedup/deduplicating/g.
pkg/sql/colexec/hashtable_tmpl.go, line 436 at r1 (raw file):
} copy(visited, zeroBoolColumn)
Shouldn't we do this zero out before we use this buffer? I think usually the contract is that before you're reusing some slice, you make sure that it is in the state you expect it to be, and in this case we seem to expect visited to consist of only false values before the loop.
pkg/sql/colexec/hashtable_tmpl.go, line 442 at r1 (raw file):
} // */}} // updateSel updates the selection vector in the given batch using the headID
This comment should be extended. It's not clear what this function's purpose is and what it does.
pkg/sql/colexec/hashtable_tmpl.go, line 458 at r1 (raw file):
} // distinctCheck determines if the current key in the groupID buckets matches the
nit: s/groupID buckets/groupID bucket/g.
pkg/sql/colexec/partially_ordered_distinct.go, line 124 at r1 (raw file):
input: input, inputTypes: inputTypes, windowedBatch: allocator.NewMemBatch(inputTypes),
Why is this change?
pkg/sql/colexec/unordered_distinct.go, line 67 at r1 (raw file):
buildFinished bool // sel is a list of indices to select representing the distinct rows.
nit: the comment should be removed as well.
pkg/sql/colexec/unordered_distinct.go, line 89 at r1 (raw file):
} // Since hashTable only buffers distinct tuple when it is built in distinct
nit: sounds like an incomplete sentence. Probably something like "We're using the hashTable in distinct mode, so it buffers only distinct tuples, as a result, we will be simply returning all buffered tuples."
Also, this assignment should be moved into the if block above.
pkg/sql/colexec/unordered_distinct.go, line 94 at r1 (raw file):
// Create and return the next batch of input to a maximum size of // coldata.BatchSize(). The rows in the new batch are specified by the
nit: second sentence of the comment should be deleted.
pkg/sql/colexec/execgen/cmd/execgen/hashtable_gen.go, line 44 at r1 (raw file):
`{{template "checkColBody" buildDict "Global" .Global "UseProbeSel" .UseProbeSel "UseBuildSel" .UseBuildSel "ProbeHasNulls" $7 "BuildHasNulls" $8 "AllowNullEquality" $9 "SelectDistinct" $10}}`) checkColWithNull := makeFunctionRegex("_CHECK_COL_WITH_NULLS", 8)
nit: s/checkColWithNull/checkColWithNulls/g.
pkg/sql/colexec/execgen/cmd/execgen/hashtable_gen.go, line 48 at r1 (raw file):
`{{template "checkColWithNulls" buildDict "Global" . "UseProbeSel" $7 "UseBuildSel" $8}}`) checkColForDistinctWithNull := makeFunctionRegex("_CHECK_COL_FOR_DISTINCT_WITH_NULLS", 8)
ditto (missing s).
Azhng
left a comment
There was a problem hiding this comment.
Here is the benchmark for partiallyOrderedDistinct.
Haven't got a chance to run it on GCE worker yet. Will post the result once I have it set up.
name old speed new speed delta
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.001/rows=4096/cols=2/ordCols=1-8 383MB/s ± 1% 420MB/s ± 9% +9.73% (p=0.001 n=8+9)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.001/rows=4096/cols=4/ordCols=2-8 525MB/s ± 1% 562MB/s ± 1% +7.21% (p=0.000 n=8+9)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.001/rows=65536/cols=2/ordCols=1-8 503MB/s ± 1% 624MB/s ± 1% +24.17% (p=0.000 n=9+9)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.001/rows=65536/cols=4/ordCols=2-8 684MB/s ± 1% 869MB/s ± 1% +27.06% (p=0.000 n=9+9)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.010/rows=4096/cols=2/ordCols=1-8 283MB/s ± 2% 325MB/s ± 2% +14.85% (p=0.000 n=10+10)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.010/rows=4096/cols=4/ordCols=2-8 447MB/s ± 1% 408MB/s ± 1% -8.65% (p=0.000 n=10+9)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.010/rows=65536/cols=2/ordCols=1-8 281MB/s ± 1% 321MB/s ± 2% +14.10% (p=0.000 n=10+9)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.010/rows=65536/cols=4/ordCols=2-8 539MB/s ± 0% 612MB/s ± 2% +13.55% (p=0.000 n=9+10)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.100/rows=4096/cols=2/ordCols=1-8 100MB/s ± 1% 109MB/s ± 2% +9.35% (p=0.000 n=10+10)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.100/rows=4096/cols=4/ordCols=2-8 134MB/s ± 2% 124MB/s ± 1% -7.64% (p=0.000 n=9+10)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.100/rows=65536/cols=2/ordCols=1-8 111MB/s ± 1% 115MB/s ± 2% +3.98% (p=0.000 n=10+9)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.100/rows=65536/cols=4/ordCols=2-8 121MB/s ± 1% 133MB/s ± 1% +9.46% (p=0.000 n=10+9)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.001/rows=4096/cols=2/ordCols=1-8 314MB/s ± 0% 311MB/s ± 1% -0.92% (p=0.000 n=8+10)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.001/rows=4096/cols=4/ordCols=2-8 406MB/s ± 1% 452MB/s ± 2% +11.39% (p=0.000 n=10+9)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.001/rows=65536/cols=2/ordCols=1-8 387MB/s ± 2% 402MB/s ± 5% ~ (p=0.065 n=10+9)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.001/rows=65536/cols=4/ordCols=2-8 499MB/s ± 1% 474MB/s ± 1% -4.94% (p=0.000 n=10+9)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.010/rows=4096/cols=2/ordCols=1-8 269MB/s ± 1% 241MB/s ± 1% -10.58% (p=0.000 n=10+9)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.010/rows=4096/cols=4/ordCols=2-8 275MB/s ± 1% 345MB/s ± 1% +25.65% (p=0.000 n=10+10)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.010/rows=65536/cols=2/ordCols=1-8 262MB/s ± 1% 262MB/s ± 1% ~ (p=0.968 n=9+10)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.010/rows=65536/cols=4/ordCols=2-8 346MB/s ± 1% 378MB/s ± 1% +9.24% (p=0.000 n=10+10)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.100/rows=4096/cols=2/ordCols=1-8 92.3MB/s ± 1% 103.3MB/s ± 1% +11.98% (p=0.000 n=8+9)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.100/rows=4096/cols=4/ordCols=2-8 119MB/s ± 1% 110MB/s ± 1% -7.45% (p=0.000 n=9+10)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.100/rows=65536/cols=2/ordCols=1-8 90.2MB/s ± 1% 94.2MB/s ± 1% +4.38% (p=0.000 n=10+10)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.100/rows=65536/cols=4/ordCols=2-8 116MB/s ± 2% 122MB/s ± 2% +4.77% (p=0.000 n=9+10)
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)
pkg/sql/colexec/hashtable.go, line 94 at r1 (raw file):
Previously, yuzefovich wrote…
This slice seems to always be
{0, 1, 2, ..., nKeys - 1}, can we eliminate it?
Done.
pkg/sql/colexec/hashtable.go, line 258 at r1 (raw file):
Previously, yuzefovich wrote…
Why do we need plus ones here? I feel like for most of these slices we don't need it, and, instead, in the code we could have
keyID - 1or whatever when accessing them. It'd be nice to unify it with non-distinct mode.
Updated. Only the next slices would require to have plus ones.
Done
pkg/sql/colexec/hashtable.go, line 289 at r1 (raw file):
Previously, yuzefovich wrote…
We should keep
maybeAllocateUint64Arraycall here instead of usingmakedirectly.
Done.
My bad, probably slipped through when I was trying resolving the merge conflict.
pkg/sql/colexec/hashtable.go, line 341 at r1 (raw file):
Previously, yuzefovich wrote…
It seems like these two
removeDuplicates*functions are quite similar, could we refactor the code to support both with a single function?
Done.
pkg/sql/colexec/hashtable_tmpl.go, line 135 at r1 (raw file):
Previously, yuzefovich wrote…
nit: you could rearrange the code using
ifstatement:if keyID != 0 { ... } // {{if .SelectDistinct}} else { ht.probeScratch.distinct[toCheck] = true } // {{end}}
Turns out go compiler enforce that else must follow immediately after }, otherwise it's a syntax error and gofmt can't even fix it.
pkg/sql/colexec/hashtable_tmpl.go, line 284 at r1 (raw file):
Previously, yuzefovich wrote…
Oh, this looks fancy. What does it do?
This is because the overloads we pass in to the template is a map[coltypes.T]map[coltypes.T]*overloads. This is so that we can generate comparison codes for comparable but not identical types.
But in this function since we know we are going to be appending the probe table into build table they column types will have to be the same, so we only need to generate comparison code for identical types.
So in this case . is the .Global object (the two level map) and semantically this means:
for lTyp, rTypeToOverload := range .Global {
overload := rTypeOverload[lTyp]
...
}I could have just modify checkCol function and reuse it but I didn't want to add another boolean flag into that function's call path. This kinda make the code a little bit more duplicated but I feel it makes each individual function simpler.
pkg/sql/colexec/hashtable_tmpl.go, line 353 at r1 (raw file):
Previously, yuzefovich wrote…
We should probably enforce this assumption as the first line in the function.
Done.
pkg/sql/colexec/hashtable_tmpl.go, line 436 at r1 (raw file):
Previously, yuzefovich wrote…
Shouldn't we do this zero out before we use this buffer? I think usually the contract is that before you're reusing some slice, you make sure that it is in the state you expect it to be, and in this case we seem to expect
visitedto consist of onlyfalsevalues before the loop.
Done.
Zero out visited before and after the function. Since the underneath storage is also used elsewhere.
pkg/sql/colexec/hashtable_tmpl.go, line 442 at r1 (raw file):
Previously, yuzefovich wrote…
This comment should be extended. It's not clear what this function's purpose is and what it does.
Done.
pkg/sql/colexec/partially_ordered_distinct.go, line 124 at r1 (raw file):
Previously, yuzefovich wrote…
Why is this change?
Because when creating batches with NewMemBatchWithSize(inputType, 0), it causes the selection vector of that batch to be allocated with 0 capacity and length.
In the hash table, a lot of places assumes that once batch.SetSelection(true) is called, next time when we call batch.Selection() it will return a selection vector with length equal to coldata.BatchSize().
Correct me if my assumption is wrong.
pkg/sql/colexec/execgen/cmd/execgen/hashtable_gen.go, line 48 at r1 (raw file):
Previously, yuzefovich wrote…
ditto (missing
s).
Done.
yuzefovich
left a comment
There was a problem hiding this comment.
Reviewed 4 of 4 files at r3.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @Azhng)
pkg/sql/colexec/hashtable.go, line 341 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Done.
Nice!
pkg/sql/colexec/hashtable.go, line 33 at r3 (raw file):
hashTableFullMode hashTableMode = iota // hashTableDistinctMode is the mode where hashTable only buffers only
nit: double "only."
pkg/sql/colexec/hashtable.go, line 341 at r3 (raw file):
// a slice of key columns of the probe table, number of tuple to check and the // selection vector of the probe table and returns number of tuples that needs // to be checked for next iteration. It populate the ht.probeScratch.headID to
nit: s/populate/populates/g.
pkg/sql/colexec/hashtable.go, line 342 at r3 (raw file):
// selection vector of the probe table and returns number of tuples that needs // to be checked for next iteration. It populate the ht.probeScratch.headID to // point to the keyIDs that need be included in probe table's selection vector.
nit: s/need be/need to be/g.
pkg/sql/colexec/hashtable_tmpl.go, line 135 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Turns out go compiler enforce that
elsemust follow immediately after}, otherwise it's a syntax error andgofmtcan't even fix it.
I see, TIL.
pkg/sql/colexec/hashtable_tmpl.go, line 284 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
This is because the overloads we pass in to the template is a
map[coltypes.T]map[coltypes.T]*overloads. This is so that we can generate comparison codes for comparable but not identical types.
But in this function since we know we are going to be appending the probe table into build table they column types will have to be the same, so we only need to generate comparison code for identical types.So in this case
.is the.Globalobject (the two level map) and semantically this means:for lTyp, rTypeToOverload := range .Global { overload := rTypeOverload[lTyp] ... }I could have just modify
checkColfunction and reuse it but I didn't want to add another boolean flag into that function's call path. This kinda make the code a little bit more duplicated but I feel it makes each individual function simpler.
I haven't seen use of index directive before, thanks for the explanation, this makes sense. It'd be definitely useful to include the code snipped you posted as the templated comment into the code (i.e. surrounding it with //{{/* and //*/}}).
pkg/sql/colexec/hashtable_tmpl.go, line 436 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Done.
Zero out
visitedbefore and after the function. Since the underneath storage is also used elsewhere.
I think the "after" zero out should be moved closer to where it is used elsewhere.
pkg/sql/colexec/partially_ordered_distinct.go, line 124 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Because when creating batches with
NewMemBatchWithSize(inputType, 0), it causes the selection vector of that batch to be allocated with 0 capacity and length.
In the hash table, a lot of places assumes that oncebatch.SetSelection(true)is called, next time when we callbatch.Selection()it will return a selection vector with length equal tocoldata.BatchSize().Correct me if my assumption is wrong.
I see. We have NewMemBatchNoCols for such cases, can you try it out?
pkg/sql/colexec/unordered_distinct.go, line 88 at r3 (raw file):
// We're using the hashTable in distinct mode, so it buffers only distinct // tuples, as a result, we will be simply returning all buffered tuples
nit: missing period.
Previously hashTable will buffer all tuples before building `head` and `same` vector. Now hashTable supports distinct mode where it only buffers the distinct tuples from upstream operator. This removes the need of traversing through the `head` and `same` vectors. Instead, now user of the hashTable can directly build hashTable in distinct mode and copy the buffered tuples. Release note: None
Azhng
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)
pkg/sql/colexec/hashtable_tmpl.go, line 284 at r1 (raw file):
Previously, yuzefovich wrote…
I haven't seen use of
indexdirective before, thanks for the explanation, this makes sense. It'd be definitely useful to include the code snipped you posted as the templated comment into the code (i.e. surrounding it with//{{/*and//*/}}).
Done.
pkg/sql/colexec/partially_ordered_distinct.go, line 124 at r1 (raw file):
Previously, yuzefovich wrote…
I see. We have
NewMemBatchNoColsfor such cases, can you try it out?
Done.
Azhng
left a comment
There was a problem hiding this comment.
Rerun the benchmark on the GCE workers:
Unordered distinct:
name old speed new speed delta
Distinct/Unordered/hasNulls=false/newTupleProbability=0.001/rows=4096/cols=2/ordCols=0-24 154MB/s ± 5% 156MB/s ± 3% ~ (p=0.460 n=10+8)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.001/rows=4096/cols=4/ordCols=0-24 219MB/s ± 3% 236MB/s ±13% ~ (p=0.143 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.001/rows=65536/cols=2/ordCols=0-24 136MB/s ± 3% 384MB/s ± 2% +181.65% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.001/rows=65536/cols=4/ordCols=0-24 245MB/s ± 1% 518MB/s ± 2% +111.55% (p=0.000 n=9+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.010/rows=4096/cols=2/ordCols=0-24 154MB/s ± 4% 155MB/s ± 3% ~ (p=0.754 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.010/rows=4096/cols=4/ordCols=0-24 214MB/s ± 4% 249MB/s ± 2% +16.29% (p=0.000 n=10+9)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.010/rows=65536/cols=2/ordCols=0-24 181MB/s ± 3% 373MB/s ± 2% +106.24% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.010/rows=65536/cols=4/ordCols=0-24 225MB/s ± 6% 504MB/s ± 2% +123.53% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.100/rows=4096/cols=2/ordCols=0-24 150MB/s ± 4% 147MB/s ± 4% ~ (p=0.143 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.100/rows=4096/cols=4/ordCols=0-24 207MB/s ± 6% 204MB/s ± 6% ~ (p=0.182 n=10+9)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.100/rows=65536/cols=2/ordCols=0-24 165MB/s ± 2% 301MB/s ± 4% +82.26% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=false/newTupleProbability=0.100/rows=65536/cols=4/ordCols=0-24 201MB/s ± 3% 398MB/s ± 2% +97.83% (p=0.000 n=10+9)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.001/rows=4096/cols=2/ordCols=0-24 113MB/s ± 4% 122MB/s ± 4% +8.02% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.001/rows=4096/cols=4/ordCols=0-24 141MB/s ± 3% 168MB/s ± 2% +19.43% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.001/rows=65536/cols=2/ordCols=0-24 138MB/s ± 4% 236MB/s ± 2% +71.44% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.001/rows=65536/cols=4/ordCols=0-24 156MB/s ± 3% 282MB/s ± 1% +80.65% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.010/rows=4096/cols=2/ordCols=0-24 109MB/s ± 5% 117MB/s ± 1% +6.74% (p=0.000 n=10+9)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.010/rows=4096/cols=4/ordCols=0-24 143MB/s ± 3% 164MB/s ± 2% +14.59% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.010/rows=65536/cols=2/ordCols=0-24 100MB/s ± 3% 226MB/s ± 2% +125.29% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.010/rows=65536/cols=4/ordCols=0-24 142MB/s ± 3% 272MB/s ± 3% +91.53% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.100/rows=4096/cols=2/ordCols=0-24 108MB/s ± 3% 112MB/s ± 3% +3.43% (p=0.001 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.100/rows=4096/cols=4/ordCols=0-24 132MB/s ± 3% 150MB/s ± 8% +13.62% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.100/rows=65536/cols=2/ordCols=0-24 107MB/s ± 5% 191MB/s ± 4% +78.89% (p=0.000 n=10+10)
Distinct/Unordered/hasNulls=true/newTupleProbability=0.100/rows=65536/cols=4/ordCols=0-24 125MB/s ± 2% 237MB/s ± 4% +89.61% (p=0.000 n=10+10)
Partially ordered
name old speed new speed delta
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.001/rows=4096/cols=2/ordCols=1-24 231MB/s ± 3% 306MB/s ± 3% +32.19% (p=0.000 n=10+10)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.001/rows=4096/cols=4/ordCols=2-24 345MB/s ± 3% 418MB/s ± 1% +21.01% (p=0.000 n=10+8)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.001/rows=65536/cols=2/ordCols=1-24 351MB/s ± 1% 444MB/s ± 2% +26.69% (p=0.000 n=10+10)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.001/rows=65536/cols=4/ordCols=2-24 486MB/s ± 1% 605MB/s ± 3% +24.28% (p=0.000 n=10+10)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.010/rows=4096/cols=2/ordCols=1-24 182MB/s ± 4% 195MB/s ± 2% +6.92% (p=0.000 n=9+10)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.010/rows=4096/cols=4/ordCols=2-24 317MB/s ± 1% 325MB/s ± 3% +2.49% (p=0.000 n=9+10)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.010/rows=65536/cols=2/ordCols=1-24 266MB/s ± 3% 233MB/s ± 2% -12.54% (p=0.000 n=10+10)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.010/rows=65536/cols=4/ordCols=2-24 405MB/s ± 2% 388MB/s ± 3% -3.99% (p=0.000 n=10+10)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.100/rows=4096/cols=2/ordCols=1-24 52.6MB/s ± 1% 72.5MB/s ± 2% +37.79% (p=0.000 n=10+10)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.100/rows=4096/cols=4/ordCols=2-24 72.1MB/s ± 1% 82.0MB/s ± 2% +13.81% (p=0.000 n=10+10)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.100/rows=65536/cols=2/ordCols=1-24 61.0MB/s ± 2% 74.9MB/s ± 2% +22.74% (p=0.000 n=10+10)
Distinct/PartiallyOrdered/hasNulls=false/newTupleProbability=0.100/rows=65536/cols=4/ordCols=2-24 77.3MB/s ± 2% 81.6MB/s ± 1% +5.58% (p=0.000 n=10+10)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.001/rows=4096/cols=2/ordCols=1-24 196MB/s ± 1% 225MB/s ± 3% +14.81% (p=0.000 n=10+10)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.001/rows=4096/cols=4/ordCols=2-24 256MB/s ± 1% 273MB/s ± 1% +6.55% (p=0.000 n=9+10)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.001/rows=65536/cols=2/ordCols=1-24 246MB/s ± 2% 191MB/s ± 1% -22.35% (p=0.000 n=10+9)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.001/rows=65536/cols=4/ordCols=2-24 314MB/s ± 2% 432MB/s ± 2% +37.66% (p=0.000 n=10+9)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.010/rows=4096/cols=2/ordCols=1-24 179MB/s ± 1% 160MB/s ± 4% -10.54% (p=0.000 n=10+10)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.010/rows=4096/cols=4/ordCols=2-24 180MB/s ± 2% 202MB/s ± 2% +12.28% (p=0.000 n=10+10)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.010/rows=65536/cols=2/ordCols=1-24 193MB/s ± 1% 175MB/s ± 1% -9.37% (p=0.000 n=10+10)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.010/rows=65536/cols=4/ordCols=2-24 309MB/s ± 2% 268MB/s ± 4% -13.10% (p=0.000 n=10+10)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.100/rows=4096/cols=2/ordCols=1-24 53.6MB/s ± 1% 61.7MB/s ± 4% +14.98% (p=0.000 n=10+10)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.100/rows=4096/cols=4/ordCols=2-24 72.6MB/s ± 2% 74.2MB/s ± 2% +2.17% (p=0.002 n=10+10)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.100/rows=65536/cols=2/ordCols=1-24 58.4MB/s ± 2% 68.1MB/s ± 0% +16.58% (p=0.000 n=10+8)
Distinct/PartiallyOrdered/hasNulls=true/newTupleProbability=0.100/rows=65536/cols=4/ordCols=2-24 90.2MB/s ± 2% 91.8MB/s ± 3% +1.82% (p=0.011 n=10+10)
Hash join
name old speed new speed delta
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=2048-24 315MB/s ±11% 367MB/s ± 3% +16.50% (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=262144-24 554MB/s ± 2% 582MB/s ± 2% +5.07% (p=0.000 n=10+9)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=4194304-24 582MB/s ± 3% 619MB/s ± 3% +6.38% (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=2048-24 300MB/s ± 9% 321MB/s ± 7% +6.77% (p=0.007 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=262144-24 329MB/s ± 3% 350MB/s ± 4% +6.50% (p=0.000 n=10+9)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=4194304-24 180MB/s ± 8% 201MB/s ± 5% +11.32% (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=2048-24 284MB/s ± 5% 354MB/s ± 6% +24.45% (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=262144-24 503MB/s ± 3% 562MB/s ± 3% +11.65% (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=4194304-24 539MB/s ± 3% 570MB/s ± 1% +5.84% (p=0.000 n=10+8)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=2048-24 297MB/s ± 5% 302MB/s ±10% ~ (p=0.912 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=262144-24 310MB/s ± 6% 339MB/s ± 9% +9.27% (p=0.002 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=4194304-24 173MB/s ± 8% 183MB/s ± 2% +5.57% (p=0.003 n=10+9)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=2048-24 257MB/s ± 8% 273MB/s ± 4% +5.92% (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=262144-24 372MB/s ± 1% 392MB/s ± 1% +5.41% (p=0.000 n=9+7)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=4194304-24 396MB/s ± 4% 405MB/s ± 4% +2.38% (p=0.014 n=10+9)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=2048-24 252MB/s ± 5% 252MB/s ± 2% ~ (p=0.971 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=262144-24 262MB/s ± 6% 266MB/s ± 5% ~ (p=0.353 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=4194304-24 152MB/s ± 4% 153MB/s ± 4% ~ (p=0.720 n=10+9)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=2048-24 234MB/s ± 3% 267MB/s ± 5% +14.01% (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=262144-24 365MB/s ± 2% 383MB/s ± 2% +4.94% (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=4194304-24 373MB/s ± 4% 391MB/s ± 3% +4.78% (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=2048-24 231MB/s ±10% 252MB/s ± 3% +8.89% (p=0.003 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=262144-24 240MB/s ± 3% 259MB/s ± 4% +7.69% (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=4194304-24 147MB/s ± 4% 150MB/s ± 4% ~ (p=0.066 n=10+10)
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)
|
Great work Archer! 👍 @Azhng |
yuzefovich
left a comment
There was a problem hiding this comment.
The numbers look very good! I think there might be something to investigate about the partially ordered distinct (maybe we don't need to zero out some slices when resetting), but that belongs in a separate PR.
Reviewed 4 of 4 files at r4.
Reviewable status:complete! 1 of 0 LGTMs obtained
|
Sounds good. Will investigate later. TFTR! bors r+ |
Merge conflict (retrying...) |
45704: colexec: add distinct mode to hashTable r=Azhng a=Azhng Previously hashTable will buffer all tuples before building `head` and `same` vector. Now hashTable supports distinct mode where it only buffers the distinct tuples from upstream operator. This removes the need of traversing through the `head` and `same` vectors. Instead, now user of the hashTable can directly build hashTable in distinct mode and copy the buffered tuples. Closes #44404 Release note: None 45727: cli: allow for `cockroach demo` to start in secure mode r=knz a=rohany Fixes #45607. Release note (cli change): This PR adds support for `cockroach demo` to start in secure mode using the flag `--insecure=false`. 45759: storage: Do not create nil data pointers in rocksdb slices r=miretskiy a=miretskiy Fixes #45524 Libroach's DBSlice data type gets converted to the rocksdb::Slice whenever we need to access underlying rocksdb (iterators, get/set, etc). However, internally, rocksdb::Slice asserts that the underlying data pointer may not be null in the rocksdb::Slice::compare as observed in #45524 This change modifies our rocksdb bridge code to never generate DBSlices with nullptr data. Release notes: None Co-authored-by: Azhng <archer.xn@gmail.com> Co-authored-by: Rohan Yadav <rohany@alumni.cmu.edu> Co-authored-by: Yevgeniy Miretskiy <yevgeniy@cockroachlabs.com>
|
@knz should we do "bors r-" to not confuse bors? |
|
I don't know :) but i think it won't hurt. bors r- |
Canceled |
Previously hashTable will buffer all tuples before building
headandsamevector.Now hashTable supports distinct mode where it only buffers
the distinct tuples from upstream operator. This removes the need of traversing
through the
headandsamevectors. Instead, now user of the hashTablecan directly build hashTable in distinct mode and copy the buffered tuples.
Closes #44404
Release note: None