Skip to content

colexec: remove templating nulls vs no-nulls case in merge joiner#57266

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
yuzefovich:mj-nulls
Dec 1, 2020
Merged

colexec: remove templating nulls vs no-nulls case in merge joiner#57266
craig[bot] merged 1 commit intocockroachdb:masterfrom
yuzefovich:mj-nulls

Conversation

@yuzefovich
Copy link
Copy Markdown
Member

@yuzefovich yuzefovich commented Nov 30, 2020

This commit removes the templaing of "with-nulls" and "without-nulls"
cases of the merge joiner because the benchmarks have shown that
skipping nulls check in the latter case doesn't offer much performance
improvement, but having that templating increases the amount of
generated code by a factor of 2 or more.

Additionally, this commit adjusts the template slightly to omit some
empty lines.

Release note: None

@yuzefovich yuzefovich requested review from a team and asubiotto November 30, 2020 23:51
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

This commit removes the templaing of "with-nulls" and "without-nulls"
cases of the merge joiner because the benchmarks have shown that
skipping nulls check in the latter case doesn't offer much performance
improvement, but having that templating increases the amount of
generated code by a factor of 2 or more.

Additionally, this commit adjusts the template slightly to omit some
empty lines.

Release note: None
@yuzefovich
Copy link
Copy Markdown
Member Author

The benchmarks:

name                                                      old time/op    new time/op    delta
MergeJoiner/rows=16/hasNulls=false-24                       14.7µs ± 2%    14.5µs ± 1%   -1.43%  (p=0.034 n=10+9)
MergeJoiner/rows=16/hasNulls=true-24                        13.8µs ± 3%    13.5µs ± 5%   -2.14%  (p=0.023 n=10+10)
MergeJoiner/rows=1024/hasNulls=false-24                      192µs ± 1%     191µs ± 3%     ~     (p=0.173 n=8+10)
MergeJoiner/rows=1024/hasNulls=true-24                       203µs ± 1%     189µs ± 3%   -6.70%  (p=0.000 n=10+10)
MergeJoiner/rows=65536/hasNulls=false-24                    3.69ms ± 0%    3.99ms ± 0%   +8.02%  (p=0.000 n=9+10)
MergeJoiner/rows=65536/hasNulls=true-24                     4.85ms ± 0%    4.02ms ± 0%  -17.22%  (p=0.000 n=10+9)
MergeJoiner/oneSideRepeat-rows=16/hasNulls=false-24         13.8µs ± 3%    13.9µs ± 2%     ~     (p=0.133 n=10+9)
MergeJoiner/oneSideRepeat-rows=16/hasNulls=true-24          7.28µs ± 2%    7.40µs ± 2%   +1.67%  (p=0.013 n=10+9)
MergeJoiner/oneSideRepeat-rows=1024/hasNulls=false-24       45.4µs ± 2%    46.3µs ± 2%   +1.90%  (p=0.001 n=10+10)
MergeJoiner/oneSideRepeat-rows=1024/hasNulls=true-24        20.9µs ± 3%    17.8µs ± 2%  -14.65%  (p=0.000 n=10+10)
MergeJoiner/oneSideRepeat-rows=65536/hasNulls=false-24       865µs ± 1%     917µs ± 2%   +6.02%  (p=0.000 n=10+10)
MergeJoiner/oneSideRepeat-rows=65536/hasNulls=true-24       1.29ms ± 0%    1.13ms ± 1%  -12.56%  (p=0.000 n=8+10)
MergeJoiner/bothSidesRepeat-rows=16/hasNulls=false-24       17.4µs ± 2%    17.4µs ± 3%     ~     (p=1.000 n=10+10)
MergeJoiner/bothSidesRepeat-rows=16/hasNulls=true-24        17.8µs ± 3%    17.8µs ± 5%     ~     (p=0.730 n=9+9)
MergeJoiner/bothSidesRepeat-rows=1024/hasNulls=false-24      149µs ± 2%     153µs ± 1%   +2.83%  (p=0.000 n=10+9)
MergeJoiner/bothSidesRepeat-rows=1024/hasNulls=true-24       248µs ± 4%     237µs ± 3%   -4.44%  (p=0.000 n=10+10)
MergeJoiner/bothSidesRepeat-rows=65536/hasNulls=false-24    23.7ms ± 0%    26.9ms ± 1%  +13.29%  (p=0.000 n=10+9)
MergeJoiner/bothSidesRepeat-rows=65536/hasNulls=true-24     52.6ms ± 1%    52.0ms ± 0%   -1.11%  (p=0.000 n=10+10)

name                                                      old speed      new speed      delta
MergeJoiner/rows=16/hasNulls=false-24                     17.4MB/s ± 2%  17.7MB/s ± 1%   +1.44%  (p=0.033 n=10+9)
MergeJoiner/rows=16/hasNulls=true-24                      18.6MB/s ± 3%  19.0MB/s ± 5%   +2.19%  (p=0.022 n=10+10)
MergeJoiner/rows=1024/hasNulls=false-24                   85.4MB/s ± 1%  86.0MB/s ± 3%     ~     (p=0.166 n=8+10)
MergeJoiner/rows=1024/hasNulls=true-24                    80.8MB/s ± 1%  86.6MB/s ± 3%   +7.20%  (p=0.000 n=10+10)
MergeJoiner/rows=65536/hasNulls=false-24                   284MB/s ± 0%   263MB/s ± 0%   -7.42%  (p=0.000 n=9+10)
MergeJoiner/rows=65536/hasNulls=true-24                    216MB/s ± 0%   261MB/s ± 0%  +20.81%  (p=0.000 n=10+9)
MergeJoiner/oneSideRepeat-rows=16/hasNulls=false-24       18.5MB/s ± 3%  18.4MB/s ± 2%     ~     (p=0.117 n=10+9)
MergeJoiner/oneSideRepeat-rows=16/hasNulls=true-24        35.2MB/s ± 2%  34.6MB/s ± 2%   -1.64%  (p=0.012 n=10+9)
MergeJoiner/oneSideRepeat-rows=1024/hasNulls=false-24      361MB/s ± 3%   354MB/s ± 2%   -1.87%  (p=0.001 n=10+10)
MergeJoiner/oneSideRepeat-rows=1024/hasNulls=true-24       785MB/s ± 3%   920MB/s ± 2%  +17.17%  (p=0.000 n=10+10)
MergeJoiner/oneSideRepeat-rows=65536/hasNulls=false-24    1.21GB/s ± 1%  1.14GB/s ± 2%   -5.68%  (p=0.000 n=10+10)
MergeJoiner/oneSideRepeat-rows=65536/hasNulls=true-24      812MB/s ± 0%   929MB/s ± 1%  +14.36%  (p=0.000 n=8+10)
MergeJoiner/bothSidesRepeat-rows=16/hasNulls=false-24     14.7MB/s ± 2%  14.7MB/s ± 3%     ~     (p=1.000 n=10+10)
MergeJoiner/bothSidesRepeat-rows=16/hasNulls=true-24      14.4MB/s ± 3%  14.4MB/s ± 5%     ~     (p=0.730 n=9+9)
MergeJoiner/bothSidesRepeat-rows=1024/hasNulls=false-24    110MB/s ± 2%   107MB/s ± 1%   -2.76%  (p=0.000 n=10+9)
MergeJoiner/bothSidesRepeat-rows=1024/hasNulls=true-24    66.1MB/s ± 3%  69.2MB/s ± 3%   +4.63%  (p=0.000 n=10+10)
MergeJoiner/bothSidesRepeat-rows=65536/hasNulls=false-24  44.2MB/s ± 0%  39.0MB/s ± 1%  -11.73%  (p=0.000 n=10+9)
MergeJoiner/bothSidesRepeat-rows=65536/hasNulls=true-24   19.9MB/s ± 0%  20.1MB/s ± 0%   +1.03%  (p=0.000 n=9+10)

name                                                      old alloc/op   new alloc/op   delta
MergeJoiner/rows=16/hasNulls=false-24                       3.94kB ± 0%    3.94kB ± 0%     ~     (all equal)
MergeJoiner/rows=16/hasNulls=true-24                        2.83kB ± 0%    2.83kB ± 0%     ~     (all equal)
MergeJoiner/rows=1024/hasNulls=false-24                      377kB ± 0%     377kB ± 0%     ~     (p=0.161 n=9+10)
MergeJoiner/rows=1024/hasNulls=true-24                       377kB ± 0%     377kB ± 0%     ~     (all equal)
MergeJoiner/rows=65536/hasNulls=false-24                     406kB ± 0%     406kB ± 0%   +0.00%  (p=0.000 n=9+10)
MergeJoiner/rows=65536/hasNulls=true-24                      406kB ± 0%     406kB ± 0%   -0.01%  (p=0.000 n=10+8)
MergeJoiner/oneSideRepeat-rows=16/hasNulls=false-24         3.94kB ± 0%    3.94kB ± 0%     ~     (all equal)
MergeJoiner/oneSideRepeat-rows=16/hasNulls=true-24          1.20kB ± 0%    1.20kB ± 0%     ~     (all equal)
MergeJoiner/oneSideRepeat-rows=1024/hasNulls=false-24       53.6kB ± 0%    53.6kB ± 0%     ~     (all equal)
MergeJoiner/oneSideRepeat-rows=1024/hasNulls=true-24        1.20kB ± 0%    1.20kB ± 0%     ~     (all equal)
MergeJoiner/oneSideRepeat-rows=65536/hasNulls=false-24      1.14MB ± 0%    1.14MB ± 0%   +0.00%  (p=0.000 n=10+10)
MergeJoiner/oneSideRepeat-rows=65536/hasNulls=true-24       1.12MB ± 0%    1.12MB ± 0%   -0.00%  (p=0.000 n=7+10)
MergeJoiner/bothSidesRepeat-rows=16/hasNulls=false-24       5.81kB ± 0%    5.81kB ± 0%     ~     (all equal)
MergeJoiner/bothSidesRepeat-rows=16/hasNulls=true-24        5.81kB ± 0%    5.81kB ± 0%     ~     (all equal)
MergeJoiner/bothSidesRepeat-rows=1024/hasNulls=false-24     59.0kB ± 0%    59.0kB ± 0%     ~     (all equal)
MergeJoiner/bothSidesRepeat-rows=1024/hasNulls=true-24      59.0kB ± 0%    59.0kB ± 0%     ~     (all equal)
MergeJoiner/bothSidesRepeat-rows=65536/hasNulls=false-24    83.8kB ± 0%    83.9kB ± 0%   +0.13%  (p=0.000 n=8+10)
MergeJoiner/bothSidesRepeat-rows=65536/hasNulls=true-24     84.9kB ± 0%    84.8kB ± 0%   -0.07%  (p=0.006 n=10+9)

name                                                      old allocs/op  new allocs/op  delta
MergeJoiner/rows=16/hasNulls=false-24                         80.0 ± 0%      80.0 ± 0%     ~     (all equal)
MergeJoiner/rows=16/hasNulls=true-24                          69.0 ± 0%      69.0 ± 0%     ~     (all equal)
MergeJoiner/rows=1024/hasNulls=false-24                        147 ± 0%       147 ± 0%     ~     (all equal)
MergeJoiner/rows=1024/hasNulls=true-24                         147 ± 0%       147 ± 0%     ~     (all equal)
MergeJoiner/rows=65536/hasNulls=false-24                     1.03k ± 0%     1.03k ± 0%     ~     (all equal)
MergeJoiner/rows=65536/hasNulls=true-24                      1.03k ± 0%     1.03k ± 0%     ~     (all equal)
MergeJoiner/oneSideRepeat-rows=16/hasNulls=false-24           80.0 ± 0%      80.0 ± 0%     ~     (all equal)
MergeJoiner/oneSideRepeat-rows=16/hasNulls=true-24            36.0 ± 0%      36.0 ± 0%     ~     (all equal)
MergeJoiner/oneSideRepeat-rows=1024/hasNulls=false-24          135 ± 0%       135 ± 0%     ~     (all equal)
MergeJoiner/oneSideRepeat-rows=1024/hasNulls=true-24          36.0 ± 0%      36.0 ± 0%     ~     (all equal)
MergeJoiner/oneSideRepeat-rows=65536/hasNulls=false-24       1.02k ± 0%     1.02k ± 0%     ~     (all equal)
MergeJoiner/oneSideRepeat-rows=65536/hasNulls=true-24        1.00k ± 0%     1.00k ± 0%     ~     (all equal)
MergeJoiner/bothSidesRepeat-rows=16/hasNulls=false-24         91.0 ± 0%      91.0 ± 0%     ~     (all equal)
MergeJoiner/bothSidesRepeat-rows=16/hasNulls=true-24          91.0 ± 0%      91.0 ± 0%     ~     (all equal)
MergeJoiner/bothSidesRepeat-rows=1024/hasNulls=false-24        137 ± 0%       137 ± 0%     ~     (all equal)
MergeJoiner/bothSidesRepeat-rows=1024/hasNulls=true-24         137 ± 0%       137 ± 0%     ~     (all equal)
MergeJoiner/bothSidesRepeat-rows=65536/hasNulls=false-24     1.02k ± 0%     1.02k ± 0%     ~     (all equal)
MergeJoiner/bothSidesRepeat-rows=65536/hasNulls=true-24      1.02k ± 0%     1.02k ± 0%     ~     (all equal)

(Note that I had to modify the benchmark to add the nulls into the input.)

Copy link
Copy Markdown
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: great to see the LOC reduction

Reviewed 2 of 12 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @asubiotto)

@yuzefovich
Copy link
Copy Markdown
Member Author

Debugging step-by-step has become a lot better too since the size of the generated functions decreased significantly.

TFTR!

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Dec 1, 2020

Build succeeded:

@craig craig bot merged commit 6785859 into cockroachdb:master Dec 1, 2020
@yuzefovich yuzefovich deleted the mj-nulls branch December 1, 2020 17:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants