@wesm
Member

@wesm wesm commented Jun 16, 2020

NOTE: the diff is artificially larger due to some code rearranging (that was necessitated because of how some data selection code is shared between the Take and Filter implementations).

Summary:

  • Filter is now 1.5-10+x faster across the board, most notably on primitive types with very high selectivity or very low selectivity filters. The BitBlockCounters do a lot of the heavy lifting in that case but even in the worst case scenario when the block counters never encounter a "full" block, this is still consistently faster.
  • Total -O3 code size for both Take and Filter is now about 600KB. That's down from about 8MB total prior to this patch and ARROW-5760.
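
For readers unfamiliar with the block-counting idea, here is a rough pure-Python sketch of how counting selected slots per block lets a filter skip empty blocks and bulk-copy full ones. It is only an illustration under assumptions: `filter_with_block_counts` and the block size of 64 are hypothetical names and values chosen for the example, and this is not the C++ BitBlockCounter code from this patch.

```python
def filter_with_block_counts(values, mask, block_size=64):
    """Filter `values` by boolean `mask`, one block of `block_size` slots at a time.

    Illustrative only: the function name and block size are assumptions,
    not part of Arrow's API.
    """
    out = []
    for start in range(0, len(values), block_size):
        block = mask[start:start + block_size]
        popcount = sum(block)  # number of selected slots in this block
        if popcount == 0:
            # Empty block: nothing selected, skip it entirely.
            continue
        if popcount == len(block):
            # Full block: every slot selected, copy the whole slice at once.
            out.extend(values[start:start + len(block)])
            continue
        # Mixed block: fall back to checking each slot individually.
        block_values = values[start:start + len(block)]
        out.extend(v for v, keep in zip(block_values, block) if keep)
    return out


values = list(range(200))
# One full block, one empty block, then a mixed tail.
mask = [True] * 64 + [False] * 64 + [i % 3 == 0 for i in range(72)]
result = filter_with_block_counts(values, mask)
```

With very high or very low selectivity most blocks take one of the two fast paths, which is consistent with where the summary reports the largest gains. A filter near 50% selectivity mostly hits the mixed path.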

Some incidental changes:

  • Implemented a fast conversion from boolean filter to take indices (aka "selection vector"), compute::internal::GetTakeIndices. I have also altered the implementation of filtering a record batch to use this, which should be faster (it would be good to have some benchmarks to confirm this).
  • Various expansions to the BitBlockCounter classes that I needed to support this work
  • Fixed a bug (ARROW-9142) in RandomArrayGenerator::Boolean. The probability parameter was being interpreted as the probability of a false value rather than the probability of a true value. IIUC, for a Bernoulli distribution the specified probability is P(X = 1), not P(X = 0). Could someone please confirm this?
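
As a small illustration of two of the points above, the sketch below builds a boolean mask in which the probability parameter means P(True), converts the mask to take indices, and checks that taking by those indices matches filtering directly. All helper names (`bernoulli_mask`, `get_take_indices`, `take`) are hypothetical pure-Python stand-ins chosen for this example; they are not the C++ `compute::internal::GetTakeIndices` or `RandomArrayGenerator::Boolean` implementations.

```python
import random


def bernoulli_mask(n, p_true, seed=0):
    """Return n booleans where each is True with probability p_true, i.e. P(X = 1)."""
    rng = random.Random(seed)
    return [rng.random() < p_true for _ in range(n)]


def get_take_indices(mask):
    """Convert a boolean filter into the positions it selects (a selection vector)."""
    return [i for i, keep in enumerate(mask) if keep]


def take(values, indices):
    """Gather values at the given positions."""
    return [values[i] for i in indices]


values = list(range(10_000))
mask = bernoulli_mask(len(values), p_true=0.01)
indices = get_take_indices(mask)

# Filtering directly and taking by the derived indices select the same values.
filtered = [v for v, keep in zip(values, mask) if keep]
assert take(values, indices) == filtered
```

Once the indices are known, the same gather step can be reused for every column of a record batch, which may be why this conversion helps the record batch case; that remains to be confirmed by benchmarks, as noted above.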

@wesm
Member Author

wesm commented Jun 16, 2020

Here are the benchmark runs on my machine

If you want to benchmark yourself, please use this branch for the "before": https://github.com/wesm/arrow/tree/ARROW-9075-comparison. It contains the RandomArrayGenerator::Boolean change and some other changes to the benchmarks, without which the results will not be comparable.

@wesm
Member Author

wesm commented Jun 16, 2020

Here are some simple numbers showing the perf before and after in Python. This example uses a high-selectivity filter (all but one value selected) and two low-selectivity filters (1/100 and 1/1000 of values selected):

import numpy as np
import pandas as pd
import pyarrow as pa
import pyarrow.compute as pc

string_values = pa.array([pd.util.testing.rands(16)
                          for i in range(10000)] * 100)
double_values = pa.array(np.random.randn(1000000))

all_but_one = np.ones(len(string_values), dtype=bool)
all_but_one[500000] = False

one_in_100 = np.array(np.random.binomial(1, 0.01, size=1000000), dtype=bool)
one_in_1000 = np.array(np.random.binomial(1, 0.001, size=1000000), dtype=bool)

before:

In [2]: timeit pc.filter(double_values, one_in_100)                                                                                                
2.06 ms ± 41.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [3]: timeit pc.filter(double_values, one_in_1000)                                                                                               
1.82 ms ± 3.69 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [4]: timeit pc.filter(double_values, all_but_one)                                                                                               
5.75 ms ± 15.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [5]: timeit pc.filter(string_values, one_in_100)                                                                                                
2.23 ms ± 14.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [6]: timeit pc.filter(string_values, one_in_1000)                                                                                               
1.85 ms ± 3.92 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [7]: timeit pc.filter(string_values, all_but_one)                                                                                               
11.6 ms ± 183 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

after:

In [4]: timeit pc.filter(double_values, one_in_100)                                               
1.1 ms ± 7.03 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [5]: timeit pc.filter(double_values, one_in_1000)
531 µs ± 8.52 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [7]: timeit pc.filter(double_values, all_but_one)                                              
1.83 ms ± 7.36 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [10]: timeit pc.filter(string_values, one_in_100)                                                                                               
1.28 ms ± 3.16 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [11]: timeit pc.filter(string_values, one_in_1000)                                                                                              
561 µs ± 1.69 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [12]: timeit pc.filter(string_values, all_but_one)                                                                                              
6.66 ms ± 34.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

EDIT: updated benchmarks for low-selectivity optimization

@wesm
Member Author

wesm commented Jun 16, 2020

The RTools 4.0 build failure is spurious. This is ready for review.

@wesm
Member Author

wesm commented Jun 16, 2020

I implemented some other optimizations, especially for the case where neither the values nor the filter contain nulls. I'm working on updated benchmarks.

Updated benchmarks: https://gist.github.com/wesm/ad07cec1613b6327926dfe1d95e7f4f0/revisions?diff=split

@wesm
Member Author

wesm commented Jun 16, 2020

I found some issues in the Python benchmarks I posted before. Here's the updated setup and current numbers.

Setup (the prior results included the cost of converting NumPy booleans to Arrow booleans, so the filters are now converted up front). I also added a "worst case scenario" where 50% of values are not selected:

import numpy as np
import pandas as pd
import pyarrow as pa
import pyarrow.compute as pc

string_values = pa.array([pd.util.testing.rands(16)
                          for i in range(10000)] * 100)
double_values = pa.array(np.random.randn(1000000))

all_but_one = np.ones(len(string_values), dtype=bool)
all_but_one[500000] = False

one_in_2 = np.array(np.random.binomial(1, 0.50, size=1000000), dtype=bool)
one_in_100 = np.array(np.random.binomial(1, 0.01, size=1000000), dtype=bool)
one_in_1000 = np.array(np.random.binomial(1, 0.001, size=1000000), dtype=bool)

all_but_one = pa.array(all_but_one)
one_in_2 = pa.array(one_in_2)
one_in_100 = pa.array(one_in_100)
one_in_1000 = pa.array(one_in_1000)

before:

In [2]: timeit pc.filter(double_values, all_but_one)                                                                                                                                           
5.15 ms ± 26.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [3]: timeit pc.filter(double_values, one_in_100)                                                                                                                                            
1.45 ms ± 8.44 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [4]: timeit pc.filter(double_values, one_in_1000)                                                                                                                                           
1.37 ms ± 8.25 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [5]: timeit pc.filter(double_values, one_in_2)                                                                                                                                              
7.08 ms ± 108 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [6]: timeit pc.filter(string_values, all_but_one)                                                                                                                                           
11 ms ± 204 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [7]: timeit pc.filter(string_values, one_in_100)                                                                                                                                            
1.64 ms ± 9.58 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [8]: timeit pc.filter(string_values, one_in_1000)                                                                                                                                           
1.45 ms ± 4.21 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [9]: timeit pc.filter(string_values, one_in_2)                                                                                                                                              
11.4 ms ± 117 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

after:

In [2]: timeit pc.filter(double_values, all_but_one)                                                                                                                                           
370 µs ± 2.69 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [3]: timeit pc.filter(double_values, one_in_100)                                                                                                                                            
645 µs ± 3.82 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [4]: timeit pc.filter(double_values, one_in_1000)                                                                                                                                           
124 µs ± 1.51 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [6]: timeit pc.filter(double_values, one_in_2)                                                                                                                                              
5.11 ms ± 38.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [8]: timeit pc.filter(string_values, all_but_one)                                                                                                                                           
6.51 ms ± 21.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [9]: timeit pc.filter(string_values, one_in_100)                                                                                                                                            
680 µs ± 3.63 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [10]: timeit pc.filter(string_values, one_in_1000)                                                                                                                                          
188 µs ± 849 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [11]: timeit pc.filter(string_values, one_in_2)                                                                                                                                             
7.73 ms ± 63.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
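
To make the before/after comparison easier to read, the ratios can be worked out from the mean timings quoted above. This is just arithmetic on the reported means (converted to microseconds), ignoring the standard deviations; the dictionary layout is only a convenience for the example.

```python
# Mean timings in microseconds, copied from the "before" and "after" runs above.
before = {
    ("double", "all_but_one"): 5150, ("double", "one_in_100"): 1450,
    ("double", "one_in_1000"): 1370, ("double", "one_in_2"): 7080,
    ("string", "all_but_one"): 11000, ("string", "one_in_100"): 1640,
    ("string", "one_in_1000"): 1450, ("string", "one_in_2"): 11400,
}
after = {
    ("double", "all_but_one"): 370, ("double", "one_in_100"): 645,
    ("double", "one_in_1000"): 124, ("double", "one_in_2"): 5110,
    ("string", "all_but_one"): 6510, ("string", "one_in_100"): 680,
    ("string", "one_in_1000"): 188, ("string", "one_in_2"): 7730,
}

# Speedup = time before / time after, so values above 1 mean "faster now".
speedup = {key: before[key] / after[key] for key in before}

for (dtype, filt), ratio in sorted(speedup.items(), key=lambda kv: -kv[1]):
    print(f"{dtype:6s} {filt:12s} {ratio:5.1f}x")
```

On these numbers every case is faster, ranging from roughly 1.4x for the 50% "worst case" on doubles to roughly 14x for the all-but-one filter on doubles.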

@wesm
Member Author

wesm commented Jun 16, 2020

@ursabot benchmark --help

@ursabot

ursabot commented Jun 16, 2020

Usage: @ursabot benchmark [OPTIONS] [<baseline>]

  Run the benchmark suite in comparison mode.

  This command will run the benchmark suite for tip of the branch commit
  against `<baseline>` (or master if not provided).

  Examples:

  # Run the all the benchmarks
  @ursabot benchmark

  # Compare only benchmarks where the name matches the /^Sum/ regex
  @ursabot benchmark --benchmark-filter=^Sum

  # Compare only benchmarks where the suite matches the /compute-/ regex.
  # A suite is the C++ binary.
  @ursabot benchmark --suite-filter=compute-

  # Sometimes a new optimization requires the addition of new benchmarks to
  # quantify the performance increase. When doing this be sure to add the
  # benchmark in a separate commit before introducing the optimization.
  #
  # Note that specifying the baseline is the only way to compare using a new
  # benchmark, since master does not contain the new benchmark and no
  # comparison is possible.
  #
  # The following command compares the results of matching benchmarks,
  # compiling against HEAD and the provided baseline commit, e.g. eaf8302.
  # You can use this to quantify the performance improvement of new
  # optimizations or to check for regressions.
  @ursabot benchmark --benchmark-filter=MyBenchmark eaf8302

Options:
  --suite-filter <regex>      Regex filtering benchmark suites.
  --benchmark-filter <regex>  Regex filtering benchmarks.
  --help                      Show this message and exit.

@wesm
Member Author

wesm commented Jun 16, 2020

@ursabot benchmark --benchmark-filter=Filter 66df3d0

@ursabot

ursabot commented Jun 16, 2020

AMD64 Ubuntu 18.04 C++ Benchmark (#112487) builder has been succeeded.

Revision: 31a66630f6bcb9a3f74912da7d31ac2412e97184

  =======================================  ===============  ===============  =========
  benchmark                                baseline         contender        change
  =======================================  ===============  ===============  =========
  FilterInt64FilterWithNulls/262144/3      563.800 MiB/sec  576.625 MiB/sec  2.275%
- FilterStringFilterWithNulls/262144/3     498.174 MiB/sec  434.196 MiB/sec  -12.842%
  FilterFSLInt64FilterNoNulls/262144/9     158.897 MiB/sec  268.195 MiB/sec  68.785%
  FilterInt64FilterNoNulls/262144/14       2.793 GiB/sec    6.554 GiB/sec    134.709%
  FilterFSLInt64FilterNoNulls/262144/11    2.356 GiB/sec    5.386 GiB/sec    128.589%
  FilterStringFilterNoNulls/262144/2       4.937 GiB/sec    10.996 GiB/sec   122.715%
  FilterFSLInt64FilterWithNulls/262144/5   1.590 GiB/sec    4.193 GiB/sec    163.732%
  FilterInt64FilterWithNulls/262144/12     519.932 MiB/sec  496.829 MiB/sec  -4.443%
  FilterInt64FilterNoNulls/262144/0        669.365 MiB/sec  7.541 GiB/sec    1053.558%
  FilterFSLInt64FilterNoNulls/262144/1     268.027 MiB/sec  560.837 MiB/sec  109.246%
  FilterStringFilterNoNulls/262144/6       488.692 MiB/sec  481.827 MiB/sec  -1.405%
  FilterInt64FilterNoNulls/262144/8        2.735 GiB/sec    6.313 GiB/sec    130.810%
  FilterInt64FilterNoNulls/262144/5        2.809 GiB/sec    6.018 GiB/sec    114.267%
- FilterStringFilterWithNulls/262144/12    84.168 MiB/sec   70.410 MiB/sec   -16.346%
  FilterFSLInt64FilterNoNulls/262144/0     169.867 MiB/sec  718.594 MiB/sec  323.035%
  FilterStringFilterWithNulls/262144/14    355.644 MiB/sec  878.914 MiB/sec  147.133%
  FilterStringFilterWithNulls/262144/2     3.338 GiB/sec    8.903 GiB/sec    166.736%
  FilterFSLInt64FilterWithNulls/262144/1   263.151 MiB/sec  512.905 MiB/sec  94.909%
  FilterFSLInt64FilterNoNulls/262144/14    2.395 GiB/sec    5.212 GiB/sec    117.604%
  FilterInt64FilterWithNulls/262144/11     1.729 GiB/sec    4.684 GiB/sec    170.948%
  FilterInt64FilterNoNulls/262144/9        566.051 MiB/sec  3.083 GiB/sec    457.794%
- FilterStringFilterWithNulls/262144/10    619.724 MiB/sec  578.798 MiB/sec  -6.604%
  FilterInt64FilterWithNulls/262144/1      541.616 MiB/sec  558.958 MiB/sec  3.202%
  FilterFSLInt64FilterWithNulls/262144/14  1.596 GiB/sec    4.061 GiB/sec    154.454%
  FilterFSLInt64FilterWithNulls/262144/0   170.064 MiB/sec  398.738 MiB/sec  134.464%
  FilterInt64FilterWithNulls/262144/2      1.739 GiB/sec    4.883 GiB/sec    180.721%
  FilterInt64FilterWithNulls/262144/4      528.271 MiB/sec  555.772 MiB/sec  5.206%
  FilterFSLInt64FilterNoNulls/262144/2     2.383 GiB/sec    6.074 GiB/sec    154.832%
  FilterInt64FilterNoNulls/262144/4        584.370 MiB/sec  579.728 MiB/sec  -0.794%
  FilterInt64FilterNoNulls/262144/12       575.177 MiB/sec  3.023 GiB/sec    438.268%
- FilterStringFilterWithNulls/262144/9     459.179 MiB/sec  394.515 MiB/sec  -14.083%
  FilterStringFilterNoNulls/262144/5       4.936 GiB/sec    10.562 GiB/sec   113.987%
  FilterInt64FilterNoNulls/262144/2        2.838 GiB/sec    7.390 GiB/sec    160.374%
  FilterFSLInt64FilterNoNulls/262144/7     261.996 MiB/sec  464.922 MiB/sec  77.454%
  FilterStringFilterNoNulls/262144/14      580.305 MiB/sec  1.253 GiB/sec    121.158%
  FilterFSLInt64FilterWithNulls/262144/13  249.426 MiB/sec  386.982 MiB/sec  55.149%
- FilterInt64FilterWithNulls/262144/9      530.774 MiB/sec  497.368 MiB/sec  -6.294%
  FilterStringFilterWithNulls/262144/8     3.270 GiB/sec    8.467 GiB/sec    158.943%
  FilterFSLInt64FilterNoNulls/262144/10    257.812 MiB/sec  390.196 MiB/sec  51.349%
- FilterStringFilterNoNulls/262144/13      98.039 MiB/sec   90.475 MiB/sec   -7.716%
  FilterInt64FilterWithNulls/262144/8      1.737 GiB/sec    4.652 GiB/sec    167.790%
  FilterFSLInt64FilterWithNulls/262144/3   167.057 MiB/sec  351.817 MiB/sec  110.597%
- FilterStringFilterWithNulls/262144/6     494.580 MiB/sec  429.801 MiB/sec  -13.098%
  FilterFSLInt64FilterWithNulls/262144/12  165.174 MiB/sec  262.176 MiB/sec  58.728%
  FilterInt64FilterWithNulls/262144/7      526.592 MiB/sec  541.187 MiB/sec  2.772%
  FilterStringFilterNoNulls/262144/11      4.531 GiB/sec    9.652 GiB/sec    113.006%
  FilterStringFilterWithNulls/262144/1     662.260 MiB/sec  633.359 MiB/sec  -4.364%
  FilterStringFilterWithNulls/262144/4     670.467 MiB/sec  644.877 MiB/sec  -3.817%
  FilterStringFilterNoNulls/262144/0       503.582 MiB/sec  550.304 MiB/sec  9.278%
- FilterStringFilterNoNulls/262144/9       443.066 MiB/sec  390.416 MiB/sec  -11.883%
  FilterFSLInt64FilterNoNulls/262144/13    251.747 MiB/sec  351.809 MiB/sec  39.747%
  FilterInt64FilterNoNulls/262144/11       2.788 GiB/sec    6.687 GiB/sec    139.878%
- FilterInt64FilterWithNulls/262144/0      620.421 MiB/sec  585.692 MiB/sec  -5.598%
  FilterFSLInt64FilterWithNulls/262144/8   1.593 GiB/sec    4.155 GiB/sec    160.783%
- FilterStringFilterNoNulls/262144/7       692.942 MiB/sec  654.463 MiB/sec  -5.553%
  FilterStringFilterNoNulls/262144/8       4.900 GiB/sec    10.519 GiB/sec   114.694%
  FilterInt64FilterWithNulls/262144/10     510.602 MiB/sec  527.612 MiB/sec  3.331%
  FilterFSLInt64FilterNoNulls/262144/3     159.401 MiB/sec  555.494 MiB/sec  248.487%
  FilterFSLInt64FilterNoNulls/262144/6     162.294 MiB/sec  399.907 MiB/sec  146.410%
- FilterStringFilterWithNulls/262144/0     517.359 MiB/sec  439.657 MiB/sec  -15.019%
  FilterInt64FilterWithNulls/262144/13     502.220 MiB/sec  527.971 MiB/sec  5.128%
  FilterStringFilterWithNulls/262144/7     666.386 MiB/sec  638.254 MiB/sec  -4.221%
  FilterInt64FilterNoNulls/262144/6        603.261 MiB/sec  3.473 GiB/sec    489.518%
  FilterStringFilterWithNulls/262144/11    2.994 GiB/sec    8.094 GiB/sec    170.304%
  FilterFSLInt64FilterWithNulls/262144/6   165.225 MiB/sec  335.017 MiB/sec  102.765%
  FilterFSLInt64FilterWithNulls/262144/7   257.333 MiB/sec  466.760 MiB/sec  81.383%
  FilterInt64FilterNoNulls/262144/7        583.317 MiB/sec  564.896 MiB/sec  -3.158%
  FilterStringFilterNoNulls/262144/4       691.530 MiB/sec  699.221 MiB/sec  1.112%
  FilterFSLInt64FilterWithNulls/262144/11  1.592 GiB/sec    4.057 GiB/sec    154.837%
- FilterStringFilterNoNulls/262144/12      88.970 MiB/sec   70.067 MiB/sec   -21.246%
  FilterInt64FilterNoNulls/262144/10       562.254 MiB/sec  545.802 MiB/sec  -2.926%
  FilterInt64FilterWithNulls/262144/14     1.738 GiB/sec    4.747 GiB/sec    173.077%
  FilterFSLInt64FilterWithNulls/262144/2   1.570 GiB/sec    4.295 GiB/sec    173.597%
  FilterInt64FilterNoNulls/262144/13       558.715 MiB/sec  554.622 MiB/sec  -0.733%
  FilterInt64FilterWithNulls/262144/6      561.253 MiB/sec  537.786 MiB/sec  -4.181%
  FilterStringFilterWithNulls/262144/13    91.370 MiB/sec   89.650 MiB/sec   -1.882%
  FilterFSLInt64FilterNoNulls/262144/12    153.042 MiB/sec  241.416 MiB/sec  57.745%
  FilterFSLInt64FilterNoNulls/262144/5     2.414 GiB/sec    5.672 GiB/sec    134.917%
  FilterFSLInt64FilterNoNulls/262144/8     2.377 GiB/sec    5.541 GiB/sec    133.082%
- FilterStringFilterNoNulls/262144/10      632.556 MiB/sec  572.816 MiB/sec  -9.444%
  FilterFSLInt64FilterWithNulls/262144/9   166.869 MiB/sec  288.049 MiB/sec  72.620%
  FilterInt64FilterNoNulls/262144/1        599.855 MiB/sec  912.146 MiB/sec  52.061%
  FilterStringFilterWithNulls/262144/5     3.295 GiB/sec    8.587 GiB/sec    160.574%
  FilterFSLInt64FilterNoNulls/262144/4     263.896 MiB/sec  514.836 MiB/sec  95.091%
  FilterFSLInt64FilterWithNulls/262144/4   258.744 MiB/sec  477.042 MiB/sec  84.369%
  FilterInt64FilterWithNulls/262144/5      1.735 GiB/sec    4.728 GiB/sec    172.542%
  FilterStringFilterNoNulls/262144/3       495.135 MiB/sec  539.178 MiB/sec  8.895%
  FilterInt64FilterNoNulls/262144/3        611.978 MiB/sec  3.929 GiB/sec    557.402%
  FilterFSLInt64FilterWithNulls/262144/10  255.156 MiB/sec  417.072 MiB/sec  63.458%
  FilterStringFilterNoNulls/262144/1       714.448 MiB/sec  713.457 MiB/sec  -0.139%
  =======================================  ===============  ===============  =========
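
For anyone cross-checking the table, the change column appears to be the relative difference between contender and baseline throughput. The sketch below assumes that formula and assumes 1 GiB = 1024 MiB; neither is stated by the bot, and the helper names are made up for this example. Small mismatches are expected because the table shows rounded throughput values.

```python
# Conversion factors to MiB/sec (assumes binary units: 1 GiB = 1024 MiB).
UNIT_TO_MIB = {"MiB/sec": 1.0, "GiB/sec": 1024.0}


def to_mib_per_sec(text):
    """Parse a cell such as '7.541 GiB/sec' into a float in MiB/sec."""
    number, unit = text.split()
    return float(number) * UNIT_TO_MIB[unit]


def percent_change(baseline, contender):
    """Relative change of contender versus baseline, in percent (assumed formula)."""
    base = to_mib_per_sec(baseline)
    return (to_mib_per_sec(contender) - base) / base * 100.0


# Rows copied from the table above.
print(round(percent_change("563.800 MiB/sec", "576.625 MiB/sec"), 3))
print(round(percent_change("669.365 MiB/sec", "7.541 GiB/sec"), 1))
print(round(percent_change("498.174 MiB/sec", "434.196 MiB/sec"), 3))
```

Rows that mix MiB/sec and GiB/sec, such as FilterInt64FilterNoNulls/262144/0, need the unit conversion before the ratio is meaningful.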

@wesm
Member Author

wesm commented Jun 16, 2020

The string perf regressions are mostly for the cases where 99.9% of the values are selected. I'll take a closer look at this to see what can be done. The varbinary case is so important that we might want to create a specialized implementation for it.

@fsaintjacques
Contributor

Still, a 10% decrease for string is highly tolerable for a 50-150% increase for all other types.

@wesm
Member Author

wesm commented Jun 16, 2020

True. I think for binary-based types we need to implement bulk-block-appends. It's beyond the scope of this PR -- I will take a brief look to see if there's anything dumb (like messing up the preallocation) that I did that's making things slower
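
One possible reading of "bulk-block-appends" for binary-based types is sketched below: instead of appending each selected value separately, copy the bytes of a whole run of consecutive selected values with a single slice and rebase the offsets. This is an assumption about what such an optimization could look like, written in pure Python for illustration; `filter_binary_by_runs` is a hypothetical name and this is not code from the PR or from ARROW-9152.

```python
def filter_binary_by_runs(offsets, data, mask):
    """Filter a varbinary array given as (offsets, data bytes) by a boolean mask.

    Each run of consecutive selected values is copied with one slice of `data`.
    Illustrative sketch only; ignores nulls.
    """
    out_offsets = [0]
    out_data = bytearray()
    i, n = 0, len(mask)
    while i < n:
        if not mask[i]:
            i += 1
            continue
        run_start = i
        while i < n and mask[i]:
            i += 1
        # One bulk append for the bytes of values run_start .. i - 1.
        out_data += data[offsets[run_start]:offsets[i]]
        # Rebase the run's end offsets onto the output buffer.
        shift = out_offsets[-1] - offsets[run_start]
        out_offsets.extend(offsets[j + 1] + shift for j in range(run_start, i))
    return out_offsets, bytes(out_data)


# Values: b"a", b"bc", b"", b"def", b"g"
offsets = [0, 1, 3, 3, 6, 7]
data = b"abcdefg"
new_offsets, new_data = filter_binary_by_runs(
    offsets, data, [True, True, False, True, False]
)
```

When nearly all values are selected, as in the regressing 99.9% cases, the runs are long, so an approach along these lines would do a few large copies rather than many small ones.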

@wesm
Member Author

wesm commented Jun 16, 2020

I'll have to deal with the string optimization in a follow-up PR, so I'm going to leave this for review as is. It would be good to get this merged sooner rather than later.

EDIT: opened https://issues.apache.org/jira/browse/ARROW-9152

@pitrou
Member

pitrou commented Jun 17, 2020

Everything is much faster here, including string filtering.

Member

@pitrou pitrou left a comment

I haven't taken a look at everything.

Contributor

@fsaintjacques fsaintjacques left a comment

Some comments regarding testing and implementation.

Member

This should probably be extracted as a ScalarFunction named popcount or so (as a follow-up).

Member Author

OK

@wesm
Member Author

wesm commented Jun 17, 2020

@ursabot benchmark --benchmark-filter=Filter c4f425768

@wesm
Member Author

wesm commented Jun 17, 2020

I think I improved some of the readability problems and addressed the other comments. I'd like to merge this soon, once CI is green.

@ursabot

ursabot commented Jun 17, 2020

AMD64 Ubuntu 18.04 C++ Benchmark (#112952) builder has been succeeded.

Revision: f50b39e54c50e8a53606eda486c88e6ec51d7006

  =======================================  ===============  ================  ========
  benchmark                                baseline         contender         change
  =======================================  ===============  ================  ========
- FilterFSLInt64FilterNoNulls/262144/14    5.457 GiB/sec    4.398 GiB/sec     -19.404%
  FilterStringFilterWithNulls/262144/4     642.405 MiB/sec  677.920 MiB/sec   5.528%
- FilterFSLInt64FilterNoNulls/262144/7     463.992 MiB/sec  378.391 MiB/sec   -18.449%
  FilterFSLInt64FilterWithNulls/262144/6   333.996 MiB/sec  320.327 MiB/sec   -4.093%
- FilterFSLInt64FilterWithNulls/262144/1   516.189 MiB/sec  459.926 MiB/sec   -10.900%
- FilterStringFilterNoNulls/262144/4       681.504 MiB/sec  595.788 MiB/sec   -12.577%
- FilterFSLInt64FilterNoNulls/262144/8     5.889 GiB/sec    4.675 GiB/sec     -20.610%
- FilterInt64FilterWithNulls/262144/10     606.960 MiB/sec  547.973 MiB/sec   -9.718%
- FilterInt64FilterNoNulls/262144/7        638.264 MiB/sec  568.923 MiB/sec   -10.864%
  FilterStringFilterWithNulls/262144/6     431.474 MiB/sec  484.077 MiB/sec   12.191%
- FilterStringFilterNoNulls/262144/14      1.245 GiB/sec    1008.386 MiB/sec  -20.893%
  FilterFSLInt64FilterWithNulls/262144/11  4.239 GiB/sec    4.029 GiB/sec     -4.954%
- FilterStringFilterNoNulls/262144/8       10.899 GiB/sec   8.494 GiB/sec     -22.064%
- FilterFSLInt64FilterNoNulls/262144/4     515.626 MiB/sec  406.426 MiB/sec   -21.178%
  FilterInt64FilterNoNulls/262144/6        3.697 GiB/sec    3.525 GiB/sec     -4.664%
  FilterInt64FilterNoNulls/262144/8        6.829 GiB/sec    6.809 GiB/sec     -0.301%
- FilterFSLInt64FilterNoNulls/262144/2     6.453 GiB/sec    4.950 GiB/sec     -23.289%
- FilterInt64FilterWithNulls/262144/13     606.984 MiB/sec  548.948 MiB/sec   -9.561%
- FilterStringFilterNoNulls/262144/1       707.132 MiB/sec  609.027 MiB/sec   -13.874%
  FilterStringFilterWithNulls/262144/3     436.301 MiB/sec  488.825 MiB/sec   12.038%
  FilterStringFilterWithNulls/262144/1     616.105 MiB/sec  675.493 MiB/sec   9.639%
  FilterStringFilterNoNulls/262144/3       548.660 MiB/sec  533.539 MiB/sec   -2.756%
- FilterFSLInt64FilterNoNulls/262144/9     268.363 MiB/sec  250.359 MiB/sec   -6.709%
- FilterStringFilterNoNulls/262144/13      89.995 MiB/sec   76.326 MiB/sec    -15.189%
  FilterStringFilterWithNulls/262144/12    71.366 MiB/sec   82.415 MiB/sec    15.483%
  FilterInt64FilterNoNulls/262144/9        3.209 GiB/sec    3.114 GiB/sec     -2.971%
  FilterFSLInt64FilterWithNulls/262144/9   288.819 MiB/sec  276.679 MiB/sec   -4.203%
  FilterStringFilterNoNulls/262144/12      66.141 MiB/sec   65.509 MiB/sec    -0.956%
- FilterFSLInt64FilterWithNulls/262144/4   474.907 MiB/sec  429.013 MiB/sec   -9.664%
- FilterInt64FilterWithNulls/262144/1      651.659 MiB/sec  556.258 MiB/sec   -14.640%
  FilterStringFilterWithNulls/262144/14    911.019 MiB/sec  871.756 MiB/sec   -4.310%
- FilterInt64FilterNoNulls/262144/4        675.941 MiB/sec  569.448 MiB/sec   -15.755%
- FilterFSLInt64FilterNoNulls/262144/13    352.227 MiB/sec  307.638 MiB/sec   -12.659%
  FilterInt64FilterWithNulls/262144/5      5.129 GiB/sec    4.921 GiB/sec     -4.068%
- FilterFSLInt64FilterWithNulls/262144/14  4.168 GiB/sec    3.909 GiB/sec     -6.200%
  FilterStringFilterWithNulls/262144/9     396.156 MiB/sec  442.591 MiB/sec   11.721%
- FilterFSLInt64FilterNoNulls/262144/3     554.664 MiB/sec  464.787 MiB/sec   -16.204%
- FilterStringFilterNoNulls/262144/2       11.394 GiB/sec   8.924 GiB/sec     -21.683%
- FilterStringFilterWithNulls/262144/8     8.856 GiB/sec    8.075 GiB/sec     -8.825%
- FilterFSLInt64FilterNoNulls/262144/10    389.368 MiB/sec  333.033 MiB/sec   -14.468%
- FilterFSLInt64FilterNoNulls/262144/11    5.587 GiB/sec    4.507 GiB/sec     -19.338%
  FilterStringFilterWithNulls/262144/10    580.314 MiB/sec  612.106 MiB/sec   5.478%
- FilterFSLInt64FilterNoNulls/262144/5     6.032 GiB/sec    4.717 GiB/sec     -21.802%
- FilterFSLInt64FilterNoNulls/262144/0     725.211 MiB/sec  565.535 MiB/sec   -22.018%
- FilterInt64FilterNoNulls/262144/3        4.266 GiB/sec    3.855 GiB/sec     -9.641%
- FilterInt64FilterWithNulls/262144/12     549.159 MiB/sec  499.761 MiB/sec   -8.995%
- FilterInt64FilterWithNulls/262144/0      622.810 MiB/sec  497.075 MiB/sec   -20.188%
- FilterInt64FilterNoNulls/262144/1        1.021 GiB/sec    980.686 MiB/sec   -6.230%
- FilterFSLInt64FilterWithNulls/262144/0   399.890 MiB/sec  375.677 MiB/sec   -6.055%
- FilterFSLInt64FilterWithNulls/262144/2   4.497 GiB/sec    4.233 GiB/sec     -5.880%
- FilterFSLInt64FilterNoNulls/262144/1     564.700 MiB/sec  431.560 MiB/sec   -23.577%
- FilterInt64FilterWithNulls/262144/9      549.832 MiB/sec  499.657 MiB/sec   -9.125%
- FilterInt64FilterWithNulls/262144/7      625.701 MiB/sec  550.091 MiB/sec   -12.084%
  FilterInt64FilterNoNulls/262144/14       6.386 GiB/sec    6.901 GiB/sec     8.073%
  FilterInt64FilterWithNulls/262144/8      5.034 GiB/sec    4.958 GiB/sec     -1.517%
  FilterInt64FilterNoNulls/262144/12       3.215 GiB/sec    3.131 GiB/sec     -2.607%
  FilterStringFilterNoNulls/262144/0       560.832 MiB/sec  545.275 MiB/sec   -2.774%
- FilterStringFilterNoNulls/262144/7       641.313 MiB/sec  582.952 MiB/sec   -9.100%
- FilterInt64FilterWithNulls/262144/3      615.558 MiB/sec  496.003 MiB/sec   -19.422%
- FilterStringFilterNoNulls/262144/10      578.560 MiB/sec  506.085 MiB/sec   -12.527%
  FilterInt64FilterWithNulls/262144/14     4.934 GiB/sec    4.873 GiB/sec     -1.228%
  FilterInt64FilterNoNulls/262144/5        7.145 GiB/sec    6.863 GiB/sec     -3.945%
  FilterStringFilterWithNulls/262144/7     632.496 MiB/sec  669.411 MiB/sec   5.836%
  FilterInt64FilterWithNulls/262144/11     4.937 GiB/sec    4.860 GiB/sec     -1.544%
- FilterStringFilterWithNulls/262144/5     9.095 GiB/sec    8.275 GiB/sec     -9.015%
  FilterStringFilterNoNulls/262144/6       483.482 MiB/sec  470.273 MiB/sec   -2.732%
- FilterFSLInt64FilterWithNulls/262144/7   464.358 MiB/sec  418.157 MiB/sec   -9.949%
- FilterStringFilterNoNulls/262144/11      10.039 GiB/sec   7.873 GiB/sec     -21.572%
  FilterInt64FilterNoNulls/262144/11       6.389 GiB/sec    6.942 GiB/sec     8.664%
- FilterFSLInt64FilterNoNulls/262144/6     400.926 MiB/sec  355.070 MiB/sec   -11.437%
- FilterStringFilterNoNulls/262144/5       10.942 GiB/sec   8.621 GiB/sec     -21.211%
  FilterInt64FilterNoNulls/262144/2        7.901 GiB/sec    7.942 GiB/sec     0.526%
- FilterFSLInt64FilterWithNulls/262144/13  387.523 MiB/sec  354.145 MiB/sec   -8.613%
- FilterInt64FilterNoNulls/262144/10       635.634 MiB/sec  574.368 MiB/sec   -9.639%
- FilterStringFilterWithNulls/262144/11    8.363 GiB/sec    7.663 GiB/sec     -8.365%
- FilterInt64FilterWithNulls/262144/4      644.733 MiB/sec  554.689 MiB/sec   -13.966%
- FilterInt64FilterWithNulls/262144/2      5.308 GiB/sec    4.950 GiB/sec     -6.739%
- FilterInt64FilterWithNulls/262144/6      582.743 MiB/sec  494.561 MiB/sec   -15.132%
  FilterFSLInt64FilterWithNulls/262144/5   4.299 GiB/sec    4.094 GiB/sec     -4.757%
  FilterInt64FilterNoNulls/262144/0        7.685 GiB/sec    8.021 GiB/sec     4.371%
- FilterInt64FilterNoNulls/262144/13       634.999 MiB/sec  574.211 MiB/sec   -9.573%
- FilterStringFilterWithNulls/262144/2     9.478 GiB/sec    8.593 GiB/sec     -9.337%
  FilterFSLInt64FilterWithNulls/262144/8   4.256 GiB/sec    4.060 GiB/sec     -4.609%
- FilterFSLInt64FilterWithNulls/262144/10  422.316 MiB/sec  380.968 MiB/sec   -9.791%
  FilterStringFilterNoNulls/262144/9       383.197 MiB/sec  374.020 MiB/sec   -2.395%
- FilterFSLInt64FilterNoNulls/262144/12    242.820 MiB/sec  227.762 MiB/sec   -6.201%
  FilterStringFilterWithNulls/262144/0     429.008 MiB/sec  493.378 MiB/sec   15.004%
- FilterFSLInt64FilterWithNulls/262144/12  267.881 MiB/sec  249.827 MiB/sec   -6.739%
  FilterFSLInt64FilterWithNulls/262144/3   349.988 MiB/sec  337.076 MiB/sec   -3.689%
  FilterStringFilterWithNulls/262144/13    90.911 MiB/sec   97.476 MiB/sec    7.222%
  =======================================  ===============  ================  ========

@wesm
Member Author

wesm commented Jun 17, 2020

Something is weird with the commit history, so I'm not sure those benchmarks are right. I'll rebase things again and rerun.

wesm added 2 commits June 17, 2020 12:52
Small fix

More work, start writing filter -> selection vector

Things compiling again finally

BinaryBitBlockCounter tests passing

Consolidate take/filter tests in same module, fix GetTakeIndices / GetFilterOutputSize unit tests and implementations

Finish filter implementation, tests passing again

Clean up includes

Tweak benchmark parameters

Some string streamlining

Python fixes

Python test fixes. Add fast path for low-selectivity filters

Low selectivity path for non-primitive filtering

VisitFilter is not a dependent template

Implement some obvious non-null filter optimizations

Fix typo
…ter paths less spaghetti

Split primitive filter paths between DROP/EMIT_NULL, improve readability
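For readers following the commit list, here is a minimal pure-Python sketch of the "filter -> selection vector" idea with a block-level fast path. It is an illustration only, not the C++ code in this PR: the function name `get_take_indices`, the `emit_null` flag, and the block size of 64 are assumptions chosen for the example, and `None` stands in for a null filter slot.

```python
from typing import List, Optional, Sequence

BLOCK = 64  # illustrative block size; an assumption, not taken from the PR


def get_take_indices(mask: Sequence[Optional[bool]],
                     emit_null: bool = False) -> List[Optional[int]]:
    """Convert a boolean filter into take indices (a "selection vector").

    None in `mask` stands for a null filter slot. With emit_null=False such
    slots are dropped; with emit_null=True they produce a None index.
    """
    indices: List[Optional[int]] = []
    for start in range(0, len(mask), BLOCK):
        block = mask[start:start + BLOCK]
        true_count = sum(1 for v in block if v is True)
        null_count = sum(1 for v in block if v is None)
        if true_count == len(block):
            # "Full" block: every slot is selected, so emit the whole range
            # without looking at individual elements again.
            indices.extend(range(start, start + len(block)))
        elif true_count == 0 and (null_count == 0 or not emit_null):
            # Nothing in this block contributes to the output.
            continue
        else:
            # Mixed block: fall back to element-by-element handling.
            for offset, v in enumerate(block):
                if v is True:
                    indices.append(start + offset)
                elif v is None and emit_null:
                    indices.append(None)
    return indices
```

The full and empty branches are where very high or very low selectivity filters would save work in a scheme like this; mixed blocks still pay the per-element cost.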
@wesm
Member Author

wesm commented Jun 17, 2020

@ursabot benchmark --benchmark-filter=Filter 04006ff

@ursabot

ursabot commented Jun 17, 2020

AMD64 Ubuntu 18.04 C++ Benchmark (#112989) builder has been succeeded.

Revision: 21227cc

  =======================================  ===============  ================  ========
  benchmark                                baseline         contender         change
  =======================================  ===============  ================  ========
- FilterStringFilterNoNulls/262144/7       637.909 MiB/sec  572.355 MiB/sec   -10.276%
- FilterStringFilterNoNulls/262144/8       10.897 GiB/sec   8.711 GiB/sec     -20.057%
  FilterStringFilterNoNulls/262144/6       485.775 MiB/sec  476.410 MiB/sec   -1.928%
  FilterStringFilterWithNulls/262144/4     649.558 MiB/sec  677.796 MiB/sec   4.347%
  FilterInt64FilterNoNulls/262144/9        3.212 GiB/sec    3.264 GiB/sec     1.612%
- FilterFSLInt64FilterNoNulls/262144/13    351.877 MiB/sec  308.073 MiB/sec   -12.449%
- FilterFSLInt64FilterNoNulls/262144/10    389.471 MiB/sec  333.418 MiB/sec   -14.392%
- FilterInt64FilterNoNulls/262144/4        668.729 MiB/sec  625.199 MiB/sec   -6.509%
  FilterFSLInt64FilterWithNulls/262144/9   287.988 MiB/sec  276.495 MiB/sec   -3.991%
- FilterStringFilterWithNulls/262144/2     9.441 GiB/sec    8.793 GiB/sec     -6.865%
  FilterStringFilterWithNulls/262144/12    73.855 MiB/sec   82.463 MiB/sec    11.656%
- FilterFSLInt64FilterNoNulls/262144/5     6.091 GiB/sec    4.403 GiB/sec     -27.714%
- FilterFSLInt64FilterNoNulls/262144/3     550.519 MiB/sec  463.959 MiB/sec   -15.723%
  FilterInt64FilterNoNulls/262144/2        7.988 GiB/sec    7.976 GiB/sec     -0.147%
- FilterStringFilterNoNulls/262144/4       700.795 MiB/sec  605.189 MiB/sec   -13.643%
- FilterFSLInt64FilterWithNulls/262144/1   516.544 MiB/sec  460.521 MiB/sec   -10.846%
- FilterStringFilterWithNulls/262144/8     8.877 GiB/sec    8.364 GiB/sec     -5.779%
- FilterFSLInt64FilterWithNulls/262144/3   350.123 MiB/sec  329.103 MiB/sec   -6.004%
  FilterStringFilterWithNulls/262144/3     435.836 MiB/sec  494.167 MiB/sec   13.384%
  FilterInt64FilterNoNulls/262144/10       630.544 MiB/sec  628.104 MiB/sec   -0.387%
- FilterStringFilterNoNulls/262144/5       11.014 GiB/sec   8.788 GiB/sec     -20.216%
  FilterInt64FilterNoNulls/262144/3        4.263 GiB/sec    4.181 GiB/sec     -1.936%
  FilterInt64FilterWithNulls/262144/1      635.637 MiB/sec  615.015 MiB/sec   -3.244%
  FilterStringFilterWithNulls/262144/7     638.645 MiB/sec  678.465 MiB/sec   6.235%
- FilterFSLInt64FilterNoNulls/262144/2     6.506 GiB/sec    5.012 GiB/sec     -22.975%
- FilterFSLInt64FilterNoNulls/262144/0     729.854 MiB/sec  569.623 MiB/sec   -21.954%
  FilterInt64FilterNoNulls/262144/5        6.946 GiB/sec    6.899 GiB/sec     -0.674%
  FilterInt64FilterWithNulls/262144/12     545.763 MiB/sec  547.657 MiB/sec   0.347%
  FilterStringFilterNoNulls/262144/9       383.858 MiB/sec  377.178 MiB/sec   -1.740%
- FilterFSLInt64FilterNoNulls/262144/8     5.825 GiB/sec    4.702 GiB/sec     -19.289%
  FilterInt64FilterNoNulls/262144/13       632.053 MiB/sec  633.157 MiB/sec   0.175%
  FilterInt64FilterNoNulls/262144/1        1.020 GiB/sec    1.022 GiB/sec     0.239%
- FilterFSLInt64FilterNoNulls/262144/12    242.197 MiB/sec  228.152 MiB/sec   -5.799%
  FilterInt64FilterWithNulls/262144/4      640.980 MiB/sec  614.192 MiB/sec   -4.179%
  FilterInt64FilterWithNulls/262144/8      4.967 GiB/sec    5.071 GiB/sec     2.102%
- FilterFSLInt64FilterWithNulls/262144/0   396.373 MiB/sec  374.388 MiB/sec   -5.546%
  FilterInt64FilterWithNulls/262144/11     4.934 GiB/sec    4.997 GiB/sec     1.282%
- FilterFSLInt64FilterNoNulls/262144/14    5.435 GiB/sec    4.459 GiB/sec     -17.946%
  FilterInt64FilterNoNulls/262144/12       3.255 GiB/sec    3.185 GiB/sec     -2.144%
  FilterStringFilterWithNulls/262144/1     638.704 MiB/sec  690.413 MiB/sec   8.096%
- FilterStringFilterNoNulls/262144/2       11.411 GiB/sec   9.040 GiB/sec     -20.778%
  FilterInt64FilterWithNulls/262144/6      582.753 MiB/sec  554.462 MiB/sec   -4.855%
  FilterStringFilterWithNulls/262144/10    586.149 MiB/sec  616.404 MiB/sec   5.162%
  FilterInt64FilterNoNulls/262144/0        7.653 GiB/sec    7.971 GiB/sec     4.146%
  FilterInt64FilterWithNulls/262144/13     590.396 MiB/sec  607.816 MiB/sec   2.951%
- FilterStringFilterNoNulls/262144/14      1.254 GiB/sec    1011.778 MiB/sec  -21.233%
- FilterFSLInt64FilterWithNulls/262144/4   474.573 MiB/sec  428.073 MiB/sec   -9.798%
  FilterInt64FilterWithNulls/262144/2      5.245 GiB/sec    5.072 GiB/sec     -3.310%
- FilterStringFilterWithNulls/262144/11    8.381 GiB/sec    7.793 GiB/sec     -7.006%
  FilterFSLInt64FilterWithNulls/262144/14  4.065 GiB/sec    3.917 GiB/sec     -3.648%
- FilterFSLInt64FilterNoNulls/262144/1     566.516 MiB/sec  432.124 MiB/sec   -23.723%
  FilterStringFilterWithNulls/262144/6     431.308 MiB/sec  489.475 MiB/sec   13.486%
- FilterFSLInt64FilterNoNulls/262144/9     267.636 MiB/sec  250.549 MiB/sec   -6.385%
- FilterFSLInt64FilterWithNulls/262144/2   4.505 GiB/sec    4.244 GiB/sec     -5.789%
- FilterStringFilterNoNulls/262144/1       699.807 MiB/sec  605.175 MiB/sec   -13.523%
  FilterInt64FilterWithNulls/262144/14     4.914 GiB/sec    4.970 GiB/sec     1.141%
- FilterStringFilterNoNulls/262144/11      9.990 GiB/sec    7.988 GiB/sec     -20.035%
- FilterStringFilterNoNulls/262144/12      70.677 MiB/sec   65.603 MiB/sec    -7.180%
  FilterStringFilterWithNulls/262144/9     395.814 MiB/sec  447.434 MiB/sec   13.042%
  FilterFSLInt64FilterWithNulls/262144/6   333.780 MiB/sec  319.575 MiB/sec   -4.256%
  FilterFSLInt64FilterWithNulls/262144/8   4.263 GiB/sec    4.091 GiB/sec     -4.021%
  FilterInt64FilterNoNulls/262144/14       6.414 GiB/sec    6.933 GiB/sec     8.095%
  FilterStringFilterWithNulls/262144/0     441.849 MiB/sec  496.266 MiB/sec   12.316%
  FilterInt64FilterNoNulls/262144/11       6.411 GiB/sec    6.874 GiB/sec     7.218%
- FilterInt64FilterNoNulls/262144/7        648.036 MiB/sec  547.011 MiB/sec   -15.589%
- FilterFSLInt64FilterWithNulls/262144/10  419.063 MiB/sec  381.681 MiB/sec   -8.920%
- FilterFSLInt64FilterWithNulls/262144/13  386.755 MiB/sec  353.726 MiB/sec   -8.540%
  FilterInt64FilterNoNulls/262144/8        6.724 GiB/sec    7.073 GiB/sec     5.190%
  FilterInt64FilterWithNulls/262144/9      545.560 MiB/sec  545.449 MiB/sec   -0.020%
- FilterStringFilterNoNulls/262144/10      575.809 MiB/sec  507.681 MiB/sec   -11.832%
- FilterStringFilterWithNulls/262144/5     9.154 GiB/sec    8.428 GiB/sec     -7.931%
  FilterStringFilterNoNulls/262144/0       519.896 MiB/sec  554.802 MiB/sec   6.714%
  FilterFSLInt64FilterWithNulls/262144/5   4.294 GiB/sec    4.126 GiB/sec     -3.911%
- FilterFSLInt64FilterNoNulls/262144/7     463.085 MiB/sec  378.577 MiB/sec   -18.249%
  FilterFSLInt64FilterWithNulls/262144/11  4.245 GiB/sec    4.061 GiB/sec     -4.333%
  FilterStringFilterNoNulls/262144/3       544.102 MiB/sec  542.846 MiB/sec   -0.231%
- FilterInt64FilterWithNulls/262144/0      617.474 MiB/sec  560.813 MiB/sec   -9.176%
  FilterInt64FilterWithNulls/262144/7      619.732 MiB/sec  609.068 MiB/sec   -1.721%
  FilterStringFilterWithNulls/262144/13    91.185 MiB/sec   97.530 MiB/sec    6.958%
- FilterStringFilterWithNulls/262144/14    929.857 MiB/sec  874.512 MiB/sec   -5.952%
- FilterInt64FilterWithNulls/262144/3      604.918 MiB/sec  560.882 MiB/sec   -7.280%
- FilterFSLInt64FilterNoNulls/262144/4     514.014 MiB/sec  411.713 MiB/sec   -19.902%
- FilterFSLInt64FilterWithNulls/262144/7   463.921 MiB/sec  417.320 MiB/sec   -10.045%
- FilterFSLInt64FilterWithNulls/262144/12  267.697 MiB/sec  247.408 MiB/sec   -7.579%
- FilterFSLInt64FilterNoNulls/262144/11    5.632 GiB/sec    4.533 GiB/sec     -19.515%
- FilterStringFilterNoNulls/262144/13      90.578 MiB/sec   76.367 MiB/sec    -15.690%
  FilterInt64FilterNoNulls/262144/6        3.709 GiB/sec    3.680 GiB/sec     -0.786%
  FilterInt64FilterWithNulls/262144/5      5.115 GiB/sec    4.997 GiB/sec     -2.309%
  FilterInt64FilterWithNulls/262144/10     604.161 MiB/sec  607.760 MiB/sec   0.596%
- FilterFSLInt64FilterNoNulls/262144/6     389.763 MiB/sec  354.969 MiB/sec   -8.927%
  =======================================  ===============  ================  ========

@wesm
Member Author

wesm commented Jun 17, 2020

These "readability" improvements made performance worse, so I'll revert them.

@wesm
Member Author

wesm commented Jun 17, 2020

@ursabot benchmark --benchmark-filter=Filter 04006ff

@ursabot

ursabot commented Jun 17, 2020

AMD64 Ubuntu 18.04 C++ Benchmark (#113048) builder has been succeeded.

Revision: 54bb838

  =======================================  ===============  ===============  ========
  benchmark                                baseline         contender        change
  =======================================  ===============  ===============  ========
  FilterStringFilterWithNulls/262144/9     395.928 MiB/sec  397.664 MiB/sec  0.439%
  FilterInt64FilterWithNulls/262144/0      621.828 MiB/sec  613.884 MiB/sec  -1.277%
  FilterStringFilterWithNulls/262144/10    578.179 MiB/sec  577.449 MiB/sec  -0.126%
  FilterFSLInt64FilterWithNulls/262144/14  4.068 GiB/sec    4.018 GiB/sec    -1.247%
  FilterInt64FilterWithNulls/262144/13     604.515 MiB/sec  575.481 MiB/sec  -4.803%
  FilterFSLInt64FilterNoNulls/262144/13    350.875 MiB/sec  355.061 MiB/sec  1.193%
  FilterStringFilterWithNulls/262144/0     441.188 MiB/sec  442.379 MiB/sec  0.270%
  FilterInt64FilterWithNulls/262144/7      623.569 MiB/sec  594.423 MiB/sec  -4.674%
  FilterStringFilterWithNulls/262144/12    73.925 MiB/sec   73.930 MiB/sec   0.007%
  FilterStringFilterNoNulls/262144/3       548.889 MiB/sec  548.269 MiB/sec  -0.113%
  FilterInt64FilterNoNulls/262144/0        7.942 GiB/sec    8.079 GiB/sec    1.727%
  FilterInt64FilterNoNulls/262144/6        3.827 GiB/sec    3.725 GiB/sec    -2.665%
  FilterStringFilterWithNulls/262144/2     9.138 GiB/sec    9.205 GiB/sec    0.726%
  FilterFSLInt64FilterWithNulls/262144/13  385.938 MiB/sec  370.599 MiB/sec  -3.975%
  FilterInt64FilterWithNulls/262144/9      549.281 MiB/sec  542.112 MiB/sec  -1.305%
  FilterInt64FilterWithNulls/262144/2      5.253 GiB/sec    5.047 GiB/sec    -3.918%
  FilterFSLInt64FilterNoNulls/262144/5     5.778 GiB/sec    5.676 GiB/sec    -1.761%
  FilterStringFilterNoNulls/262144/1       711.705 MiB/sec  697.941 MiB/sec  -1.934%
  FilterStringFilterNoNulls/262144/0       560.111 MiB/sec  560.315 MiB/sec  0.036%
  FilterStringFilterWithNulls/262144/5     8.773 GiB/sec    8.976 GiB/sec    2.318%
  FilterInt64FilterWithNulls/262144/11     4.863 GiB/sec    4.942 GiB/sec    1.631%
  FilterFSLInt64FilterWithNulls/262144/11  4.145 GiB/sec    4.089 GiB/sec    -1.362%
  FilterInt64FilterNoNulls/262144/2        7.854 GiB/sec    7.609 GiB/sec    -3.117%
  FilterStringFilterNoNulls/262144/11      9.751 GiB/sec    9.565 GiB/sec    -1.904%
  FilterStringFilterNoNulls/262144/7       641.570 MiB/sec  650.710 MiB/sec  1.425%
  FilterStringFilterWithNulls/262144/3     435.185 MiB/sec  436.932 MiB/sec  0.401%
  FilterFSLInt64FilterNoNulls/262144/14    5.202 GiB/sec    5.302 GiB/sec    1.915%
  FilterInt64FilterNoNulls/262144/4        674.907 MiB/sec  654.585 MiB/sec  -3.011%
  FilterInt64FilterNoNulls/262144/5        7.023 GiB/sec    6.971 GiB/sec    -0.741%
  FilterInt64FilterWithNulls/262144/12     548.203 MiB/sec  542.909 MiB/sec  -0.966%
  FilterFSLInt64FilterNoNulls/262144/10    387.772 MiB/sec  390.564 MiB/sec  0.720%
  FilterInt64FilterWithNulls/262144/8      4.951 GiB/sec    5.094 GiB/sec    2.880%
  FilterStringFilterNoNulls/262144/13      90.750 MiB/sec   91.694 MiB/sec   1.040%
  FilterFSLInt64FilterWithNulls/262144/12  230.292 MiB/sec  263.113 MiB/sec  14.252%
  FilterStringFilterNoNulls/262144/12      70.772 MiB/sec   70.740 MiB/sec   -0.044%
  FilterStringFilterWithNulls/262144/14    927.254 MiB/sec  925.791 MiB/sec  -0.158%
  FilterStringFilterNoNulls/262144/5       10.587 GiB/sec   10.322 GiB/sec   -2.509%
  FilterFSLInt64FilterNoNulls/262144/3     551.473 MiB/sec  556.816 MiB/sec  0.969%
  FilterInt64FilterNoNulls/262144/14       6.302 GiB/sec    6.848 GiB/sec    8.656%
  FilterInt64FilterWithNulls/262144/14     4.804 GiB/sec    4.945 GiB/sec    2.933%
  FilterStringFilterNoNulls/262144/14      1.257 GiB/sec    1.247 GiB/sec    -0.814%
  FilterFSLInt64FilterNoNulls/262144/6     399.266 MiB/sec  402.455 MiB/sec  0.799%
  FilterInt64FilterWithNulls/262144/5      5.037 GiB/sec    4.954 GiB/sec    -1.645%
  FilterFSLInt64FilterNoNulls/262144/8     5.576 GiB/sec    5.576 GiB/sec    -0.004%
  FilterFSLInt64FilterNoNulls/262144/7     462.231 MiB/sec  456.668 MiB/sec  -1.203%
  FilterFSLInt64FilterNoNulls/262144/11    5.377 GiB/sec    5.381 GiB/sec    0.082%
  FilterStringFilterNoNulls/262144/6       487.645 MiB/sec  487.464 MiB/sec  -0.037%
  FilterStringFilterNoNulls/262144/4       687.214 MiB/sec  678.019 MiB/sec  -1.338%
  FilterFSLInt64FilterWithNulls/262144/9   287.916 MiB/sec  285.805 MiB/sec  -0.733%
  FilterInt64FilterNoNulls/262144/9        3.245 GiB/sec    3.126 GiB/sec    -3.683%
  FilterFSLInt64FilterWithNulls/262144/1   514.149 MiB/sec  501.235 MiB/sec  -2.512%
  FilterInt64FilterNoNulls/262144/11       6.304 GiB/sec    6.838 GiB/sec    8.471%
  FilterInt64FilterWithNulls/262144/4      642.597 MiB/sec  617.492 MiB/sec  -3.907%
  FilterFSLInt64FilterNoNulls/262144/0     723.263 MiB/sec  719.475 MiB/sec  -0.524%
  FilterFSLInt64FilterWithNulls/262144/2   4.335 GiB/sec    4.281 GiB/sec    -1.228%
  FilterStringFilterWithNulls/262144/8     8.635 GiB/sec    8.847 GiB/sec    2.451%
  FilterFSLInt64FilterWithNulls/262144/4   473.024 MiB/sec  457.711 MiB/sec  -3.237%
  FilterStringFilterWithNulls/262144/4     637.237 MiB/sec  646.187 MiB/sec  1.405%
  FilterStringFilterWithNulls/262144/6     430.118 MiB/sec  433.059 MiB/sec  0.684%
  FilterStringFilterNoNulls/262144/10      572.254 MiB/sec  573.892 MiB/sec  0.286%
  FilterStringFilterWithNulls/262144/1     644.800 MiB/sec  644.056 MiB/sec  -0.115%
  FilterStringFilterWithNulls/262144/7     635.644 MiB/sec  640.796 MiB/sec  0.810%
  FilterInt64FilterWithNulls/262144/6      581.863 MiB/sec  575.886 MiB/sec  -1.027%
  FilterFSLInt64FilterNoNulls/262144/4     513.508 MiB/sec  499.319 MiB/sec  -2.763%
  FilterInt64FilterNoNulls/262144/13       632.203 MiB/sec  613.689 MiB/sec  -2.928%
  FilterStringFilterNoNulls/262144/8       10.491 GiB/sec   10.181 GiB/sec   -2.953%
  FilterFSLInt64FilterNoNulls/262144/1     563.147 MiB/sec  540.663 MiB/sec  -3.993%
  FilterFSLInt64FilterNoNulls/262144/9     267.226 MiB/sec  269.194 MiB/sec  0.736%
  FilterFSLInt64FilterWithNulls/262144/10  420.329 MiB/sec  405.197 MiB/sec  -3.600%
- FilterInt64FilterNoNulls/262144/1        1.022 GiB/sec    922.850 MiB/sec  -11.845%
  FilterInt64FilterNoNulls/262144/7        652.709 MiB/sec  631.526 MiB/sec  -3.245%
  FilterStringFilterNoNulls/262144/2       11.144 GiB/sec   10.843 GiB/sec   -2.698%
  FilterStringFilterWithNulls/262144/13    91.231 MiB/sec   91.638 MiB/sec   0.446%
  FilterInt64FilterNoNulls/262144/12       3.242 GiB/sec    3.112 GiB/sec    -4.024%
  FilterFSLInt64FilterNoNulls/262144/12    242.297 MiB/sec  242.607 MiB/sec  0.128%
  FilterFSLInt64FilterNoNulls/262144/2     6.165 GiB/sec    6.062 GiB/sec    -1.679%
  FilterFSLInt64FilterWithNulls/262144/6   331.566 MiB/sec  332.386 MiB/sec  0.247%
  FilterInt64FilterWithNulls/262144/1      648.702 MiB/sec  622.712 MiB/sec  -4.006%
  FilterFSLInt64FilterWithNulls/262144/5   4.123 GiB/sec    4.122 GiB/sec    -0.014%
  FilterFSLInt64FilterWithNulls/262144/0   399.262 MiB/sec  398.338 MiB/sec  -0.231%
  FilterFSLInt64FilterWithNulls/262144/3   347.643 MiB/sec  349.930 MiB/sec  0.658%
  FilterInt64FilterNoNulls/262144/3        4.312 GiB/sec    4.291 GiB/sec    -0.478%
  FilterStringFilterWithNulls/262144/11    8.207 GiB/sec    8.348 GiB/sec    1.720%
  FilterStringFilterNoNulls/262144/9       391.780 MiB/sec  391.367 MiB/sec  -0.106%
  FilterFSLInt64FilterWithNulls/262144/8   4.142 GiB/sec    4.103 GiB/sec    -0.926%
  FilterInt64FilterNoNulls/262144/8        6.703 GiB/sec    6.908 GiB/sec    3.063%
  FilterInt64FilterWithNulls/262144/10     604.595 MiB/sec  575.671 MiB/sec  -4.784%
  FilterFSLInt64FilterWithNulls/262144/7   461.693 MiB/sec  447.411 MiB/sec  -3.093%
  FilterInt64FilterNoNulls/262144/10       632.128 MiB/sec  614.452 MiB/sec  -2.796%
  FilterInt64FilterWithNulls/262144/3      613.629 MiB/sec  607.939 MiB/sec  -0.927%
  =======================================  ===============  ===============  ========
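When reading these tables, note that baseline and contender are sometimes reported in different units (GiB/sec vs MiB/sec), so the change column only makes sense after normalizing. The sketch below shows one way to recompute it. The helper names are hypothetical, and the 5% threshold for the leading `-` marker is an assumption inferred from which rows are flagged above, not a documented ursabot setting.

```python
# Conversion factors to a common unit (MiB/sec).
UNIT_TO_MIB = {"MiB/sec": 1.0, "GiB/sec": 1024.0}


def to_mib_per_sec(text: str) -> float:
    """Parse a cell such as '1.022 GiB/sec' into MiB/sec."""
    value, unit = text.split()
    return float(value) * UNIT_TO_MIB[unit]


def percent_change(baseline: str, contender: str) -> float:
    """Relative throughput change of contender vs baseline, in percent."""
    base = to_mib_per_sec(baseline)
    return (to_mib_per_sec(contender) - base) / base * 100.0


def is_regression(change: float, threshold: float = 5.0) -> bool:
    """True if throughput dropped by more than `threshold` percent.

    The default of 5.0 is inferred from the flagged rows, not confirmed.
    """
    return change < -threshold


# The one flagged row in the last table mixes units:
change = percent_change("1.022 GiB/sec", "922.850 MiB/sec")
print(f"{change:.1f}% regression={is_regression(change)}")
```

Recomputed values can differ from the table in the last digits because the table cells are already rounded.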

@wesm
Member Author

wesm commented Jun 17, 2020

+1. Thanks all for the comments

@wesm wesm closed this in d0f3b5f Jun 17, 2020
@wesm wesm deleted the ARROW-9075 branch June 17, 2020 20:15