Changes (Ptr Word8, a) to PeekResult {-# UNPACK #-} (Ptr Word8) !a #98

Merged
mgsloan merged 1 commit into mgsloan:master from VyacheslavHashov:strict-result
Feb 18, 2017

@VyacheslavHashov
Contributor

I am writing encoders/decoders for the PostgreSQL binary protocol using store-core, and I have found that parsers returning strict structures run about 20% faster in my case. Your benchmarks show slight performance gains too.
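
For readers unfamiliar with the change, here is a minimal sketch of the idea: a parser that returns a lazy tuple `(Ptr Word8, a)` versus one that returns a strict, unpacked result type. Only `PeekResult` is named in this pull request; `LazyPeek`, `StrictPeek`, `peekWord8Strict`, `pairStrict`, and `decodeTwoBytes` are hypothetical names used for illustration, and the real `Peek` type in store-core may carry additional arguments that are omitted here.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr, plusPtr)
import Foreign.Storable (peek, poke)

-- Lazy variant (illustrative): both tuple fields are lazy, so the
-- updated pointer and the decoded value may be stored as thunks.
newtype LazyPeek a = LazyPeek { runLazyPeek :: Ptr Word8 -> IO (Ptr Word8, a) }

-- Strict variant: the pointer field is strict and unpacked into the
-- constructor, and the decoded value is forced to weak head normal form.
data PeekResult a = PeekResult {-# UNPACK #-} !(Ptr Word8) !a

newtype StrictPeek a = StrictPeek { runStrictPeek :: Ptr Word8 -> IO (PeekResult a) }

-- Read one byte and advance the pointer by one.
peekWord8Strict :: StrictPeek Word8
peekWord8Strict = StrictPeek $ \ptr -> do
  x <- peek ptr
  return (PeekResult (ptr `plusPtr` 1) x)

-- Run two parsers in sequence, threading the pointer through.
pairStrict :: StrictPeek a -> StrictPeek b -> StrictPeek (a, b)
pairStrict (StrictPeek f) (StrictPeek g) = StrictPeek $ \p0 -> do
  PeekResult p1 a <- f p0
  PeekResult p2 b <- g p1
  return (PeekResult p2 (a, b))

-- Write two bytes into a temporary buffer and decode them again.
decodeTwoBytes :: Word8 -> Word8 -> IO (Word8, Word8)
decodeTwoBytes a b = allocaBytes 2 $ \buf -> do
  poke buf a
  poke (buf `plusPtr` 1) b
  PeekResult _ r <- runStrictPeek (pairStrict peekWord8Strict peekWord8Strict) buf
  return r
```

Loading this in GHCi and evaluating `decodeTwoBytes 7 42` exercises the strict path. Because the fields of `PeekResult` are strict, GHC can often pass the pointer and value around without allocating intermediate thunks, which is one plausible source of the speedup reported here.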

Current version:

benchmarking decode/ (Vector Int)
time                 498.3 ns   (498.3 ns .. 498.4 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 498.3 ns   (498.3 ns .. 498.4 ns)
std dev              166.8 ps   (135.5 ps .. 223.5 ps)

benchmarking decode/1kb storable (Vector Int32)
time                 52.96 ns   (52.92 ns .. 53.01 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 53.04 ns   (53.01 ns .. 53.09 ns)
std dev              141.4 ps   (110.5 ps .. 188.8 ps)

benchmarking decode/10kb storable (Vector Int32)
time                 282.7 ns   (282.4 ns .. 282.9 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 282.7 ns   (282.5 ns .. 282.9 ns)
std dev              628.2 ps   (502.4 ps .. 807.2 ps)

benchmarking decode/1kb normal (Vector Int32)
time                 1.225 μs   (1.223 μs .. 1.229 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.230 μs   (1.225 μs .. 1.236 μs)
std dev              15.87 ns   (10.02 ns .. 22.36 ns)
variance introduced by outliers: 11% (moderately inflated)

benchmarking decode/10kb normal (Vector Int32)
time                 12.38 μs   (12.36 μs .. 12.43 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 12.39 μs   (12.36 μs .. 12.52 μs)
std dev              165.3 ns   (4.813 ns .. 379.8 ns)

benchmarking decode/ (Vector SmallProduct)
time                 2.349 μs   (2.347 μs .. 2.350 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.350 μs   (2.349 μs .. 2.351 μs)
std dev              4.135 ns   (3.487 ns .. 5.236 ns)

benchmarking decode/ (Vector SmallProductManual)
time                 1.508 μs   (1.507 μs .. 1.509 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.508 μs   (1.507 μs .. 1.510 μs)
std dev              4.189 ns   (2.095 ns .. 8.388 ns)

benchmarking decode/ (Vector SmallSum)
time                 1.629 μs   (1.627 μs .. 1.632 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.632 μs   (1.629 μs .. 1.641 μs)
std dev              17.24 ns   (4.673 ns .. 31.10 ns)

benchmarking decode/ (Vector SmallSumManual)
time                 990.8 ns   (990.2 ns .. 991.8 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 991.3 ns   (990.6 ns .. 994.1 ns)
std dev              3.658 ns   (1.048 ns .. 7.950 ns)

benchmarking decode/ (Vector ((Int,Int),(Int,Int)))
time                 1.303 μs   (1.302 μs .. 1.304 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.303 μs   (1.302 μs .. 1.303 μs)
std dev              1.769 ns   (1.612 ns .. 2.052 ns)

benchmarking decode/ (Vector SomeData)
time                 2.069 μs   (2.068 μs .. 2.070 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.070 μs   (2.069 μs .. 2.070 μs)
std dev              2.211 ns   (1.636 ns .. 3.016 ns)

With strict custom structure:

benchmarking decode/ (Vector Int)
time                 491.6 ns   (491.5 ns .. 491.6 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 491.6 ns   (491.6 ns .. 491.6 ns)
std dev              73.37 ps   (59.71 ps .. 90.25 ps)

benchmarking decode/1kb storable (Vector Int32)
time                 51.07 ns   (51.07 ns .. 51.09 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 51.14 ns   (51.12 ns .. 51.17 ns)
std dev              96.97 ps   (77.08 ps .. 113.0 ps)

benchmarking decode/10kb storable (Vector Int32)
time                 279.3 ns   (279.0 ns .. 279.5 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 279.0 ns   (278.8 ns .. 279.2 ns)
std dev              725.8 ps   (609.2 ps .. 895.4 ps)

benchmarking decode/1kb normal (Vector Int32)
time                 1.233 μs   (1.228 μs .. 1.243 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.230 μs   (1.228 μs .. 1.236 μs)
std dev              8.810 ns   (208.5 ps .. 20.13 ns)

benchmarking decode/10kb normal (Vector Int32)
time                 12.50 μs   (12.50 μs .. 12.51 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 12.52 μs   (12.52 μs .. 12.52 μs)
std dev              6.185 ns   (4.934 ns .. 8.016 ns)

benchmarking decode/ (Vector SmallProduct)
time                 1.986 μs   (1.978 μs .. 1.990 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.983 μs   (1.979 μs .. 1.986 μs)
std dev              11.67 ns   (9.556 ns .. 15.60 ns)

benchmarking decode/ (Vector SmallProductManual)
time                 1.433 μs   (1.433 μs .. 1.434 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.434 μs   (1.433 μs .. 1.434 μs)
std dev              814.6 ps   (555.6 ps .. 1.341 ns)

benchmarking decode/ (Vector SmallSum)
time                 1.424 μs   (1.423 μs .. 1.424 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.424 μs   (1.424 μs .. 1.424 μs)
std dev              477.9 ps   (386.2 ps .. 575.3 ps)

benchmarking decode/ (Vector SmallSumManual)
time                 832.8 ns   (832.5 ns .. 833.0 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 833.0 ns   (832.9 ns .. 833.2 ns)
std dev              344.6 ps   (172.9 ps .. 655.0 ps)

benchmarking decode/ (Vector ((Int,Int),(Int,Int)))
time                 1.331 μs   (1.331 μs .. 1.332 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.332 μs   (1.332 μs .. 1.333 μs)
std dev              1.451 ns   (1.158 ns .. 1.743 ns)

benchmarking decode/ (Vector SomeData)
time                 1.691 μs   (1.689 μs .. 1.693 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.690 μs   (1.688 μs .. 1.692 μs)
std dev              5.669 ns   (5.147 ns .. 6.221 ns)
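
To make the two runs easier to compare, the relative change in mean time can be computed directly from the figures above. The sketch below is only arithmetic over the reported criterion means (converted to nanoseconds); `means`, `percentChange`, and `report` are hypothetical names. On these numbers the record and sum-type benchmarks come out roughly 5% to 18% faster, while `((Int,Int),(Int,Int))` and `10kb normal` are slightly slower.

```haskell
import Numeric (showFFloat)

-- (benchmark, mean before in ns, mean after in ns), copied from the
-- criterion output in this comment.
means :: [(String, Double, Double)]
means =
  [ ("Vector Int",                   498.3,  491.6)
  , ("1kb storable (Vector Int32)",  53.04,  51.14)
  , ("10kb storable (Vector Int32)", 282.7,  279.0)
  , ("1kb normal (Vector Int32)",    1230,   1230)
  , ("10kb normal (Vector Int32)",   12390,  12520)
  , ("Vector SmallProduct",          2350,   1983)
  , ("Vector SmallProductManual",    1508,   1434)
  , ("Vector SmallSum",              1632,   1424)
  , ("Vector SmallSumManual",        991.3,  833.0)
  , ("Vector ((Int,Int),(Int,Int))", 1303,   1332)
  , ("Vector SomeData",              2070,   1690)
  ]

-- Negative results mean the strict PeekResult version is faster.
percentChange :: Double -> Double -> Double
percentChange before after = (after - before) / before * 100

-- One formatted line per benchmark, with one decimal place.
report :: [String]
report =
  [ name ++ ": " ++ showFFloat (Just 1) (percentChange before after) "%"
  | (name, before, after) <- means
  ]
```

Running `mapM_ putStrLn report` in GHCi prints the per-benchmark change.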

@mgsloan
Owner

mgsloan commented Feb 18, 2017

Makes sense, thanks!

@mgsloan mgsloan merged commit db9ccc7 into mgsloan:master Feb 18, 2017
mgsloan added a commit that referenced this pull request Feb 18, 2017