benchmark.cc: Default to inverted mode, add small_digits mode. #74
Merged
First, this extracts generate_float() and generate_double(). That eliminates the `r` integers, so we need another way to print the exact data in verbose mode. C99's hexfloat conversion specifiers are easy to use. "%.6a" and "%.13a" print enough hexits for round-tripping floats and doubles. Finally, we can also simplify %lf to %f; the arguments are doubles (and C11 says that the 'l' length modifier "has no effect on a following a, A, e, E, f, F, g, or G conversion specifier").
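To illustrate the hexfloat claim, here is a minimal sketch that formats values with "%.6a" and "%.13a" and parses them back. The helper names `to_hexfloat` and `round_trips` are hypothetical and are not taken from benchmark.cc. A float stores 23 significand bits, which fit in 6 hexits, and a double stores 52, which fit in 13.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: format a float with "%.6a".
// The float is promoted to double when passed through varargs,
// so no length modifier is needed (the cast just makes that explicit).
std::string to_hexfloat(float f) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%.6a", static_cast<double>(f));
  return buf;
}

// Hypothetical helper: format a double with "%.13a".
std::string to_hexfloat(double d) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%.13a", d);
  return buf;
}

// Parse the hexfloat text back and compare with the original value.
bool round_trips(float f) {
  return std::strtof(to_hexfloat(f).c_str(), nullptr) == f;
}

bool round_trips(double d) {
  return std::strtod(to_hexfloat(d).c_str(), nullptr) == d;
}
```

Calling `round_trips(0.1f)` or `round_trips(0.1)` should return true on a C99-conforming library, since `strtof` and `strtod` accept hexadecimal floating-point input.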
This makes it easier to pass options to bench32() and bench64().
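As a rough sketch of what an extracted options struct with validation might look like (the field names, default values, and the `parse_options` and `parse_int_option` helpers below are assumptions for illustration, not the actual benchmark.cc code):

```cpp
#include <climits>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical options bundle; defaults are placeholders.
struct benchmark_options {
  bool run32 = true;
  bool run64 = true;
  bool ryu_only = false;
  bool verbose = false;
  bool classic = false;
  int samples = 1000;     // placeholder default
  int iterations = 1000;  // placeholder default
  int small_digits = 0;   // 0 means "unlimited" digits
};

// Parses "<prefix><int>" with no trailing characters.
static bool parse_int_option(const char* arg, const char* prefix, int& value) {
  const std::size_t n = std::strlen(prefix);
  if (std::strncmp(arg, prefix, n) != 0) return false;
  char* end = nullptr;
  const long parsed = std::strtol(arg + n, &end, 10);
  if (end == arg + n || *end != '\0') return false;
  if (parsed < INT_MIN || parsed > INT_MAX) return false;
  value = static_cast<int>(parsed);
  return true;
}

// Returns false on an unrecognized option or an out-of-range value.
bool parse_options(int argc, const char* const* argv, benchmark_options& opts) {
  for (int i = 1; i < argc; ++i) {
    const char* arg = argv[i];
    int value = 0;
    if (std::strcmp(arg, "-32") == 0) {
      opts.run64 = false;
    } else if (std::strcmp(arg, "-64") == 0) {
      opts.run32 = false;
    } else if (std::strcmp(arg, "-ryu") == 0) {
      opts.ryu_only = true;
    } else if (std::strcmp(arg, "-v") == 0) {
      opts.verbose = true;
    } else if (std::strcmp(arg, "-classic") == 0) {
      opts.classic = true;
    } else if (parse_int_option(arg, "-samples=", value)) {
      if (value < 1) return false;
      opts.samples = value;
    } else if (parse_int_option(arg, "-iterations=", value)) {
      if (value < 1) return false;
      opts.iterations = value;
    } else if (parse_int_option(arg, "-small_digits=", value)) {
      if (value < 1 || value > 7) return false;  // range [1, 7]
      opts.small_digits = value;
    } else {
      return false;  // reject unrecognized options
    }
  }
  return true;
}
```

With a single struct, functions such as bench32() and bench64() can take one `const benchmark_options&` parameter instead of a growing list of individual arguments.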
This option stresses Ryu's codepaths for small integers. It accepts values in the range [1, 7]. (32-bit floats have insufficient precision for larger values. With a little work, this range could be extended for 64-bit doubles, if benchmarking moderate-length integers is interesting.)

This also modifies verbose mode to print ryu_output, so we can see what Ryu is emitting (and verify that small_digits mode is actually testing small integers).

As the example in the comment explains, "-small_digits=3" tests values in the range [1.00, 9.99]. These will be printed as:

1E0, 1.01E0, ..., 1.09E0, 1.1E0, 1.11E0, ..., 9.98E0, 9.99E0

That is, there are a few 1-digit and 2-digit values, although most are 3-digit (and none are longer).

Currently, shorter output appears to be more stressful for doubles:

```
64: 118.619 1.991 (x86 benchmark_clang -ryu -64)
64: 277.499 3.048 (x86 benchmark_clang -ryu -64 -small_digits=7)
64: 306.753 2.787 (x86 benchmark_clang -ryu -64 -small_digits=6)
64: 327.964 3.427 (x86 benchmark_clang -ryu -64 -small_digits=5)
64: 347.708 2.876 (x86 benchmark_clang -ryu -64 -small_digits=4)
64: 369.915 2.371 (x86 benchmark_clang -ryu -64 -small_digits=3)
64: 403.309 9.321 (x86 benchmark_clang -ryu -64 -small_digits=2)
64: 477.200 3.409 (x86 benchmark_clang -ryu -64 -small_digits=1)
64: 42.266 1.270 (x64 benchmark_clang -ryu -64)
64: 45.798 1.356 (x64 benchmark_clang -ryu -64 -small_digits=7)
64: 47.418 1.454 (x64 benchmark_clang -ryu -64 -small_digits=6)
64: 49.004 1.464 (x64 benchmark_clang -ryu -64 -small_digits=5)
64: 50.620 1.209 (x64 benchmark_clang -ryu -64 -small_digits=4)
64: 52.759 1.275 (x64 benchmark_clang -ryu -64 -small_digits=3)
64: 55.585 1.402 (x64 benchmark_clang -ryu -64 -small_digits=2)
64: 66.844 1.378 (x64 benchmark_clang -ryu -64 -small_digits=1)
```

Interestingly, floats behave similarly except that "unlimited" digits are slower than -small_digits=7. I'm not sure why this is the case.
```
32: 42.478 1.558 (x86 benchmark_clang -ryu -32)
32: 33.758 1.145 (x86 benchmark_clang -ryu -32 -small_digits=7)
32: 35.518 1.048 (x86 benchmark_clang -ryu -32 -small_digits=6)
32: 36.035 1.113 (x86 benchmark_clang -ryu -32 -small_digits=5)
32: 37.629 0.999 (x86 benchmark_clang -ryu -32 -small_digits=4)
32: 39.157 1.061 (x86 benchmark_clang -ryu -32 -small_digits=3)
32: 45.113 1.027 (x86 benchmark_clang -ryu -32 -small_digits=2)
32: 55.080 1.227 (x86 benchmark_clang -ryu -32 -small_digits=1)
32: 30.599 1.528 (x64 benchmark_clang -ryu -32)
32: 23.771 0.907 (x64 benchmark_clang -ryu -32 -small_digits=7)
32: 24.571 1.140 (x64 benchmark_clang -ryu -32 -small_digits=6)
32: 25.138 0.864 (x64 benchmark_clang -ryu -32 -small_digits=5)
32: 26.579 1.020 (x64 benchmark_clang -ryu -32 -small_digits=4)
32: 27.664 1.095 (x64 benchmark_clang -ryu -32 -small_digits=3)
32: 30.341 1.405 (x64 benchmark_clang -ryu -32 -small_digits=2)
32: 32.580 1.129 (x64 benchmark_clang -ryu -32 -small_digits=1)
```
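For reference, one way the [1.00, 9.99] range described above for "-small_digits=3" could be produced from a random 32-bit integer is sketched below. This is an assumption about the general approach, not the code in benchmark.cc, and the function name `small_digits_value` is hypothetical.

```cpp
#include <cstdint>

// Hypothetical sketch: map a random 32-bit integer r to a value with at most
// `digits` significant decimal digits, in the range [1, 10).
// Precondition (assumed): digits is in [1, 7].
double small_digits_value(std::uint32_t r, int digits) {
  std::uint32_t lower = 1;
  for (int i = 1; i < digits; ++i) {
    lower *= 10;  // lower becomes 10^(digits - 1), e.g. 100 for digits == 3
  }
  const std::uint32_t upper = lower * 10;  // e.g. 1000 for digits == 3
  // n is in [lower, upper - 1], e.g. [100, 999]; the modulo is slightly biased.
  const std::uint32_t n = r % (upper - lower) + lower;
  return static_cast<double>(n) / lower;  // e.g. in [1.00, 9.99]
}
```

For digits == 3, `small_digits_value(0, 3)` yields 1.0 (from 100 / 100) and `small_digits_value(899, 3)` yields the double nearest 9.99 (from 999 / 100), matching the endpoints of the range in the description.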
Owner
I was using the int output to generate the graphs in the paper (with gnuplot). I'd prefer to keep that; I'm not sure this can easily be changed in bash or gnuplot.
Contributor
Author
Restored! Also, I looked at gnuplot.template but couldn't figure out how to adapt it to the addition of ryu_output; is there an easy way to do that, or would it tolerate the string field being moved to the end? I think it's useful but of course I don't want to break your graphs. If necessary, I could add yet another option to emit ryu_output.
Owner
I'll take a look.
Note: If the hexfloat change is undesirable, I can restore the original behavior with a tiny bit of work.
benchmark.cc: Reject unrecognized options.
benchmark.cc: Print hexfloats in verbose mode.
benchmark.cc: Default to inverted mode, add "-classic".
benchmark.cc: Extract benchmark_options.
benchmark.cc: Validate samples and iterations options.
benchmark.cc: Add "-small_digits=%i".