-
Notifications
You must be signed in to change notification settings - Fork 24.4k
optimizing d2string() and addReplyDouble() with grisu2: double to string conversion based on Florian Loitsch's Grisu-algorithm #10587
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…h's Grisu-algorithm
zuiderkwast
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. I didn't read the deps code.
|
|
||
| Fast and accurate double to string conversion based on Florian Loitsch's Grisu-algorithm[1]. | ||
|
|
||
| This port contains a subset of the 'C' version of Fast and accurate double to string conversion based on Florian Loitsch's Grisu-algorithm available at [github.com/night-shift/fpconv](https://github.com/night-shift/fpconv)). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it a subset of the files from that repo?
It would be useful to include some info about how to update the lib in the future. If some change is needed, it could be included as a patch file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree. will prepare the initial patch @zuiderkwast
Co-authored-by: sundb <sundbcn@gmail.com>
|
@oranagra @sundb and @zuiderkwast I revamped this PR and on my personal laptop ( os:Darwin 21.6.0 arm64 , compier:Apple clang version 14.0.0 (clang-1400.0.29.102) ) redoing the same zrange benchmarks resulted in larger inputs now comparing unstable vs this PR. The improvement is even larger. I've retriggered the "official automation" so lets wait for the grafana data to be populated, but it all points to a good improvement. To test it I simply ran ( after the pre-poluation described in the top comment ) : achievable ops/sec on a standalone scenario
here's the full output of memtier: unstablethis PR |
|
@filipecosta90 i see some tests are failing, please look into it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yossigo please comment about the dependency and its license.
yossigo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@oranagra LGTM.
WRT to the failures, I've fixed the code/tests accordingly. Tomorrow I'll update the top comment with further data and finalized charts. WRT to the April staleness, I guess it was due to the fact this was initially marked as a 7.2 effort? "@oranagra oranagra added this to backlog in 7.2 via automation on Apr 17" |
tests/unit/type/zset.tcl
Outdated
| assert_encoding $encoding zscoretest | ||
| for {set i 0} {$i < $elements} {incr i} { | ||
| assert_equal [lindex $aux $i] [r zscore zscoretest $i] | ||
| assert_equal [format %.17g [lindex $aux $i]] [format %.17g [r zscore zscoretest $i]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please explain why this formatting is needed.
maybe in a code comment as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@oranagra fixed it in the latest commit. Please review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that format is used in 3 places, so the comment may be needed in all, or at least some cross reference.
but i'm still not sure i understand what changed here in that respect and why we needed that change, can you please explain
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@oranagra WRT to the 3 places I've added a note on the FP representation and made reference on all 3.
WRT to why we needed that change it was because snprintf could output more digits than precision digits required to represent the number ( see https://www.ncl.ucar.edu/Support/talk_archives/2013/2176.html ). To ensure that there is no difference I'm making sure that we use the same format to compare the string representation instead of the doubles. Notice both snprintf and the grisu2 produce IEEE compliant representations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does assert_equal compare the values as strings or as floats?
If we compare the string representation, I agree they can differ for the same float value, but the float values should be equal since there are no float calculations, thus no rounding errors involved.
% expr 0.8 == 0.79999999999999999
1
% expr 0.8 == 0.8e0
1
% expr 0.79999999999999999
0.8
Therefore, I think we can compare them as floats and expect them to be equal. We can use expr to convert a string to float.
Another way is of course to convert the string to float and then back to string using format %.17g but it seems slightly more complex to me.
Btw, an implicit conversion from float to string seems to give 17 digits precision in TCL 8.6 anyway:
$ tclsh
% expr 0.12345678901234567890
0.12345678901234568
% puts [expr 0.12345678901234567890]
0.12345678901234568
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@oranagra WRT
The code comments may be OK, my recent questions about the test suite changes are if you can explain here why these changes are needed.
I.e. If the new algorithm was completely compatible, no changes would be required. So I want to understand the differences.
Here's a sample of the failed test:
Expected '0.62137966259446908' to be equal to '0.6213796625944691'
Notice that the comparisons are string. If we convert them to double they represent the same number as showcased bellow:
:~# tclsh
% expr 0.62137966259446908
0.6213796625944691
% expr 0.6213796625944691
0.6213796625944691
%
Notice that the last string is the output of the grisu2, meaning we're replying in the exact IEEE required format representation.
@zuiderkwast you're indeed right. I was converting reply to double and then back to string with the format specified. I've changed the comparison towards your recommendation and there is no need to do the conversion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice that you agree. :-)
|
@oranagra I believe all your comments were addressed. Can you check this again? |
|
The code comments may be OK, my recent questions about the test suite changes are if you can explain here why these changes are needed. |
@oranagra I've replied on the @zuiderkwast comment: #10587 (comment) and adjusted the comments and comparison accordingly. |
…ing conversion based on Florian Loitsch's Grisu-algorithm (redis#10587) All commands / use cases that heavily rely on double to a string representation conversion, (e.g. meaning take a double-precision floating-point number like 1.5 and return a string like "1.5" ), could benefit from a performance boost by swapping snprintf(buf,len,"%.17g",value) by the equivalent [fpconv_dtoa](https://github.com/night-shift/fpconv) or any other algorithm that ensures 100% coverage of conversion. This is a well-studied topic and Projects like MongoDB. RedPanda, PyTorch leverage libraries ( fmtlib ) that use the optimized double to string conversion underneath. The positive impact can be substantial. This PR uses the grisu2 approach ( grisu explained on https://www.cs.tufts.edu/~nr/cs257/archive/florian-loitsch/printf.pdf section 5 ). test suite changes: Despite being compatible, in some cases it produces a different result from printf, and some tests had to be adjusted. one case is that `%.17g` (which means %e or %f which ever is shorter), chose to use `5000000000` instead of 5e+9, which sounds like a bug? In other cases, we changed TCL to compare numbers instead of strings to ignore minor rounding issues (`expr 0.8 == 0.79999999999999999`)
…ing conversion based on Florian Loitsch's Grisu-algorithm (redis#10587) All commands / use cases that heavily rely on double to a string representation conversion, (e.g. meaning take a double-precision floating-point number like 1.5 and return a string like "1.5" ), could benefit from a performance boost by swapping snprintf(buf,len,"%.17g",value) by the equivalent [fpconv_dtoa](https://github.com/night-shift/fpconv) or any other algorithm that ensures 100% coverage of conversion. This is a well-studied topic and Projects like MongoDB. RedPanda, PyTorch leverage libraries ( fmtlib ) that use the optimized double to string conversion underneath. The positive impact can be substantial. This PR uses the grisu2 approach ( grisu explained on https://www.cs.tufts.edu/~nr/cs257/archive/florian-loitsch/printf.pdf section 5 ). test suite changes: Despite being compatible, in some cases it produces a different result from printf, and some tests had to be adjusted. one case is that `%.17g` (which means %e or %f which ever is shorter), chose to use `5000000000` instead of 5e+9, which sounds like a bug? In other cases, we changed TCL to compare numbers instead of strings to ignore minor rounding issues (`expr 0.8 == 0.79999999999999999`)
Since lua_Number is not explicitly an integer or a double, we need to
make an effort
to convert it as an integer when that's possible, since the string could
later be used
in a context that doesn't support scientific notation (e.g. 1e9 instead
of 100000000).
Since fpconv_dtoa converts numbers with the equivalent of `%f` or `%e`,
which ever is shorter,
this would break if we try to pass a long integer number to a command
that takes integer.
we'll get an implicit conversion to string in Lua, and then the parsing
in getLongLongFromObjectOrReply will fail.
```
> eval "redis.call('hincrby', 'key', 'field', '1000000000')" 0
(nil)
> eval "redis.call('hincrby', 'key', 'field', tonumber('1000000000'))" 0
(error) ERR value is not an integer or out of range script: ac99c32e4daf7e300d593085b611de261954a946, on @user_script:1.
```
Switch to using ll2string if the number can be safely represented as a
long long.
The problem was introduced in #10587 (Redis 7.2).
closes #13113.
---------
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: debing.sun <debing.sun@redis.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
Since lua_Number is not explicitly an integer or a double, we need to
make an effort
to convert it as an integer when that's possible, since the string could
later be used
in a context that doesn't support scientific notation (e.g. 1e9 instead
of 100000000).
Since fpconv_dtoa converts numbers with the equivalent of `%f` or `%e`,
which ever is shorter,
this would break if we try to pass a long integer number to a command
that takes integer.
we'll get an implicit conversion to string in Lua, and then the parsing
in getLongLongFromObjectOrReply will fail.
```
> eval "redis.call('hincrby', 'key', 'field', '1000000000')" 0
(nil)
> eval "redis.call('hincrby', 'key', 'field', tonumber('1000000000'))" 0
(error) ERR value is not an integer or out of range script: ac99c32e4daf7e300d593085b611de261954a946, on @user_script:1.
```
Switch to using ll2string if the number can be safely represented as a
long long.
The problem was introduced in redis#10587 (Redis 7.2).
closes redis#13113.
---------
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: debing.sun <debing.sun@redis.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit 5fdaa53)
Since lua_Number is not explicitly an integer or a double, we need to
make an effort
to convert it as an integer when that's possible, since the string could
later be used
in a context that doesn't support scientific notation (e.g. 1e9 instead
of 100000000).
Since fpconv_dtoa converts numbers with the equivalent of `%f` or `%e`,
which ever is shorter,
this would break if we try to pass a long integer number to a command
that takes integer.
we'll get an implicit conversion to string in Lua, and then the parsing
in getLongLongFromObjectOrReply will fail.
```
> eval "redis.call('hincrby', 'key', 'field', '1000000000')" 0
(nil)
> eval "redis.call('hincrby', 'key', 'field', tonumber('1000000000'))" 0
(error) ERR value is not an integer or out of range script: ac99c32e4daf7e300d593085b611de261954a946, on @user_script:1.
```
Switch to using ll2string if the number can be safely represented as a
long long.
The problem was introduced in #10587 (Redis 7.2).
closes #13113.
---------
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: debing.sun <debing.sun@redis.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit 5fdaa53)
Since lua_Number is not explicitly an integer or a double, we need to
make an effort
to convert it as an integer when that's possible, since the string could
later be used
in a context that doesn't support scientific notation (e.g. 1e9 instead
of 100000000).
Since fpconv_dtoa converts numbers with the equivalent of `%f` or `%e`,
which ever is shorter,
this would break if we try to pass a long integer number to a command
that takes integer.
we'll get an implicit conversion to string in Lua, and then the parsing
in getLongLongFromObjectOrReply will fail.
```
> eval "redis.call('hincrby', 'key', 'field', '1000000000')" 0
(nil)
> eval "redis.call('hincrby', 'key', 'field', tonumber('1000000000'))" 0
(error) ERR value is not an integer or out of range script: ac99c32e4daf7e300d593085b611de261954a946, on @user_script:1.
```
Switch to using ll2string if the number can be safely represented as a
long long.
The problem was introduced in redis#10587 (Redis 7.2).
closes redis#13113.
---------
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: debing.sun <debing.sun@redis.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
Fixes #8826 8826
As pointed out on #8826 all commands / use cases that heavily rely on double to a string representation conversion, (e.g. meaning take a double-precision floating-point number like 1.5 and return a string like "1.5" ), could benefit from a performance boost by swapping snprintf(buf,len,"%.17g",value) by the equivalent fpconv_dtoa or any other algorithm that ensures 100% coverage of conversion.
This is a well-studied topic and Projects like MongoDB. RedPanda, PyTorch leverage libraries ( fmtlib ) that use the optimized double to string conversion underneath.
The positive impact can be substantial. This PR uses the grisu2 approach ( grisu explained on https://www.cs.tufts.edu/~nr/cs257/archive/florian-loitsch/printf.pdf section 5 ).
test suite changes
Despite being compatible, in some cases it produces a different result from printf, and some tests had to be adjusted.
one case is that
%.17g(which means %e or %f which ever is shorter), chose to use5000000000instead of 5e+9, which sounds like a bug?In other cases, we changed TCL to compare numbers instead of strings to ignore minor rounding issues (
expr 0.8 == 0.79999999999999999)Impact on Sorted Sets
READS
If we look at ZRANGE WITHSCORES command impact we saw 23% improvement on the achievable ops/sec on replies with 10 elements, 50% on replies with 100 elements and 68% on replies with 1000 elements. For more details check the section below.
Testing positive impact on ZSET commands (with double scores) :
This can be easily tested using https://github.com/redis/redis-benchmarks-specification ( install details on repo )
If you want to manually replicate the results, then follow the next steps:
To populate the data
To benchmark (swap keyname for 10 and 100 variants) :
results comparing unstable ( 9ab873d ) vs this PR:
achiavable ops/sec on a standalone scenario
overall client side p50(ms)