-
Notifications
You must be signed in to change notification settings - Fork 21
feat: change midpoint definition to cast integers to float #3767
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
09723f4 to
2ff32eb
Compare
| } else if constexpr (std::is_same_v<std::decay_t<decltype(a)>, int32_t>) { | ||
| return std::midpoint(static_cast<float>(a), static_cast<float>(b)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not double? Float seems dangerous.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would double the memory usage compared to what the user might expect (compared to the behavior today). The float64 type might then propagate when they do other operations on the resulting midpoints, it might be surprising and hard to debug.
This will indeed lead to wrong results for very large integers a and b that are close to each other.
But unless I'm mistaken the size of the error will be the same order of magnitude as the float rounding error in that region, so I don't think it's a big issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would double the memory usage compared to what the user might expect (compared to the behavior today).
Bin-edges are typically small. I am way more concerned about loss of precision.
The float64 type might then propagate when they do other operations on the resulting midpoints, it might be surprising and hard to debug.
The float32 type might then propagate when they do other operations on the resulting midpoints, it might be surprising and hard to debug. For example, float32 leads to loss of counts/precision when accumulating in a histogram.
This will indeed lead to wrong results for very large integers
aandbthat are close to each other. But unless I'm mistaken the size of the error will be the same order of magnitude as the float rounding error in that region, so I don't think it's a big issue.
The problem is not just large integers. Float32 is too lossy in quite a few cases, especially when it propagates to non-coord arrays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bin-edges are typically small. I am way more concerned about loss of precision.
Okay then we can convert to float64. 👍
But we still have the same issue when midpoints is called on large 64 bit integer inputs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is the same issue. Precision loss with 64 bit is much much less likely.
|
@jl-wynen you had some comments yesterday when we talked, now I don't remember exactly what but I remember thinking that it was a good point. Was it basically what you already wrote in the issue, or was there something else? |
Fixes #3765
This returns something closer to what most users would expect.
It might fail for very large integers.