Skip to content

Conversation

@jokasimr
Copy link
Contributor

@jokasimr jokasimr commented Sep 24, 2025

Fixes #3765

This returns something closer to what most users would expect.

It might fail for very large integers.

@jokasimr jokasimr requested a review from jl-wynen September 24, 2025 12:47
Comment on lines 164 to 165
} else if constexpr (std::is_same_v<std::decay_t<decltype(a)>, int32_t>) {
return std::midpoint(static_cast<float>(a), static_cast<float>(b));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not double? Float seems dangerous.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would double the memory usage compared to what the user might expect (compared to the behavior today). The float64 type might then propagate when they do other operations on the resulting midpoints, it might be surprising and hard to debug.

This will indeed lead to wrong results for very large integers a and b that are close to each other.
But unless I'm mistaken the size of the error will be the same order of magnitude as the float rounding error in that region, so I don't think it's a big issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would double the memory usage compared to what the user might expect (compared to the behavior today).

Bin-edges are typically small. I am way more concerned about loss of precision.

The float64 type might then propagate when they do other operations on the resulting midpoints, it might be surprising and hard to debug.

The float32 type might then propagate when they do other operations on the resulting midpoints, it might be surprising and hard to debug. For example, float32 leads to loss of counts/precision when accumulating in a histogram.

This will indeed lead to wrong results for very large integers a and b that are close to each other. But unless I'm mistaken the size of the error will be the same order of magnitude as the float rounding error in that region, so I don't think it's a big issue.

The problem is not just large integers. Float32 is too lossy in quite a few cases, especially when it propagates to non-coord arrays.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bin-edges are typically small. I am way more concerned about loss of precision.

Okay then we can convert to float64. 👍

But we still have the same issue when midpoints is called on large 64 bit integer inputs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is the same issue. Precision loss with 64 bit is much much less likely.

@jokasimr
Copy link
Contributor Author

@jl-wynen you had some comments yesterday when we talked, now I don't remember exactly what but I remember thinking that it was a good point. Was it basically what you already wrote in the issue, or was there something else?

@jokasimr jokasimr merged commit 7e1eab7 into main Sep 29, 2025
4 checks passed
@jokasimr jokasimr deleted the change-midpoint branch September 29, 2025 12:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

midpoints with integer input returns left edges

3 participants