-
Notifications
You must be signed in to change notification settings - Fork 1
Resample TOA-TOF-converted histogram data using rebin
#245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| of the strictly increasing sections, with a step size equal to the difference | ||
| between the first two values of the section with the minimum start value (which is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with a step size equal to the difference between the first two values of the section with the minimum start value
In the previous verison, we were taking like a mean bin size. I felt it's was little safer than just taking the first one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My idea was to minimize the amount of resampling performed. This implementation should effectively "do nothing" for one of the frames. Can we keep it like this and see if it works well in practice?
| between the first two values of the section with the minimum start value (which is | ||
| not necessarily the first section). | ||
| """ | ||
| min_val, max_val = get_min_max(var, dim=dim, slices=slices) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we searching on the slices instead of just a nanmin/nanmax of var because the bounds of var might not belong to a section?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, we want to get the range that the union of sections covers.
| return sc.arange( | ||
| dim=dim, | ||
| start=min_val.value, | ||
| stop=max_val.value + step.value, # Ensure the last bin edge is included |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember struggling with arange and step before, with cases where if max_val was an exact multiple of step, then the max of the arange would be max_val.
E.g.

When using this to histogram afterwards, the max_val would then be dropped (because we include the min value and exclude the max when using bin edges).
Can the same thing happen here? (or maybe rebin does not have the same issue as hist?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rebin does have the same issue as hist. But this case is different: We did not compute a max of events. We computed a max of bin edges, so the resulting value should already be an epsilon above the last event (if there were events, which there are not).
Or did you ask about sth. else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it wasn't something different. Because it's max of bin edges, it's not an issue here. I think that's correct.
| assert sections == [slice(0, 2), slice(2, 5), slice(5, 8)] | ||
|
|
||
| def test_given_flat_sections_finds_strictly_increasing_parts_only(self): | ||
| var = sc.array(dims=['x'], values=[1, 2, 2, 3, 4, 4, 5]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: what happens with something like [1, 2, 2, 2, 3, 4, 4, 5]? (note the three 2s)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a test now.
| assert max_val.value == 50 # The max value is at the end of the second slice | ||
|
|
||
| def test_with_float_data(self): | ||
| var = sc.array(dims=['x'], values=[1.1, 2.2, 3.3, 4.4, 5.5]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it also work with datetimes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we expect datetime input coords? Does tof support that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't expect datetimes. I was just wondering if it could be used as a general tool.
But let's not complicate things.
nvaytet
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Thanks for getting this to work in a proper way 👍
This replaces the$N\rightarrow\infty$
to_eventsmechanism, which introduced random noise in the bin-edges, causing problems in stream processing. Forto_eventsshould produce the same result as therebinsolution (module differences in how the bin size is chosen).This also fixes the (I think) bug in
to_events, which included counts from unphysical negative-size bins between frames.