Add tolerance parameter to events_from_annotations to  #12321

@rcmdnk

Describe the new feature or enhancement

Add a tol parameter to events_from_annotations so that epochs are not dropped because of floating-point rounding errors.

Describe your proposed implementation

The function events_from_annotations contains the following section of code:

for annot in annotations[event_sel]:
    annot_offset = annot["onset"] + annot["duration"]
    _onsets = np.arange(
        start=annot["onset"], stop=annot_offset, step=chunk_duration
    )
    good_events = annot_offset - _onsets >= chunk_duration
    if good_events.any():

However, if we consider values like

annot['onset']=32760.12
annot['duration']=30.0
chunk_duration=30.0

In theory, this region should be obtained as a 30-second epoch, but due to rounding errors, it results in:

>>> annot_offset = annot['onset'] + annot['duration']
>>> annot_offset - np.arange(annot['onset'], annot_offset, 30.0) - 30.0
array([-3.63797881e-12])

This causes good_events to contain False, and thus, this epoch is not captured.
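The size of this error can be checked without NumPy. The following is a minimal sketch using only the standard library (math.ulp requires Python 3.9 or later); the variable names shortfall and spacing are illustrative and not part of MNE. It compares the shortfall with the spacing between adjacent doubles near annot_offset:

```python
import math

onset, duration, chunk_duration = 32760.12, 30.0, 30.0

# The sum is rounded to the nearest representable double.
annot_offset = onset + duration

# How far the recovered span falls short of chunk_duration.
shortfall = (annot_offset - onset) - chunk_duration

# Spacing between adjacent doubles near annot_offset (Python 3.9+).
spacing = math.ulp(annot_offset)

print(shortfall)            # negative, on the order of 1e-12
print(shortfall / spacing)  # a fraction of one float spacing
```

The shortfall is no larger than half a float spacing, which suggests that any tolerance far above that scale, yet far below one sample period, would be enough to absorb it.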

However, in reality, this interval's data should be fully treated as an epoch. Therefore, I'm considering introducing a tolerance and modifying the code like this:

good_events = annot_offset - _onsets >= chunk_duration - tol

Subtracting a small tol from the threshold avoids the situation described above. As long as tol is much smaller than one sample period (the inverse of the sampling frequency), it should not erroneously include intervals that are shorter than chunk_duration.
This is safe as long as use_rounding = True.
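As a rough illustration of the proposal, here is a pure-Python sketch of the chunking logic with the comparison relaxed by tol. The helper chunk_onsets is hypothetical (it is not an MNE function), and it assumes that np.arange yields ceil((stop - start) / step) values of the form start + i * step:

```python
import math


def chunk_onsets(onset, duration, chunk_duration, tol=0.0):
    """Return chunk start times that fit inside one annotation (hypothetical helper)."""
    annot_offset = onset + duration
    # Number of candidates, assumed to match np.arange(onset, annot_offset, chunk_duration).
    n_candidates = max(math.ceil((annot_offset - onset) / chunk_duration), 0)
    candidates = [onset + i * chunk_duration for i in range(n_candidates)]
    # Keep a chunk only if the remaining span covers chunk_duration to within tol.
    return [t for t in candidates if annot_offset - t >= chunk_duration - tol]


strict = chunk_onsets(32760.12, 30.0, 30.0)              # tol defaults to 0
tolerant = chunk_onsets(32760.12, 30.0, 30.0, tol=1e-8)  # small tolerance
partial = chunk_onsets(0.0, 45.0, 30.0, tol=1e-8)        # trailing 15 s is too short

print(strict)    # the epoch is dropped
print(tolerant)  # the epoch is kept
print(partial)   # only the first full chunk is kept
```

With tol = 0 the behaviour matches the current comparison, while a small positive tol keeps the 30-second epoch and still rejects the trailing 15-second remainder in the last call.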

If use_rounding = False, the first sample of the epoch could overlap with the previous epoch (with tol = 0, such an epoch is not created at all).
Even in this case, I think it is better to keep the epoch.

With a default value of tol = 0, existing calls would keep returning the same results as they do now.

Describe possible alternatives

Currently, I am using Annotations that are precisely segmented into 30-second intervals. By setting chunk_duration = 0, I can create indexes without needing to calculate the duration.

This approach ensures that all epochs are included in the output data.

However, annotations sometimes have durations that are an integer multiple of the epoch length. In those cases, the annotations have to be modified before they are applied to the Raw data.
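One way to do that modification is to split each long annotation into epoch-length pieces up front. This is a sketch only: split_annotation is a hypothetical helper, the description string is made up for the example, and it assumes the duration is an integer multiple of the epoch length up to rounding error:

```python
def split_annotation(onset, duration, epoch_len, description):
    """Split one annotation into (onset, duration, description) pieces (hypothetical helper)."""
    # Round the ratio so a tiny floating-point error does not drop the last piece.
    n_epochs = round(duration / epoch_len)
    return [(onset + i * epoch_len, epoch_len, description) for i in range(n_epochs)]


# A 90-second annotation becomes three 30-second pieces.
pieces = split_annotation(32760.12, 90.0, 30.0, "stage")
for piece_onset, piece_duration, piece_description in pieces:
    print(piece_onset, piece_duration, piece_description)
```

The resulting onsets, durations, and descriptions could then be used to build a new set of annotations in which every entry is exactly one epoch long.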

Additional context

No response
