Skip to content

MRG: Brainvision events_from_annotations#6326

Merged
massich merged 10 commits intomne-tools:masterfrom
massich:6276_mod
May 17, 2019
Merged

MRG: Brainvision events_from_annotations#6326
massich merged 10 commits intomne-tools:masterfrom
massich:6276_mod

Conversation

@massich
Copy link
Copy Markdown
Contributor

@massich massich commented May 16, 2019

Closes #6267
Closes #6276

@larsoner
Copy link
Copy Markdown
Member

Can you quickly say what's different relative to #6276?

@codecov
Copy link
Copy Markdown

codecov bot commented May 16, 2019

Codecov Report

Merging #6326 into master will increase coverage by <.01%.
The diff coverage is 98.21%.

@@            Coverage Diff             @@
##           master    #6326      +/-   ##
==========================================
+ Coverage   89.23%   89.23%   +<.01%     
==========================================
  Files         417      417              
  Lines       75420    75460      +40     
  Branches    12417    12424       +7     
==========================================
+ Hits        67302    67339      +37     
- Misses       5231     5233       +2     
- Partials     2887     2888       +1

@massich
Copy link
Copy Markdown
Contributor Author

massich commented May 16, 2019

It simplifies the function and addresses some of the issues it had. But again this is the same as having a huge dict with the equivalences and return it.

@larsoner
Copy link
Copy Markdown
Member

@massich do you have more commits to push? If not, I can try and set it non-WIP

@jona-sassenhagen okay for you as requested by @cbrnr not to drop any of the events? I agree that just doing 2001, 2002, ... for all non-stim, non-resp BV markers in a sorted list would work. In my mind that's the one last unresolved issue. I can push a commit for this, too, if we agree it's useful (or @massich can, but it's post-work-hours in Europe now).

@jona-sassenhagen
Copy link
Copy Markdown
Contributor

Yes ok with me.

@larsoner larsoner marked this pull request as ready for review May 16, 2019 19:30
@larsoner larsoner changed the title alternative to integrate into #6276 MRG: Brainvision events_from_annotations May 16, 2019
@larsoner
Copy link
Copy Markdown
Member

@jona-sassenhagen @cbrnr @massich can you look and make sure that you're happy with the docstring describing the behavior of this? I've also modified tests that hopefully verify it works according to the description.

@larsoner larsoner added this to the 0.18 milestone May 16, 2019
@cbrnr
Copy link
Copy Markdown
Contributor

cbrnr commented May 16, 2019

The docstring looks good! If everything works as advertised, +1 for merge!

@larsoner
Copy link
Copy Markdown
Member

Just pushed a commit to shorten the docstring, but the changes are minor

@massich
Copy link
Copy Markdown
Contributor Author

massich commented May 16, 2019

@larsoner ok, I just saw what you did. The only fuzzy thing is why we need lambda on the default?
here https://github.com/mne-tools/mne-python/pull/6326/files#diff-3552c91528439a66cb184c8d9b9cfcb5R873

is it 'cos we need something that returns None once called? I find it more tricky to read but I like it.

@massich
Copy link
Copy Markdown
Contributor Author

massich commented May 16, 2019

Shall we backport this one to 0.17? Will we be releasing the last 0.17.X?

@larsoner
Copy link
Copy Markdown
Member

Usually after releasing a new version we stop backporting things, so no probably not. FYI here is the CircleCI rendering

https://13163-1301584-gh.circle-artifacts.com/0/dev/generated/mne.events_from_annotations.html

@larsoner
Copy link
Copy Markdown
Member

is it 'cos we need something that returns None once called? I find it more tricky to read but I like it.

Yes it's because we need the () at the end, it just saves us from a conditional branch (the conditional is implicit in the get)

@larsoner
Copy link
Copy Markdown
Member

There is a merge conflict now, feel free to rebase

'New Segment/', 'Stimulus/S253', 'Stimulus/S255', 'Stimulus/S254',
'Stimulus/S255', 'Stimulus/S254', 'Stimulus/S255', 'Stimulus/S253',
'Stimulus/S255', 'Response/R255', 'Stimulus/S254', 'Stimulus/S255',
'SyncStatus/Sync On', 'Optic/O 1'
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might also include an S 10 in there to ensure it's parsed correctly.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean like 0f1759e ?

@cbrnr
Copy link
Copy Markdown
Contributor

cbrnr commented Jun 4, 2019

I don't know, I would not expect a FIF file to be parsed like a BV file, even if we're just talking about events. I'd prefer events_from_annotations(raw, 'brainvision') to get back the desired events.

@larsoner
Copy link
Copy Markdown
Member

larsoner commented Jun 4, 2019

I don't know, I would not expect a FIF file to be parsed like a BV file, even if we're just talking about events.

I agree for FIF files from the Neuromag scanner, but not for BV files that we have read in and then written to FIF format (mostly) just because it's the only output format we support. The problematic case that @jona-sassenhagen brings up seems to show the problem, basically:

raw = read_raw_brainvision(...)
raw.save(fname)
raw_read = read_raw_fif(fname)
events_from_annotations(raw_read) != events_from_annotations(raw)

This behavior is not so good.

In my mind the question is how big is the risk that someone using a non-BV file will add a BV-like event? Given the specificity of the event naming I think it's pretty low and we should just go to a any-BV-like parsing method.

@drammock
Copy link
Copy Markdown
Member

drammock commented Jun 4, 2019

+1 for the approach of looking at the annotations, and printing a warning if any of them look like a BV-style annotation and mode != 'bv'

candidate for backport?

@larsoner
Copy link
Copy Markdown
Member

larsoner commented Jun 4, 2019

Yeah we'll want to backport. Sounds like people prefer to stick with the class-based check / my original 3-step proposal above.

@massich can you look?

@agramfort
Copy link
Copy Markdown
Member

agramfort commented Jun 4, 2019 via email

@cbrnr
Copy link
Copy Markdown
Contributor

cbrnr commented Jun 5, 2019

@larsoner yes I think your approach is fine, but why do you need two parameters mode and event_id? Currently, mne.events_from_annotations only has an event_id parameter, which should suffice to implement what you suggest. Let's not add a new mode parameter if we don't really need it.

@agramfort
Copy link
Copy Markdown
Member

agramfort commented Jun 5, 2019 via email

@massich
Copy link
Copy Markdown
Contributor Author

massich commented Jun 5, 2019

let me rephrase based on @larsoner snippet:

raw = read_raw_brainvision(...)
raw.save(fname)
raw_read = read_raw_fif(fname)
events_from_annotations(raw_read, event_id='auto') != events_from_annotations(raw, event_id='auto')

and now we are proposing this:

def events_from_annotations(raw, event_id="auto",
                            regexp=r'^(?![Bb][Aa][Dd]|[Ee][Dd][Gg][Ee]).*$',
                            use_rounding=True, chunk_duration=None,
                            verbose=None):
...
    event_id = _check_event_id(event_id, raw)
...


def _check_event_id(event_id, raw):
    from mne.io.brainvision import _BVEventParser as bv_default
    from mne.io.brainvision import _check_raw_fif_from_original_bv as black_magic

    if event_id is None:
        return _DefaultEventParser()
    elif event_id == 'brainvision':
        return bv_default()
    elif event_id == 'auto':
        candidate = getattr(raw, '_get_auto_event_id', _DefaultEventParser)()

        if black_magic(raw) and candidate != bv_default:
            warn("you might want to try `event_id='brainvision'`)

        return candidate
    elif callable(event_id) or isinstance(event_id, dict):
        return event_id
    else:
        raise ValueError('Invalid input event_id')

So that this would happen:

raw = read_raw_brainvision(...)
events_when_raw_was_bv = events_from_annotations(raw)
raw.save(fname)
raw_read = read_raw_fif(fname)
events_from_annotations(raw_read, event_id='auto') != events_when_raw_was_bv
# you get a warining

events_from_annotations(raw_read, event_id='brainvision') == events_when_raw_was_bv

My opinion is that 'auto' is a bad idea. I think that events_from_annotations
requires event_id as much as requires raw. We cannot get away with
events_from_annotations(raw.annotations) because we need raw.sfreq. In the same
manner we cannot assign ids if we don't know them. I've never seen a problem with
a code like this:

from mne.io.brainvision import event_ids as bv_events
raw = read_raw_fif(fname_of_an_old_bv)
events, _ = events_from_annotations(raw, event_id=bv_events)

If you want to get brainvision events (or any other type) you need to pass a proper
event_id that translate to those. I don't see this as a complicate and obscure coding
pattern for newcomers (We have much worst).

Educating the users that you need event_id or you get _DefaultEventParser which gives
you consecutive numeric ids for each description is not the end of the world. Otherwise,
we update _DefaultEventParser to behave like mne.io.brainvision.bv_events as it was
in 0.16 and _DefaultEventParser would start giving you ids starting at (3000).

I could see people arguing that using strings is better (maybe, I'm not sure. I like the idea of having to import the translator. I find it more explicit):

raw = read_raw_fif(fname_of_an_old_bv)
events, _ = events_from_annotations(raw, event_id='brainvision')

But I fear that if we start introspecting and guessing which was the original format
we are opening a can of worms.

At this point I would just make 'auto' behave like None and deprecate it (because we released in 0.18).
And if we want 'brainvision' and 'bla' to be shorthands to avoid the imports thats fine with me.

@jona-sassenhagen
Copy link
Copy Markdown
Contributor

But I fear that if we start introspecting and guessing which was the original format we are opening a can of worms.

In prickle yes, but in practice BV stimulus events are extremely regular and specific, and everyone who uses events like "Stimulus/S100" will want it to be parsed to 100.

I've never seen a problem with a code like this:

Again, in principle yes, but right now a lot of people are working with mne.find_events still, some are making the transition right now, others might mostly use other formats than BV and so on.

It seems to me like the point of annotations is to make working with non-int events easier, but here we have something that the experimenter has programmed into their software as an int, is stored by BV as an int plus some predictable boiler plate, and is then universally parsed as an int. But we somehow make it hard to use? It's literally 8-bit integer codes!

To be frank, this should have been handled back when annotations were conceived ... I should have been more involved there, to represent BV. Now we will end up with a less than optimal solution either way.

At this point I would just make 'auto' behave like None and deprecate it (because we released in 0.18).

Ok, maybe? If we add big warning messages for every usage of events_from_annotations on RawBrainvision objs for a cycle or three, and tell every BV user to always use event_id='brain vision' in all BV-derived code.

@massich
Copy link
Copy Markdown
Contributor Author

massich commented Jun 5, 2019

def _check_event_id(event_id, raw):
    from mne.io.brainvision import _BVEventParser as bv_default

    if event_id is None:
        return getattr(raw, '_get_auto_event_id', _DefaultEventParser)()
    elif event_id == 'auto':
        warn("`event_id='auto'` is deprecated. Use  `event_id=None` instead",
              DeprecationWarning)
        return getattr(raw, '_get_auto_event_id', _DefaultEventParser)()
    elif event_id == 'brainvision':
        return bv_default()
    elif callable(event_id) or isinstance(event_id, dict):
        return event_id
    else:
        raise ValueError('Invalid input event_id')

@larsoner
Copy link
Copy Markdown
Member

larsoner commented Jun 5, 2019

elif event_id == 'auto':

-1 on this. I am with @jona-sassenhagen that this mode is useful for BV users and we can potentially extend it to others (e.g., CTF once it's done).

The proposal that @agramfort seems to support now is actually not the one above (about warning) but rather a modified version of the second proposal, which I think would be:

  • instance of RawBrainvision, use BV parser
  • instance of RawFIF with any BV-like event, use BV parser
  • (also probably want: instance of RawArray with any BV-like event, use BV parser)

I'm fine with this. It does not seem too fragile or risky, and simplifies things for BV users.

This can also be how we treat other readers. For example for the CTF annotations we are adding, if they end up having standard names -- instance of RawFIF/RawArray with CTF-like annotations, or RawCTF: use CTF event parsing. So it's a sustainable approach, too. (And in extreme corner cases where CTF and BV-like events are found, which should be pretty much never, we can raise an error or warn or something.)

@massich
Copy link
Copy Markdown
Contributor Author

massich commented Jun 5, 2019

To be frank, this should have been handled back when annotations were conceived ... I should have been more involved there, to represent BV. Now we will end up with a less than optimal solution either way.

It's not a problem of the annotation changes. Before 0.15 events_from_annotations was not general, had 'brainvision' logic embedded in it (which was not even complete. because if you had 'Foo/F100' and 'Bar/B100' both were mapped to 100). The only thing we did was make it explicit. You need to pass a dict, callable or anything that makes the events at your liking. What its true is that we changed the default from a half-implemented 'brainvision' specification to something that was giving incremental numbers. If we want to make the 'brainvision' specification the default just because this is the main use case for 99.99% of our users we can argue that. But again, the events_id would not be the same as in 0.15 because now we have a more complete implementation of the specification.

Despite having the opinion that:

  • 'brainvision' is only 1 of our 15 readers and thats why it should not be the default. The default should be something really simple and let every reader expose something to translate.

I will stay out of the pull. I'll be +0 in making 'barinvision' the default of events_from_annotations. (which is what @agramfort seems to be pointing at: if I've no idea i'll use 'brainvision' because raw_fif has no annotations)

@massich
Copy link
Copy Markdown
Contributor Author

massich commented Jun 5, 2019

@larsoner what are you -1 ? to the warning? we could keep both approaches:

def _check_event_id(event_id, raw):
    from mne.io.brainvision import _BVEventParser as bv_default

    if event_id is None or event_id='auto':
        return getattr(raw, '_get_auto_event_id', _DefaultEventParser)()
    elif event_id == 'brainvision':
        return bv_default()
    elif callable(event_id) or isinstance(event_id, dict):
        return event_id
    else:
        raise ValueError('Invalid input event_id')

@cbrnr
Copy link
Copy Markdown
Contributor

cbrnr commented Jun 5, 2019

This discussion shows that the underlying problem is really that we now have two things to represent discrete events - see #6018. Whatever we do here, it will never be perfect, because we decided to represent discrete events stored in files as annotations. Annotations contain everything you need, including string descriptions. I know that superficially it is easier to work with a NumPy array, but you could do everything with annotations. We'd only need to add support for annotations everywhere we currently accept events.

@larsoner
Copy link
Copy Markdown
Member

larsoner commented Jun 5, 2019

what are you -1 ? to the warning?

I am -1 to deprecating 'auto'. I am +1 to what @agramfort came around to, which is the proposal to check BV class, then check if RawFIF or RawArray with bv-like events. It seems cleanest to me.

We'd only need to add support for annotations everywhere we currently accept events.

Yes in this case the "only" includes critically mne.Epochs, and really we'd just end up with the same problem there (how to auto-create event_id to extract BV events). But let's keep discussion of this stuff contained to #6018 so we do not get derailed here.

@cbrnr
Copy link
Copy Markdown
Contributor

cbrnr commented Jun 5, 2019

Yes in this case the "only" includes critically mne.Epochs, and really we'd just end up with the same problem there (how to auto-create event_id to extract BV events). But let's keep discussion of this stuff contained to #6018 so we do not get derailed here.

Yes, let's discuss in #6018 - but in the case of Epochs and annotations, why would you need event IDs when the user would specify the string descriptions that should be used for epoch creation?

@massich
Copy link
Copy Markdown
Contributor Author

massich commented Jun 5, 2019

Yes, let's discuss in #6018 - but in the case of Epochs and annotations, why would you need event IDs when the user would specify the string descriptions that should be used for epoch creation?

raw = read_raw_brainvision(...)
raw.save(fname)
raw_read = read_raw_fif(fname)

events, event_id = events_from_annotations(raw)
# or any other of those:
#
# events, event_id = events_from_annotations(raw, event_id='auto')
# events, event_id = events_from_annotations(raw, event_id='brainvision')
# event_id = mne.io.brainvision.bla
# events, _ = events_from_annotations(raw, event_id=event_id)

epoch = Epochs(raw, events=events, event_id=event_id)

The thing is that until now Epochs require events and event_id to make the link.
Because all this events_from_annotations that takes and returns event_id is to call
Epochs.
I agree with @cbrnr that if this was possible, then there would be no discussion

raw = read_raw_brainvision(...)
raw.save(fname)
raw_read = read_raw_fif(fname)
my_descriptors = ['foo', 'bar']
epoch = Epochs(raw, my_descriptors)

@massich
Copy link
Copy Markdown
Contributor Author

massich commented Jun 5, 2019

The other thing that nobody mentioned is the elephant in the room: Why everything has to be fif. If I could save brainvision files I would have no trouble.

raw = read_raw_brainvision('name.vhdr')
raw.save('bla.vhdr')
raw2 = read_raw_brainvision('bla.vhdr')

and then suport read_raw:

raw = read_raw('name.vhdr')
raw.save('bla.vhdr')
raw2 = read_raw('bla.vhdr')

To me an image is an image, regardless of the format that once it was stored.

@agramfort
Copy link
Copy Markdown
Member

agramfort commented Jun 5, 2019 via email

@massich
Copy link
Copy Markdown
Contributor Author

massich commented Jun 5, 2019

@agramfort there's a small PR comming with those changes. To discuss in concrete terms.

@jona-sassenhagen
Copy link
Copy Markdown
Contributor

I’d think alex’ second proposal should work.

A BV writer I don’t like. .fif isn’t perfect, but a lot of work has gone into making it a standard. And you’ll save your epochs and Ica Objs as fif too.
More generally, instead of writing a raw BV using MNE, you could just stick to using the original raw file ...

@jona-sassenhagen
Copy link
Copy Markdown
Contributor

Great thanks @massich

@cbrnr
Copy link
Copy Markdown
Contributor

cbrnr commented Jun 6, 2019

I agree with @cbrnr that if this was possible, then there would be no discussion

raw = read_raw_brainvision(...)
raw.save(fname)
raw_read = read_raw_fif(fname)
my_descriptors = ['foo', 'bar']
epoch = Epochs(raw, my_descriptors)

And why shouldn't this be possible? We could check whether the second argument is an nd-array and treat it as events or a string/list of strings and get the onsets from raw.annotations. Doesn't look too difficult, but maybe I'm missing something (because I've never really used Epochs).

Re the elephant, I agree that FIFF is not a universal format. Few tools outside of MNE can even read FIFF, so if I really want to store some data I go for something else, e.g. the BIDS-EEG supported formats EDF or BrainVision, or even EEGLAB, which is just a MAT file many tools can read. (For the sake of reproducibility, it's always best to only use the raw data, but I know that there are situations where you want to store some intermediate results to disk.)

Currently, MNE can't export any of these three formats for different reasons, so I've started to implement that in MNELAB. Currently, I can export to EEGLAB and EDF/BDF, and I will add BrainVision export based on the implementation by @palday soon. We could discuss if any of these formats might be suitable for direct integration into MNE.

Re read_raw, this discussion comes up regularly, e.g. #1930 and #4675. I'd also be +1 for having a generic reader function, but it seems to be quite a lot of effort.

@agramfort
Copy link
Copy Markdown
Member

agramfort commented Jun 6, 2019 via email

@cbrnr
Copy link
Copy Markdown
Contributor

cbrnr commented Jun 6, 2019

that could work but a number of functions that take epochs as input will
use the epochs.events attribute. We need to be careful that certain epochs
will not work for certain functions. I am afraid of a rabbit hole here.

But we could start with adding this feature while not touching the old behavior at all (i.e. leaving the events-based interface as default).

@massich
Copy link
Copy Markdown
Contributor Author

massich commented Jun 6, 2019

I'm in favor of all those things. But we need a discussion and a roadmap.

@agramfort
Copy link
Copy Markdown
Member

agramfort commented Jun 6, 2019 via email

@hoechenberger
Copy link
Copy Markdown
Member

hoechenberger commented May 2, 2020

Re the elephant, I agree that FIFF is not a universal format. Few tools outside of MNE can even read FIFF, so if I really want to store some data I go for something else, e.g. the BIDS-EEG supported formats EDF or BrainVision, or even EEGLAB, which is just a MAT file many tools can read. (For the sake of reproducibility, it's always best to only use the raw data, but I know that there are situations where you want to store some intermediate results to disk.)

This has actually been a huge pain for me in the past too. None if my non-MNE-using colleagues can read those FIFF files, let alone knows which software could potentially be used to open them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BrainVision annotation parsing is needlessly complicated

7 participants