Skip to content

WIP: Finally Use Mffpy for read_raw_egi#12981

Draft
scott-huberty wants to merge 3 commits intomne-tools:mainfrom
scott-huberty:mffpy
Draft

WIP: Finally Use Mffpy for read_raw_egi#12981
scott-huberty wants to merge 3 commits intomne-tools:mainfrom
scott-huberty:mffpy

Conversation

@scott-huberty
Copy link
Copy Markdown
Contributor

@scott-huberty scott-huberty commented Nov 21, 2024

closes #6937
Maybe also #11380 ?
xref #8038

This is not ready for review(!) But I am using mffpy for a project and so I figured I would get this on the board..

Maybe a good target date for merge would be just after the MNE 1.9 release, so that devs can work with it for a few months before it is shipped in a stable release.

TODO:

  • 1 Make it work
  • Make it work with EEG data
  • Make it work with EEG + PNS data
  • Make it work with data with acquisition skips
  • Re-integrate support for reading in channel status (e.g. bad channels)
  • Get CI's green (specifically the MFF testing code).
  • 2 Make it pretty
  • Clean up the code
  • Address all TODO/XXX comments
  • Add tutorial for reading in calibration data directly with mffpy
  • 3 RFC:
  • Do we remove our custom reading code or keep it for legacy purposes?

@scott-huberty scott-huberty marked this pull request as draft November 21, 2024 18:24
- more functional
- annotate acquisition skips
1. the current EGI reader does this. 2. If we don't do this, then the sensor positions will appear way larger than the head during calls like plot_sensors, plot_components etc.

It seems like in the sensorLayout.xml file, that type "2" indicates a head shape point.
@aman-coder03
Copy link
Copy Markdown
Contributor

aman-coder03 commented Feb 21, 2026

I came across this PR while exploring the EGI MFF refactor (referenced in #6937 and #8038) and wanted to share some findings after running the existing test suite on this branch
After installing mffpy and defusedxml and downloading the testing dataset, I ran pytest mne/io/egi/tests/ and got 18 failures, 5 passed
The dominant failure appears to be in _get_eeg_data():

n_channels = np.sum(list(mff_reader.num_channels.values()))  # allocates for EEG + PNS
...
data_chunk, _ = mff_reader.get_physical_samples_from_epoch(epoch)["EEG"]  # only reads EEG
...
eeg[:, start:end] = this_chunk  # shape mismatch

This causes errors like ValueError: could not broadcast input array from shape (257,X) into shape (265,Y) for files with PNS data, since the array is sized for all channels but only EEG chunks are being assigned
There also seem to be off-by-one sample count issues from using a float interval in np.arange, causing mismatches like shape (257,336) into shape (257,335)
I would love to contribute fixes if you are open to it, either here or in a separate PR, whatever you'd prefer
@drammock happy to hear your thoughts too

@scott-huberty
Copy link
Copy Markdown
Contributor Author

I came across this PR while exploring the EGI MFF refactor (referenced in #6937 and #8038) and wanted to share some findings after running the existing test suite on this branch After installing mffpy and defusedxml and downloading the testing dataset, I ran pytest mne/io/egi/tests/ and got 18 failures, 5 passed The dominant failure appears to be in _get_eeg_data():

n_channels = np.sum(list(mff_reader.num_channels.values()))  # allocates for EEG + PNS
...
data_chunk, _ = mff_reader.get_physical_samples_from_epoch(epoch)["EEG"]  # only reads EEG
...
eeg[:, start:end] = this_chunk  # shape mismatch

This causes errors like ValueError: could not broadcast input array from shape (257,X) into shape (265,Y) for files with PNS data, since the array is sized for all channels but only EEG chunks are being assigned There also seem to be off-by-one sample count issues from using a float interval in np.arange, causing mismatches like shape (257,336) into shape (257,335) I would love to contribute fixes if you are open to it, either here or in a separate PR, whatever you'd prefer @drammock happy to hear your thoughts too

@aman-coder03 feel free to create a superceding PR from this branch. But a friendly word of caution, IMO it will be more difficult to tackle this migration if you do not work with EGI/Magstim data and are unfamiliar with the file format and hardware. If you do not already need to work with EGI data, this might not be the best issue to tackle if you are a new contributor.

I started this PR because I have an EGI dataset with nested event ID's (i.e. stimulus/0, stimulus/1), which MNE's in house reader does not consume, but mffpy does. Unfortunately I found that swapping in an mffpy backend while making sure all existing tests still pass would require more time than I had (There was a hard deadline for my analysis), so I ended up implementing a bespoke fix for our dataset. It has been some time but IIRC, I ran into some issues with mffpy that would require upstream PR's, in order to get the package to work with our existing test suite.

@aman-coder03
Copy link
Copy Markdown
Contributor

@scott-huberty thanks for the context! Could you share what specific upstream mffpy issues you ran into? That would help me understand the real scope.

@PragnyaKhandelwal
Copy link
Copy Markdown
Contributor

@scott-huberty I've spent the last 24 hours auditing this branch. I have locally resolved the ValueError broadcasting mismatch and the UnboundLocalError you and @aman-coder03 discussed. I've also identified the root of the datetime parsing failures. I am preparing a superceding PR to finalize this migration. Would you like to review my indexing fix before I open the new PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Remove / Replace EGI reader by https://pypi.org/project/mffpy/

3 participants