Skip to content

Add exclude_after_unique option to mne.io.read_raw_edf/read_raw_edf to search for exclude channels after making channel names unique#12518

Merged
larsoner merged 8 commits intomne-tools:mainfrom
rcmdnk:remain_original_ch_names
Mar 29, 2024
Merged

Add exclude_after_unique option to mne.io.read_raw_edf/read_raw_edf to search for exclude channels after making channel names unique#12518
larsoner merged 8 commits intomne-tools:mainfrom
rcmdnk:remain_original_ch_names

Conversation

@rcmdnk
Copy link
Copy Markdown
Contributor

@rcmdnk rcmdnk commented Mar 27, 2024

What does this implement/fix?

This pull request addresses an issue with handling EDF/BDF/GDF files that contain channels with duplicate names. Previously, channels with duplicate names were assigned a suffix (-N) after the process of including/excluding channels, which made it impossible to selectively include or exclude specific instances of these channels.

The proposed solution is to apply the _unique_channel_names function before the inclusion/exclusion process. Additionally, the original channel names (orig_ch_names) are now stored in Raw._raw_extras. This inclusion is particularly useful as it allows users to determine whether channel names have been modified and facilitates a more detailed examination of the raw data alongside other contents of _raw_extras.

Additional information

This modification could impact existing workflows. For instance, consider an EDF file with channels named ["A", "A", "B", "C"]. If a user attempts to exclude channels "A" and "C" using the following code:

raw = mne.io.read_raw_edf("my.edf", exclude=["A", "C"])

The behavior will differ pre and post-PR:

  • Before PR: ch_names = ["B"]
  • After PR: ch_names = ["A-0", "A-1", "B"]

If a channel is excluded using a string (interpreted as a regex), the outcome remains unchanged:

raw = mne.io.read_raw_edf("my.edf", exclude="A")
  • Both before and after PR: ch_names = ["B", "C"]

@larsoner
Copy link
Copy Markdown
Member

This modification could impact existing workflows.

This is a bit dangerous. Could we instead exclude before and after renaming? Then old code would work as would the new code you want.

Additionally, the original channel names (orig_ch_names) are now stored in Raw._raw_extras. This inclusion is particularly useful as it allows users to determine whether channel names have been modified and facilitates a more detailed examination of the raw data alongside other contents of _raw_extras.

Users should never make use of private attributes (those with a leading underscore) and we shouldn't facilitate them doing so. They can change at any time with no warning. So -1 on including them in _raw_extras. Along these lines, we should also remove any mention of private attributes in the changelog, and just talk about the behavior of exclude being improved to support excluding channels based on their post-renaming names in addition to their pre-renaming names.

@rcmdnk
Copy link
Copy Markdown
Contributor Author

rcmdnk commented Mar 28, 2024

@larsoner
Thank you for comments.

Could we instead exclude before and after renaming? Then old code would work as would the new code you want.

exclude_after_unique (default False) parameter is added to read_raw_edf and etc.
This keeps backward compatibility and able to choose one of duplicated channel by exclude_after_unique=True.

How is this?

Users should never make use of private attributes (those with a leading underscore) and we shouldn't facilitate them doing so. They can change at any time with no warning. So -1 on including them in _raw_extras. Along these lines, we should also remove any mention of private attributes in the changelog, and just talk about the behavior of exclude being improved to support excluding channels based on their post-renaming names in addition to their pre-renaming names.

I understand.
orig_ch_names is removed from this PR.

On the other hand, sometimes I actually need these information.
Is it ok to make another PR to add orig_ch_names?

@rcmdnk rcmdnk changed the title Add 'orig_ch_names' to Raw._raw_extras for EDF, BDF, GDF, and apply _unique_channel_names before checking include/exclude channels Add exclude_after_unique option to mne.io.read_raw_edf/read_raw_edf to search for exclude channels after making channel names unique Mar 28, 2024
@larsoner
Copy link
Copy Markdown
Member

On the other hand, sometimes I actually need these information. Is it ok to make another PR to add orig_ch_names?

Sure but we should think about how. Ideally it could live in raw.info somewhere but I can't think of a good place. So instead maybe a little utility function like mne.io.edf.read_channel_names(...) that extracts just the original channel names or something would work?

Copy link
Copy Markdown
Member

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple if minor comments otherwise LGTM!

exclude_after_unique : bool
If True, exclude channels are searched for after they have been made
unique. This is useful to choose channels that have been made unique
by adding a suffix.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
by adding a suffix.
by adding a suffix. If False, the original names are checked.
.. versionadded:: 1.7

Comment on lines +1654 to +1657
exclude_after_unique : bool
If True, exclude channels are searched for after they have been made
unique. This is useful to choose channels that have been made unique
by adding a suffix.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DRY (don't repeat yourself) -- this should live in an entry in mne/utils/docs.py and be used all three places

@rcmdnk
Copy link
Copy Markdown
Contributor Author

rcmdnk commented Mar 29, 2024

So instead maybe a little utility function like mne.io.edf.read_channel_names(...) that extracts just the original channel names or something would work?

It sounds good.
I also always use n_samps and record_length in _raw_extras to calculate original frequencies.
These values are taken by using the private function _read_raw_header directly because I would like to know before making Raw object to divide into same frequency channels to keep the frequency.

I will try to make functions to read these information directly in another PR.

@rcmdnk
Copy link
Copy Markdown
Contributor Author

rcmdnk commented Mar 29, 2024

Just a couple if minor comments otherwise LGTM!

The doc for exclude_after_unique was updated and moved into mne/utils/docs.py.

@larsoner
Copy link
Copy Markdown
Member

Failures unrelated and comments addressed, in it goes, thanks @rcmdnk !

@larsoner larsoner merged commit 5a1d009 into mne-tools:main Mar 29, 2024
@rcmdnk rcmdnk deleted the remain_original_ch_names branch April 1, 2024 02:42
larsoner added a commit to larsoner/mne-python that referenced this pull request Apr 15, 2024
* upstream/main: (50 commits)
  ENH: Improve OPM auditory dataset and example (mne-tools#12539)
  MAINT: Bump to latest pydata-sphinx-theme (mne-tools#12228)
  MRG: Simplify manual installation instructions a little by dropping explicit mention of (lib)mamba (mne-tools#12362)
  fix PSD weights handling when bad annotations present (mne-tools#12538)
  Fix phase loading (mne-tools#12537)
  align FFT windows to good data spans in psd_array_welch (mne-tools#12536)
  explicitly disallow multitaper in presence of bad annotations (mne-tools#12535)
  MAINT: Clean up PyVista contexts (mne-tools#12533)
  MAINT: Complete API change of ordered (mne-tools#12534)
  MAINT: Reinstall statsmodels and improve logging (mne-tools#12532)
  MAINT: Remove scipy.signal.morlet2 (mne-tools#12531)
  Update README badge links (mne-tools#12529)
  BUG: Fix bug with reading his_id from snirf (mne-tools#12526)
  [pre-commit.ci] pre-commit autoupdate (mne-tools#12524)
  Fix file format check in _check_eeglab_fname function (mne-tools#12523)
  MAINT: Reenable picard in pre testing (mne-tools#12525)
  MAINT: Bump to large resource class (mne-tools#12522)
  MAINT: Restore 2 jobs on Windows (mne-tools#12520)
  Add exclude_after_unique option to mne.io.read_raw_edf/read_raw_edf to search for exclude channels after making channel names unique (mne-tools#12518)
  Improve consistency of sensor types in code and documentation (mne-tools#12509)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants