
adding a bad channel detection method using LOF algorithm #11234

Merged
larsoner merged 23 commits into mne-tools:main from vpKumaravel:badChannelLOF
Mar 1, 2024

Conversation

@vpKumaravel (Contributor) commented Oct 10, 2022

Reference issue

NA

What does this implement/fix?

This PR adds a new feature: bad channel detection using the Local Outlier Factor (LOF) algorithm.
File name: mne/preprocessing/detect_bad_channels.py

Additional information

The proposed algorithm is used in the Newborns EEG Artifact Removal (NEAR) pipeline published earlier this year. Recently, we analyzed data from adult subjects and found the algorithm adapts well to different populations. This contribution is a first step towards a fully automated preprocessing pipeline based on MNE-Python. Your feedback is greatly appreciated.

Best regards,
Velu

welcome bot commented Oct 10, 2022

Hello! 👋 Thanks for opening your first pull request here! ❤️ We will try to get back to you soon. 🚴🏽‍♂️

@drammock (Member) left a comment


Thanks for the contribution! We'll want to take a closer look at your paper before reviewing/merging... in the meantime, here are some quick comments on things I noticed just skimming the docstring and code.

@agramfort (Member) commented Oct 17, 2022

@vpKumaravel what I would do is take some MEG data from OpenNeuro and see how this compares with find_bad_channels_maxwell. Just report the amount of overlap.

You could start with the data we use in our tutorials here. For example, the MNE sample dataset has 2 clear bad channels (one grad and one EEG).
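The overlap report suggested here can be sketched with plain set arithmetic; the channel names below are illustrative placeholders, not actual detection results:

```python
def overlap_report(bads_a, bads_b):
    """Summarize agreement between two bad-channel lists."""
    set_a, set_b = set(bads_a), set(bads_b)
    return {
        "both": sorted(set_a & set_b),    # flagged by both methods
        "only_a": sorted(set_a - set_b),  # flagged by method A only
        "only_b": sorted(set_b - set_a),  # flagged by method B only
    }

# Illustrative channel names, not real results:
report = overlap_report(["MEG 2443", "EEG 053"], ["MEG 2443", "MEG 1032"])
```

The size of `both` relative to the two `only_*` lists is the "amount of overlap" being asked for.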

@vpKumaravel (Contributor, Author)

> @vpKumaravel what I would do is take some MEG data from OpenNeuro and see how this compares with find_bad_channels_maxwell. Just report the amount of overlap.
>
> You could start with the data we use in our tutorials here. For example, the MNE sample dataset has 2 clear bad channels (one grad and one EEG).

@agramfort Sorry for the delay! I was busy with my PhD thesis defense.

Here are the notebook files in which I validated LOF against Maxwell on both the MNE sample data and an OpenNeuro MEG dataset. This is the first time I have validated LOF on MEG data. I think the results are promising. Please let me know what you think.

If you want, I can also do the same for EEG datasets and compare the results against autoreject, since Maxwell filtering does not deal with EEG. However, autoreject considers epoched data, while LOF works on continuous data.

Notebook 1: https://colab.research.google.com/drive/15MiXEWGyExvpQxuM7HedxT0zogwwzjVk?usp=sharing
Notebook 2: https://colab.research.google.com/drive/17X7DJ3p231B20fNb6A75pwX1WSN0Qnm4?usp=sharing

Thanks for your time :)

@agramfort (Member) commented Apr 10, 2023 via email

@vpKumaravel (Contributor, Author)

> This looks like a single-dataset experiment, using our eyes to evaluate the method. I would say it needs a more quantitative evaluation, on a relevant metric and computed over a few datasets, to be fully convincing. You can maybe find inspiration on this in the autoreject paper?


Thanks, Alexandre!
For EEG datasets, I have already done such an extensive analysis in the past, where we used the F1 score metric for comparative evaluation (results here).
I will try to look for MEG datasets with labeled ground-truth bad sensors to do the same.

-V

@drammock drammock added the needs-discussion issues requiring a dev meeting discussion before the way forward is clear label Apr 11, 2023
@drammock (Member)

@agramfort let's discuss at our next dev meeting; on a quick look, the Sensors paper looks like a reasonable quantification of the benefit, for EEG at least.

@agramfort (Member) commented Apr 11, 2023 via email

@drammock (Member)

@jasmainak I wonder if you could weigh in here (with your author-of-autoreject hat on)? @larsoner and @britta-wstnr and I were chatting today and wondering whether this bad channel detection method is a good candidate for inclusion in MNE, vs living in its own small package. Here is the paper with the comparison to other methods (EEG only): https://www.mdpi.com/1424-8220/22/19/7314/htm

@jasmainak (Member)

autoreject is pretty "comprehensive" ... it detects artifacts at the level of single trials, then uses that to detect bad epochs and/or repair bad segments of the data using physics-based interpolation.

I quickly read the code of the LOF algorithm and it seems similar to the FASTER/RANSAC family of algorithms in that it works on the entire time period or epoch. The threshold of such algorithms is typically tuned on a large dataset, so they tend to work fine for a wide variety of cases, and they can be useful as a "quick" preprocessing step before other algorithms like Maxwell filtering, ICA, etc.

However, I suspect there is a chance of losing "good" data by removing channels that are not bad throughout the recording. Conversely, if a channel has 20% of its trials bad and is not marked bad, those 20% of trials will be completely removed during epoching. I think these algorithms are hard to test empirically because it depends on how the performance metric is chosen/designed and what it is sensitive to. It might also be worth considering whether there is an advantage for the user over the domain-agnostic outlier detection methods already available in sklearn.

Having said that, if many users find such a function useful in their workflows, I would say why not.

@vpKumaravel (Contributor, Author)

> autoreject is pretty "comprehensive" ... it detects artifacts at the level of single trials, then uses that to detect bad epochs and/or repair bad segments of the data using physics-based interpolation.
>
> I quickly read the code of the LOF algorithm and it seems similar to the FASTER/RANSAC family of algorithms in that it works on the entire time period or epoch. The threshold of such algorithms is typically tuned on a large dataset, so they tend to work fine for a wide variety of cases, and they can be useful as a "quick" preprocessing step before other algorithms like Maxwell filtering, ICA, etc.
>
> However, I suspect there is a chance of losing "good" data by removing channels that are not bad throughout the recording. Conversely, if a channel has 20% of its trials bad and is not marked bad, those 20% of trials will be completely removed during epoching. I think these algorithms are hard to test empirically because it depends on how the performance metric is chosen/designed and what it is sensitive to. It might also be worth considering whether there is an advantage for the user over the domain-agnostic outlier detection methods already available in sklearn.
>
> Having said that, if many users find such a function useful in their workflows, I would say why not.

Thanks, everyone, for the nice discussion.
If I may add, LOF is also quite different from the FASTER algorithm. First, LOF does not assume any distribution for the data (unlike FASTER), since it detects outliers based on the density of clusters. Second, even though it considers the entire time period, LOF finds outliers within a "local" neighborhood determined by the k-nearest-neighbors algorithm. In other words, LOF does not consider each channel separately; it computes the outlier score based on the relative density distribution of different M/EEG channel clusters. That said, I completely agree, and I suggest users calibrate both the LOF threshold (a default value of 1.5 seems a good starting point) and the number of nearest neighbors (a default value of 20 seems a good starting point).
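A minimal sketch of this idea, using scikit-learn's LocalOutlierFactor with the defaults mentioned above (threshold 1.5, 20 neighbors). The function name and scoring convention here are illustrative only, not the PR's final implementation:

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

def lof_bad_channels(data, ch_names, n_neighbors=20, threshold=1.5):
    """Flag outlier channels in `data`, shaped (n_channels, n_times)."""
    clf = LocalOutlierFactor(n_neighbors=n_neighbors)
    clf.fit(data)  # each channel's time series is treated as one sample
    scores = -clf.negative_outlier_factor_  # LOF score; ~1 for inliers
    return [ch for ch, s in zip(ch_names, scores) if s > threshold]
```

Because LOF scores each channel relative to its nearest neighbors, a channel is flagged only when its time series is unusually far from the local cluster of channels, not merely when it exceeds a global amplitude cutoff.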

Moreover, there are researchers who look for bad channels across the whole recording rather than epoch-wise. I did a survey on Twitter over a year ago, and here are the responses (link).

@larsoner (Member)

@vpKumaravel one sticking point will be performance on different channel types. MNE is used for EEG but also (quite often) for MEG. Would it be possible for you to compare find_bad_channels_maxwell to your algorithm when applied to MEG data on some datasets (e.g., subjects from https://openneuro.org/datasets/ds000117, and maybe another MEG dataset)? FYI, you'll probably have to process magnetometers (meg='mag') and gradiometers (meg='grad') separately, since they generally have very different scales (and units).
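The per-channel-type concern can be sketched generically: run a detector on each type's channels separately, so any shared threshold is applied within, not across, scales. Everything below, including the toy variance-based detector, is illustrative, not the PR's code:

```python
import numpy as np

def detect_per_type(data, ch_types, detect_fn):
    """Apply detect_fn to each channel type's rows separately.

    data: (n_channels, n_times); ch_types: one type label per channel.
    Returns sorted global indices of flagged channels.
    """
    bads = []
    for this_type in sorted(set(ch_types)):
        picks = [i for i, t in enumerate(ch_types) if t == this_type]
        for local_idx in detect_fn(data[picks]):
            bads.append(picks[local_idx])  # map back to global index
    return sorted(bads)

# Toy detector: flag rows whose std exceeds 5x the per-type median std.
def toy_detect(d):
    stds = d.std(axis=1)
    return [i for i, s in enumerate(stds) if s > 5 * np.median(stds)]
```

A single global threshold would miss a noisy gradiometer whose absolute scale is still tiny next to the magnetometers; grouping by type avoids that.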

@vpKumaravel (Contributor, Author) commented Apr 28, 2023

> @vpKumaravel one sticking point will be performance on different channel types. MNE is used for EEG but also (quite often) for MEG. Would it be possible for you to compare find_bad_channels_maxwell to your algorithm when applied to MEG data on some datasets (e.g., subjects from https://openneuro.org/datasets/ds000117, and maybe another MEG dataset)? FYI, you'll probably have to process magnetometers (meg='mag') and gradiometers (meg='grad') separately, since they generally have very different scales (and units).

Hi @larsoner, thanks for your comment. I would actually be interested to see how LOF performs on MEG datasets and to compare the results with the Maxwell function. But as I stated earlier in this thread, I couldn't find open-source MEG datasets with annotated bad channels. If you know of any, kindly suggest them.

Nevertheless, I took the dataset you shared here and ran the LOF scripts just for subject 01. The results are stored in the CSV files here. You will also find the LOF outlier scores for each detected bad channel, to help interpret the results. I set the threshold to 1.5 (empirically).

Surprisingly, none of these channels were flagged by the Maxwell method. If it helps, I filtered the data between 1 and 40 Hz.

Edit (4/5/23):

I added a couple of PSD plots from before and after LOF preprocessing. To me, it seems LOF removes channels that contain high-frequency (EMG) and low-frequency (EOG) noise. These files correspond to run-01 and run-04, respectively.

sub-01_ses-meg_task-facerecognition_run-01_meg.pdf

sub-01_ses-meg_task-facerecognition_run-04_meg.pdf

Cheers,
Velu

@larsoner (Member) commented May 8, 2023

Having thought about the problem and the results a bit, I'm +1 for including this. I think:

  1. The maintenance overhead will be low.
  2. Mostly we have to think about how to handle multichannel data and the details of the actual API.
  3. The multichannel use case (EEG, MEG-mag, MEG-grad), for example in the sample dataset, should take care of itself in normal use, since neighbors should be found within a particular channel type automatically. So we might not even need scalings (though it might make sense to add it).
  4. We probably do want a picks parameter, though, which should default to the good data channels. And we should just return a list of (additional) bad channels rather than modifying raw itself.

@drammock WDYT?

@drammock (Member) commented May 8, 2023

I'm still in favor of including this in MNE. Other thoughts:

  • As for return values, I'd say a list of bads and a scores dict (optional, if return_scores=True), just like find_bad_channels_maxwell. Agree on not modifying the Raw object. This means:
    • you probably don't need to make a copy of Raw anymore
    • the function name should be find_bad_channels_lof (instead of mark_...)
  • The new file should not be called mne/preprocessing/detect_bad_channels.py; I would suggest mne/preprocessing/_lof.py
  • There are several unaddressed comments from my review last October (re: docstring formatting/content)
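Taken together with the earlier API points, the proposal amounts to a call shape roughly like the following hypothetical stub (the real implementation, and its exact signature, were still to come at this point in the thread):

```python
def find_bad_channels_lof(raw, n_neighbors=20, *, picks=None,
                          threshold=1.5, return_scores=False):
    """Return additional bad channels without modifying `raw`.

    Stub illustrating the proposed interface: `picks` would default to
    the good data channels, and detection would run on the picked data.
    """
    bads, scores = [], {}  # a real implementation would fill these
    return (bads, scores) if return_scores else bads
```

The key design choices from the discussion are visible in the shape alone: the function returns bads (and optionally scores) rather than mutating the Raw object, mirroring find_bad_channels_maxwell.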

@vpKumaravel Note that we've recently started using black for code formatting and ruff for code linting (instead of flake8). Both of these are set up to run on pre-commit hooks too, so you may need to update your environment to get all that working. See this recently changed section of our contributor guide.

@vpKumaravel (Contributor, Author)

@drammock, could you please guide me on how to fix the build error "ModuleNotFoundError: No module named 'sklearn'"?



    import numpy as np
    from sklearn.neighbors import LocalOutlierFactor

Member

@vpKumaravel you need to import this inside the function; sklearn is only an optional dependency of mne.

@larsoner larsoner added this to the 1.5 milestone May 22, 2023
@larsoner larsoner modified the milestones: 1.6, 1.7 Nov 7, 2023
@drammock drammock removed the needs-discussion issues requiring a dev meeting discussion before the way forward is clear label Jan 19, 2024
@larsoner (Member)

Hey @vpKumaravel do you have time to come back to this? It would be nice to get this in!

@larsoner (Member)

@vpKumaravel there was a bit of cruft following the rebase, I pushed a commit to fix that and clean up / streamline some stuff. Along the way I added a picks option to the function. Can you see if the diff looks reasonable to you now?

@vpKumaravel (Contributor, Author)

> @vpKumaravel there was a bit of cruft following the rebase, I pushed a commit to fix that and clean up / streamline some stuff. Along the way I added a picks option to the function. Can you see if the diff looks reasonable to you now?

Thank you very much @larsoner. Yes, it all looks good to me. And thanks for helping me clear the PR checks! Hope the contribution is useful.

@larsoner (Member) commented Mar 1, 2024

I think all of @drammock's concerns have been taken care of, so I'll merge main into this branch just to make sure everything is still okay then mark for merge-when-green -- thanks in advance @vpKumaravel !

@larsoner larsoner merged commit ff1cfdd into mne-tools:main Mar 1, 2024
welcome bot commented Mar 1, 2024

🎉 Congrats on merging your first pull request! 🥳 Looking forward to seeing more from you in the future! 💪

snwnde pushed a commit to snwnde/mne-python that referenced this pull request Mar 20, 2024
…11234)

Co-authored-by: Velu Prabhakar Kumaravel <veluprabhakarkumaravel@Velus-MBP.lan>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Eric Larson <larson.eric.d@gmail.com>
Co-authored-by: Daniel McCloy <dan@mccloy.info>
@larsoner larsoner mentioned this pull request Jul 24, 2025