Skip to content

Permutation cluster test with TFCE: improvement of speed and memory usage in 2D#12609

Merged
larsoner merged 9 commits intomne-tools:mainfrom
nfourcau:tfce-optimization
May 17, 2024
Merged

Permutation cluster test with TFCE: improvement of speed and memory usage in 2D#12609
larsoner merged 9 commits intomne-tools:mainfrom
nfourcau:tfce-optimization

Conversation

@nfourcau
Copy link
Copy Markdown
Contributor

Reference issue

No ref, only a post in the forum

What does this implement/fix?

In mne.stats.cluster_level.py :
The use of TFCE with large 2D data implied a huge amount of memory because of the creation of as many boolean arrays of the same size of the data as the number of data points (with TFCE, each point is considered as a cluster - consisting of a single point - and need an array to describe it). For this case, I replace the large boolean array by a single index, and removed the clusters output when it was not necessary.

Additional information

It is my first ever PR, all comments are welcomed

@welcome
Copy link
Copy Markdown

welcome bot commented May 14, 2024

Hello! 👋 Thanks for opening your first pull request here! ❤️ We will try to get back to you soon. 🚴

@nfourcau nfourcau changed the title cluster permutation test with TFCE: improvement of speed and memory usage in 2D WIP: cluster permutation test with TFCE: improvement of speed and memory usage in 2D May 14, 2024
@nfourcau nfourcau changed the title WIP: cluster permutation test with TFCE: improvement of speed and memory usage in 2D Cluster permutation test with TFCE: improvement of speed and memory usage in 2D May 14, 2024
@nfourcau nfourcau changed the title Cluster permutation test with TFCE: improvement of speed and memory usage in 2D Permutation cluster test with TFCE: improvement of speed and memory usage in 2D May 14, 2024
@nfourcau nfourcau marked this pull request as ready for review May 14, 2024 13:43
Comment on lines +377 to +378
clusters_out : bool
If True, clusters are returned, otherwise None is returned instead
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this param at all? Could we instead just effectively have clusters_out = not tfce?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good observation. Indeed the clusters_out point is to not make the long list of arrays at each call of _find_cluster during permutation computation.
Since 'clusters' with TFCE is simply a list of one cluster per point, we could just decide to make this construction (for TFCE only) outside of the _find_cluster function. This would still need to depend on the dimension of input (1d or 2d) and the type of output wanted 'indices' or 'mask'.
I do not know what is the simplest (or more compliant with MNE practices) between the current additional parameter and moving the TFCE specific lines (lines 494-505) in _permutation_cluster_test (somewhere around lines1021 to 1033). But yes I would agree with the moving.
Any advice ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for just doing the construction outside of the _find_cluster function wherever it's needed. As long as the public API output stays the same (and hopefully this is checked in a test already, if not then please add it!) then we should be okay to refactor however is cleanest

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, I moved the "clusters" construction for TFCE out of _find_clusters.

Regarding the API, there is already a test test_output_equiv to which I added the threshold parameter to be tested.
However, I also noticed the False value for adjacency was not tested and indeed there are problems here:

  • adjacency=False runs only for 1D inputs (which I think is not expected)
  • for 1D input, the output is always "indices" whatever is out_type

I guess it is rarely used... Should this be corrected in the same or another PR?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it is rarely used... Should this be corrected in the same or another PR?

Let's do it in another PR, I created #12613 so we don't forget

@larsoner
Copy link
Copy Markdown
Member

And merging main into your branch should fix CIs so I'll do it now

@nfourcau
Copy link
Copy Markdown
Contributor Author

nfourcau commented May 16, 2024

And merging main into your branch should fix CIs so I'll do it now

OK but with my new commits I need to choose a strategy (novice with this kind of stuff in git) :
git config pull.rebase false # merge (the default strategy)
git config pull.rebase true # rebase
git config pull.ff only # fast-forward only

What is the correct one? => I chose to merge (since it was was done to integrate the last commits of main in the current branch)

@larsoner
Copy link
Copy Markdown
Member

Pushed a tiny commit and merged main into your branch, marking for merge-when-green. Thanks in advance @nfourcau !

@larsoner larsoner enabled auto-merge (squash) May 16, 2024 13:32
@drammock
Copy link
Copy Markdown
Member

CI failure is due to edfio adding a post-checkout hook:
https://dev.azure.com/mne-tools/mne-python/_build/results?buildId=30068&view=logs&j=b9064c46-2375-5b70-72c1-f55d0d61c63a&t=e34ff71f-29f1-5601-0139-1a3a772fec70&l=496

@cbrnr do you know what that hook does / whether we should allow it to run in our CIs?

@cbrnr
Copy link
Copy Markdown
Contributor

cbrnr commented May 16, 2024

No, @hofaflo?

@hofaflo
Copy link
Copy Markdown
Contributor

hofaflo commented May 16, 2024

Hmm, strange! As far as I am aware, edfio did not add any hooks – didn't even know that this was possible on a non-local repo level. Opening .git/hooks/post-checkout locally shows this (which makes sense, as we're using LFS to store test files, but that has always been the case):

#!/bin/sh
command -v git-lfs >/dev/null 2>&1 || { echo >&2 "\nThis repository is configured for Git LFS but 'git-lfs' was not found on your path. If you no longer wish to use Git LFS, remove this hook by deleting the 'post-commit' file in the hooks directory (set by 'core.hookspath'; usually '.git/hooks').\n"; exit 2; }
git lfs post-commit "$@"

So no idea why this error is coming up now, sorry! 🤔

Edit: Found the relevant issue: git-lfs/git-lfs#5749

@larsoner
Copy link
Copy Markdown
Member

Weird, working on a workaround in #12615

@larsoner larsoner added this to the 1.8 milestone May 17, 2024
@larsoner larsoner merged commit cf0e12d into mne-tools:main May 17, 2024
@welcome
Copy link
Copy Markdown

welcome bot commented May 17, 2024

🎉 Congrats on merging your first pull request! 🥳 Looking forward to seeing more from you in the future! 💪

sharifhsn added a commit to sharifhsn/mne-python that referenced this pull request Mar 8, 2026
Add benchmark scripts and feasibility documentation for GPU-accelerating
the spatio-temporal cluster-based permutation test, which is the mne-tools#1
computational bottleneck for MNE researchers doing source-space analyses.

The connected-component labeling step in _get_components() consumes ~97%
of permutation test runtime. This adds:

- gpu_accel/benchmark_cluster_cpu.py: CPU baseline benchmark using the
  MNE sample dataset (fsaverage ico-5, ~20K vertices)
- gpu_accel/patch_cupy_poc.py: CuPy proof-of-concept that monkey-patches
  _get_components with GPU connected_components (NVIDIA)
- gpu_accel/FEASIBILITY.md: Full analysis of GPU CCL algorithms, hardware
  requirements, and a three-phase plan (CuPy PoC → wgpu+Rust → fused pipeline)
- CLAUDE.md: Development guide with uv setup instructions and GPU work context

Related: mne-tools#5439, mne-tools#12609, mne-tools#7784, mne-tools#8095, mne-tools#13175

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants