Permutation cluster test with TFCE: improvement of speed and memory usage in 2D#12609
Permutation cluster test with TFCE: improvement of speed and memory usage in 2D#12609larsoner merged 9 commits intomne-tools:mainfrom
Conversation
|
Hello! 👋 Thanks for opening your first pull request here! ❤️ We will try to get back to you soon. 🚴 |
mne/stats/cluster_level.py
Outdated
| clusters_out : bool | ||
| If True, clusters are returned, otherwise None is returned instead |
There was a problem hiding this comment.
Do we need this param at all? Could we instead just effectively have clusters_out = not tfce?
There was a problem hiding this comment.
Good observation. Indeed the clusters_out point is to not make the long list of arrays at each call of _find_cluster during permutation computation.
Since 'clusters' with TFCE is simply a list of one cluster per point, we could just decide to make this construction (for TFCE only) outside of the _find_cluster function. This would still need to depend on the dimension of input (1d or 2d) and the type of output wanted 'indices' or 'mask'.
I do not know what is the simplest (or more compliant with MNE practices) between the current additional parameter and moving the TFCE specific lines (lines 494-505) in _permutation_cluster_test (somewhere around lines1021 to 1033). But yes I would agree with the moving.
Any advice ?
There was a problem hiding this comment.
+1 for just doing the construction outside of the _find_cluster function wherever it's needed. As long as the public API output stays the same (and hopefully this is checked in a test already, if not then please add it!) then we should be okay to refactor however is cleanest
There was a problem hiding this comment.
Done, I moved the "clusters" construction for TFCE out of _find_clusters.
Regarding the API, there is already a test test_output_equiv to which I added the threshold parameter to be tested.
However, I also noticed the False value for adjacency was not tested and indeed there are problems here:
adjacency=Falseruns only for 1D inputs (which I think is not expected)- for 1D input, the output is always
"indices"whatever isout_type
I guess it is rarely used... Should this be corrected in the same or another PR?
There was a problem hiding this comment.
I guess it is rarely used... Should this be corrected in the same or another PR?
Let's do it in another PR, I created #12613 so we don't forget
|
And merging |
OK but with my new commits I need to choose a strategy (novice with this kind of stuff in git) : What is the correct one? => I chose to merge (since it was was done to integrate the last commits of main in the current branch) |
|
Pushed a tiny commit and merged |
|
CI failure is due to @cbrnr do you know what that hook does / whether we should allow it to run in our CIs? |
|
No, @hofaflo? |
|
Hmm, strange! As far as I am aware, So no idea why this error is coming up now, sorry! 🤔 Edit: Found the relevant issue: git-lfs/git-lfs#5749 |
|
Weird, working on a workaround in #12615 |
|
🎉 Congrats on merging your first pull request! 🥳 Looking forward to seeing more from you in the future! 💪 |
Add benchmark scripts and feasibility documentation for GPU-accelerating the spatio-temporal cluster-based permutation test, which is the mne-tools#1 computational bottleneck for MNE researchers doing source-space analyses. The connected-component labeling step in _get_components() consumes ~97% of permutation test runtime. This adds: - gpu_accel/benchmark_cluster_cpu.py: CPU baseline benchmark using the MNE sample dataset (fsaverage ico-5, ~20K vertices) - gpu_accel/patch_cupy_poc.py: CuPy proof-of-concept that monkey-patches _get_components with GPU connected_components (NVIDIA) - gpu_accel/FEASIBILITY.md: Full analysis of GPU CCL algorithms, hardware requirements, and a three-phase plan (CuPy PoC → wgpu+Rust → fused pipeline) - CLAUDE.md: Development guide with uv setup instructions and GPU work context Related: mne-tools#5439, mne-tools#12609, mne-tools#7784, mne-tools#8095, mne-tools#13175 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reference issue
No ref, only a post in the forum
What does this implement/fix?
In mne.stats.cluster_level.py :
The use of TFCE with large 2D data implied a huge amount of memory because of the creation of as many boolean arrays of the same size of the data as the number of data points (with TFCE, each point is considered as a cluster - consisting of a single point - and need an array to describe it). For this case, I replace the large boolean array by a single index, and removed the clusters output when it was not necessary.
Additional information
It is my first ever PR, all comments are welcomed