MRG, ENH: Use numba to speed up summarize_clusters_stc#8095

Merged
larsoner merged 1 commit into mne-tools:master from yh-luo:clu on Aug 6, 2020
Conversation

@yh-luo
Contributor

@yh-luo yh-luo commented Aug 5, 2020

Use Numba to speed up converting spatiotemporal cluster results into a SourceEstimate. This PR brings an ~35% speedup.

Testing

left_auditory_vs_visual_0_to_None.npz (~148MB)

  • Processed from the sample dataset, duplicated to create 7 subjects
  • Used an oct6 source space to compute source estimates, then morphed them to ico5 fsaverage
  • Sampling rate: 600 Hz
  • Time window: 0-500 ms after stimulus onset
import mne
import numpy as np

# processed from the sample dataset
clu_fname = 'left_auditory_vs_visual_0_to_None.npz'

tstep = 0.0016649601096532323  # time step between samples, in seconds

# only set allow_pickle=True when you are certain the file is trustworthy!
cluster_result = np.load(clu_fname, allow_pickle=True)

# rebuild the (t_obs, clusters, cluster_pv, H0) tuple that
# summarize_clusters_stc expects
clu = (cluster_result['t_obs'], cluster_result['clusters'],
       cluster_result['cluster_pv'], cluster_result['H0'])

# convert tstep from seconds to milliseconds for the summary time axis
stc_all_cluster_vis = mne.stats.summarize_clusters_stc(
    clu,
    tstep=tstep * 1000,
    subject='fsaverage')

Before

%%time
CPU times: user 1min 23s, sys: 7.17 s, total: 1min 31s
Wall time: 1min 32s

After

%%time
CPU times: user 32.8 s, sys: 27.4 s, total: 1min
Wall time: 1min

~35% speedup (wall time dropped from 1 min 32 s to 1 min, i.e. the new run takes ~65% of the original time)
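For anyone checking the arithmetic, the quoted figure follows directly from the wall times in the logs above (92 s before, 60 s after):

```python
before, after = 92.0, 60.0  # wall times in seconds, from the logs above

reduction = (before - after) / before  # fraction of runtime removed
remaining = after / before             # fraction of runtime kept

print(f"{reduction:.0%} faster, {remaining:.0%} of the original runtime")
# → 35% faster, 65% of the original runtime
```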

When the cluster results are large, the for-loop in mne.stats.summarize_clusters_stc becomes the bottleneck. I tried to compile the entire for-loop, but a data-typing problem currently makes that impossible. I hope this PR helps.

mne.sys_info

Platform:      Linux-5.6.18-200.fc31.x86_64-x86_64-with-fedora-31-Thirty_One
Python:        3.7.7 (default, Mar 13 2020, 10:23:39)  [GCC 9.2.1 20190827 (Red Hat 9.2.1-1)]
Executable:    /home/yuhan/env/mnedev/bin/python
CPU:           x86_64: 4 cores
Memory:        15.5 GB

mne:           0.21.dev0
numpy:         1.18.4 {blas=openblas, lapack=openblas}
scipy:         1.4.1
matplotlib:    3.2.1 {backend=Qt5Agg}

sklearn:       0.23.0
numba:         0.49.1
nibabel:       3.1.0
cupy:          Not found
pandas:        1.0.3
dipy:          1.1.1
mayavi:        4.7.1
pyvista:       0.24.2 {OpenGL 4.5.0 NVIDIA 440.82 via GeForce 940MX/PCIe/SSE2}
vtk:           8.1.2
PyQt5:         5.13.2

Member

@larsoner larsoner left a comment


Nice that a tiny and still readable change can make such a big difference!

Can you update latest.inc to mention this speedup?

@yh-luo
Contributor Author

yh-luo commented Aug 5, 2020

> Nice that a tiny and still readable change can make such a big difference!
>
> Can you update latest.inc to mention this speedup?

Sure, I'll do that tomorrow (it's nighttime now in my timezone 😉 )

@larsoner larsoner added this to the 0.21 milestone Aug 5, 2020
@yh-luo yh-luo changed the title ENH: Use numba to speed up summarize_clusters_stc MRG, ENH: Use numba to speed up summarize_clusters_stc Aug 6, 2020
@larsoner larsoner merged commit 2565891 into mne-tools:master Aug 6, 2020
@larsoner
Member

larsoner commented Aug 6, 2020

Thanks @yh-luo !

@yh-luo yh-luo deleted the clu branch August 11, 2020 01:22
sharifhsn added a commit to sharifhsn/mne-python that referenced this pull request Mar 8, 2026
Add benchmark scripts and feasibility documentation for GPU-accelerating
the spatio-temporal cluster-based permutation test, which is the #1
computational bottleneck for MNE researchers doing source-space analyses.

The connected-component labeling step in _get_components() consumes ~97%
of permutation test runtime. This adds:

- gpu_accel/benchmark_cluster_cpu.py: CPU baseline benchmark using the
  MNE sample dataset (fsaverage ico-5, ~20K vertices)
- gpu_accel/patch_cupy_poc.py: CuPy proof-of-concept that monkey-patches
  _get_components with GPU connected_components (NVIDIA)
- gpu_accel/FEASIBILITY.md: Full analysis of GPU CCL algorithms, hardware
  requirements, and a three-phase plan (CuPy PoC → wgpu+Rust → fused pipeline)
- CLAUDE.md: Development guide with uv setup instructions and GPU work context

Related: mne-tools#5439, mne-tools#12609, mne-tools#7784, mne-tools#8095, mne-tools#13175

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>