DWI Preprocessing Pipeline

A Python pipeline for processing diffusion-weighted MRI data using FSL tools. Takes a directory containing two DWI NIfTI files acquired with opposite phase-encoding directions (AP/PA) and runs a full preprocessing and analysis workflow.

Installation

The pipeline requires a standard FSL installation along with a compatible graphics card and the corresponding accelerated tools, specifically mmorf, probtrackx2_gpu, bedpostx_gpu, and eddy_cuda. While NVIDIA users typically receive these via the standard FSL installer, macOS users must install the specific Metal-accelerated package to utilize eddy_metal on Apple Silicon. Python dependencies can be installed using pip install nibabel numpy scipy.

pip install -r requirements.txt

Usage

python3 dwipreproc.py <input_dir> [output_dir] [options]

If output_dir is omitted, defaults to <input_dir>_YYYYMMDD.

Examples

A minimal example is the quick demo which checks that everything is installed and running. This is a toy dataset which is not realistic of real world data.

python dwipreproc.py ./quick

The dwi demo is more typical of real world data.

python dwipreproc.py ./dwi

By default, the pipeline allocates N-1 logical processors (where N is the total system count) to balance high performance with system responsiveness.

For clustered computer nodes or shared environments, you may wish to explicitly limit this value to avoid resource contention. While additional cores follow Amdahl's Law—yielding diminishing returns as counts increase—providing a moderate number of threads is essential for maintaining high GPU utilization. You can explicitly set the thread count using the -nthr flag.

However, Eddy requires special consideration. The original FSL Eddy implementation for GPU does not support the --nthr option and is restricted to a single thread. In contrast, the optimized version allows for multi-threading, providing dramatic performance benefits that scale effectively up to the number of shells in your diffusion dataset. With the optimized version, if a thread count is not explicitly provided, the software will automatically determine an allocation based on available CPU threads and system RAM.

Due to these differences in Eddy versions, you should choose your command based on which implementation is available on your system.

For the original Eddy implementation, use:

python dwipreproc.py -nthr 4 ./dwi

For the optimized Eddy implementation, use:

python dwipreproc.py -nthr 5 -nthrEddy 4 ./quick

Performance

Several of these tests use optimized versions of mmorf, bedpost, probtrackx and eddy, for these we refer to the build date of these unreleased versions.

The dwi dataset provides a benchmark for processing a 140×140×81 diffusion dataset with 204 volumes.

python dwipreproc.py --keep-temp ./dwi ./my_benchmark_results

Here is the performance for version 20260311 on an 48Gb Apple M4 Pro:

Stage	Seconds
Topup	327
Eddy	509
MMORF non-linear registration	79
Bedpost	109
Probtrackx	111
TOTAL	1192

Here is the same test on a 128 Gb DGX Spark using the executables from FSL 6.0.7.21.

Stage	Seconds
Topup	331
Eddy	2901
MMORF non-linear registration	147
Bedpost	596
Probtrackx	300
TOTAL	4321

Here is the same test on the same 128 Gb DGX Spark using release 20260311.

Stage	Seconds
Topup	336
Eddy	905
MMORF non-linear registration	128
Bedpost	114
Probtrackx	97
TOTAL	1622

Here is the same test on a 128 Gb AMD 7995wx (96 cores, 192 threads) with RTX4090 GPU using the executables from FSL 6.0.7.21.

Stage	Seconds
Topup	441
Eddy	1102
MMORF non-linear registration	130
Bedpost	267
Probtrackx	162
TOTAL	2159

Here is the same test on the same 128 Gb 7995wx using release 20260311.

Stage	Seconds
Topup	448
Eddy	426
MMORF non-linear registration	116
Bedpost	64
Probtrackx	37
TOTAL	1135

Options

Flag	Description
`--force`	Re-run all steps even if outputs already exist
`--no-bedpost`	Skip bedpostx and downstream tractography steps
`--no-pad-odd`	Disable padding of odd spatial dimensions
`--keep-temp`	Keep temporary files after pipeline completes
`-nthr N`	Number of threads (default: CPU count - 1)
`--baseName NAME`	Base name for output files (default: `dwi`)
`--b0tolerance VAL`	b-value threshold for identifying b0 volumes (default: 40)

Pipeline Stages

Step 1: Input Validation

Scans the input directory for exactly two NIfTI files that each have companion .json, .bval, and .bvec sidecar files. Validates that TotalReadoutTime and PhaseEncodingDirection are present in the JSON sidecars. Auto-detects which acquisition has the negative phase-encoding direction (assigned as dtiR) and which has the positive direction (assigned as dti). Copies and stages all files into <output_dir>/tmp/, converting uncompressed .nii to .nii.gz if needed.

Step 2: Topup + Eddy

Corrects susceptibility-induced and eddy-current distortions:

Odd dimension padding — If any spatial dimension is odd, zero-pads it by one voxel using numpy so that topup/eddy can operate correctly. Disable with --no-pad-odd.
Extract b0 volumes — Loads both acquisitions once with nibabel, extracts b0 volumes (b-value < b0tolerance) by array indexing, and concatenates them into a single 4D file.
Acquisition parameters — Builds acq_param.txt from the phase-encoding directions and total readout time in the JSON sidecars.
Topup — Estimates the susceptibility-induced off-resonance field from the opposing-PE b0 volumes. Selects a configuration file (b02b0_4.cnf, b02b0.cnf, or b02b0_1.cnf) based on whether dimensions are divisible by 4, 2, or neither.
Brain extraction — Averages the topup-corrected b0 volumes and runs bet to create a brain mask.
Merge volumes — Concatenates the two full DWI datasets. B-values are rounded to the nearest 100 for eddy compatibility. B-vectors are concatenated and an index file is written mapping each volume to its acquisition parameters line.
Eddy — Runs eddy (preferring eddy_cuda > eddy_openmp > eddy) with --repol (outlier replacement) and --data_is_shelled to correct eddy currents, subject motion, and remaining susceptibility distortion.

Step 3: DTI Fit

Runs FSL dtifit on the eddy-corrected data using the rotated b-vectors and brain mask to compute the diffusion tensor at each voxel. Produces standard DTI maps including FA (fractional anisotropy), MD, and eigenvector images. Also creates a thresholded/eroded FA binary mask (_FA_thr.nii.gz) at FA > 0.15, used later as a brain mask for bedpostx.

Step 4: Rim Clean

Removes artifactually bright voxels at the brain boundary of the FA map. Uses a distance transform to identify the outermost shell of non-zero voxels (distance = 1) and their immediate interior neighbours (distance = 2). Any boundary voxel whose FA exceeds the median FA of its interior neighbours is zeroed out. Outputs a cleaned FA image (r<baseName>_FA.nii.gz).

Step 5: MMORF Atlas Registration

Registers the FMRIB58 FA template to the subject's native-space cleaned FA map using a two-stage approach:

FLIRT — 12-DOF affine registration of the template to the subject FA.
MMORF — Non-linear (Metal-accelerated) registration using a multi-resolution warp field, refining from 32mm down to high resolution.
Apply warps — Warps the FMRIB58 template (spline interpolation) and the HarvardOxford cortical atlas (nearest-neighbour interpolation) into subject space.

Step 6: Bedpostx

(Skipped with --no-bedpost)

Stages the eddy-corrected data, rotated b-vectors, b-values, and thresholded FA mask into <output_dir>/bedpost/. Auto-detects whether probtrackx2_gpu supports compressed NIfTI and sets FSLOUTPUTTYPE accordingly (defaults to NIFTI for uncompressed outputs to avoid gzip decompression overhead during probtrackx). Runs bedpostx (preferring GPU version) to estimate fibre orientation distributions at each voxel using a Bayesian multi-fibre model. Outputs are written to <output_dir>/bedpost.bedpostX/. This step can take many hours.

Step 7: Extract Masks

Splits the native-space HarvardOxford atlas into individual binary mask files — one per labelled region — saved as masks/<index>.nii in the output directory.

Step 8: Probtrackx

(Skipped with --no-bedpost)

Runs probabilistic tractography (probtrackx2_gpu) from each atlas region mask as a seed, using the bedpostx fibre orientations. Generates 5000 streamlines per seed voxel. If probtrackx2_gpu supports the --seedlist option, runs a single invocation with all seeds (loads bedpostx data once). Otherwise falls back to serial invocations via fsl_sub.

Step 9: Fiber Quantify

(Skipped with --no-bedpost)

Computes pairwise structural connectivity between all atlas regions from the probtrackx output. For each pair of regions (i, j), extracts the streamline density of region i's tractography within region j's mask and vice versa. Produces four symmetric connectivity matrices saved as both .npy and .tsv:

density — Normalised connection density (fiber count / total possible streamlines)
fiber_count — Raw bidirectional streamline counts
mean — Mean streamline probability across target mask voxels
max — Maximum streamline probability across target mask voxels

Dependencies

Python: nibabel, numpy, scipy
FSL: eddy (cuda/openmp), topup, flirt, fslmerge, fslmaths, bet, dtifit, applywarp, bedpostx (gpu/cpu), probtrackx2_gpu, fsl_sub
Other: mmorf
Environment: FSLDIR must be set for template/atlas lookup

Incremental Re-runs

All steps check for existing outputs before running (step_done()). To skip already-completed stages, simply re-run the pipeline on the same output directory. Use --force to re-run all steps regardless.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

DWI Preprocessing Pipeline

Installation

Usage

Examples

Performance

Options

Pipeline Stages

Step 1: Input Validation

Step 2: Topup + Eddy

Step 3: DTI Fit

Step 4: Rim Clean

Step 5: MMORF Atlas Registration

Step 6: Bedpostx

Step 7: Extract Masks

Step 8: Probtrackx

Step 9: Fiber Quantify

Dependencies

Incremental Re-runs

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
dwi		dwi
quick		quick
LICENSE		LICENSE
README.md		README.md
dwipreproc.py		dwipreproc.py
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

DWI Preprocessing Pipeline

Installation

Usage

Examples

Performance

Options

Pipeline Stages

Step 1: Input Validation

Step 2: Topup + Eddy

Step 3: DTI Fit

Step 4: Rim Clean

Step 5: MMORF Atlas Registration

Step 6: Bedpostx

Step 7: Extract Masks

Step 8: Probtrackx

Step 9: Fiber Quantify

Dependencies

Incremental Re-runs

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages