MambaTM [CVPR 2025 Highlight🔥]

Learning Phase Distortion with Selective State Space Models for Video Turbulence Mitigation

Xingguang Zhang, Nicholas Chimitt, Xijun Wang, Yu Yuan, Stanley H. Chan

Project Page | Paper

📑 Contents

- 🔨 Environment Installation
- 🧩 Prepare Training Datasets
- 🛠️ Training
- 🚀 Performance Evaluation
- 🔗 Related Works
- 📘 Citation
- 🎫 License

🔨 Environment Installation

conda create -n MambaTM python=3.11
conda activate MambaTM
cd code
pip install -r requirements.txt
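
MambaTM builds on selective state space (Mamba-style) layers, whose fused kernels generally require a CUDA-enabled PyTorch build. As a quick sanity check after installation:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"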

🧩 Prepare Training Datasets

First, download and prepare the ATSyn dataset.

To train the ReBlurNet, please download the LSDIR dataset.
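
The exact directory layout is defined by the ATSyn release rather than by this README, but as a rough, assumed orientation, the training scripts below take paths like these:

ATSyn_dynamic/       # passed via --ATSyn_dynamic_path or --train_path/--val_path
  train_info.json    # passed via --train_info
  test_info.json     # passed via --val_info
ATSyn_static/        # passed via --ATSyn_static_path
LSDIR/               # passed via --LSDIR_path (ReBlurNet training only)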

🛠️ Training

Our model is trained in two stages: first the ReBlurNet, then MambaTM.

Train the ReBlurNet

You can directly use our pre-trained ReBlurNet at code/LPD_learning/model_zoo/NAF_decoder.pth. If you want to train it from scratch, run the following:

cd code/LPD_learning
python train_vae.py --iters ${number_of_iterations} -ps ${image_patch_size} -b ${batch_size} --LSDIR_path ${path_to_LSDIR_dataset} --ATSyn_dynamic_path ${path_to_ATSyn_dynamic_dataset} --ATSyn_static_path ${path_to_ATSyn_static_dataset} --exp_dir ${path_to_save_checkpoints}
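
For example, a hypothetical invocation with illustrative values (the paths are placeholders; tune the patch size and batch size to your GPU memory):

python train_vae.py --iters 200000 -ps 256 -b 8 --LSDIR_path /data/LSDIR --ATSyn_dynamic_path /data/ATSyn_dynamic --ATSyn_static_path /data/ATSyn_static --exp_dir ./experiments/reblurnet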

When the training is finished, move the best checkpoint into place:

mv ${the_best_checkpoint} code/LPD_learning/model_zoo/NAF_decoder.pth

Train MambaTM

To train on the dynamic scene data, run the following (the -f flag is optional and resumes training from a pre-trained checkpoint):

python train_MambaTM_dynamic.py --train_path ${your_training_data_path} --train_info ${the_associated_train_info.json} --val_path ${your_testing_data_path} --val_info ${the_associated_test_info.json} -f ${loaded_model_path}
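
For example, with illustrative paths:

python train_MambaTM_dynamic.py --train_path /data/ATSyn_dynamic/train --train_info /data/ATSyn_dynamic/train_info.json --val_path /data/ATSyn_dynamic/test --val_info /data/ATSyn_dynamic/test_info.json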

We also support DDP training on systems with the Slurm Workload Manager. If you are using multiple GPUs, you can run the following instead:

srun --ntasks=${number_of_GPUs} --gpus-per-task=1 python train_MambaTM_dynamic_DDP.py --train_path ${your_training_data_path} --train_info ${the_associated_train_info.json} --val_path ${your_testing_data_path} --val_info ${the_associated_test_info.json} -f ${loaded_model_path}
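
For example, to train on 4 GPUs (the --ntasks value must match the number of GPUs):

srun --ntasks=4 --gpus-per-task=1 python train_MambaTM_dynamic_DDP.py --train_path /data/ATSyn_dynamic/train --train_info /data/ATSyn_dynamic/train_info.json --val_path /data/ATSyn_dynamic/test --val_info /data/ATSyn_dynamic/test_info.json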

Other training arguments and hyperparameters are described in train_MambaTM_dynamic.py and train_MambaTM_dynamic_DDP.py; please refer to them for more flexible training. Using a smaller patch_size and num_frames in the early phase of training can accelerate the entire process, as sketched below.
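
A minimal sketch of such a progressive schedule, where the second phase resumes from the first phase's checkpoint (the patch-size and frame-count flag names below are placeholders; check the argparse definitions in train_MambaTM_dynamic.py for the actual names):

# phase 1: small patches and short clips for fast early convergence (flag names illustrative)
python train_MambaTM_dynamic.py --train_path ${your_training_data_path} --patch_size 128 --num_frames 8
# phase 2: resume at the full patch size and clip length
python train_MambaTM_dynamic.py --train_path ${your_training_data_path} --patch_size 256 --num_frames 16 -f ${phase1_checkpoint}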

Afterward, you can fine-tune on the static scene images to obtain the static scene model by running the following:

python train_MambaTM_static.py --train_path ${your_training_data_path} --val_path ${your_testing_data_path} -f ${pretrained_dynamic_scene_model_path} --start_over

Or:

srun --ntasks=${number_of_GPUs} --gpus-per-task=1 python train_MambaTM_static.py --train_path ${your_training_data_path} --val_path ${your_testing_data_path} -f ${pretrained_dynamic_scene_model_path} --start_over
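
For example, starting from the best dynamic scene checkpoint (paths illustrative):

python train_MambaTM_static.py --train_path /data/ATSyn_static/train --val_path /data/ATSyn_static/test -f ./experiments/dynamic/best.pth --start_over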

We inject a certain level of Gaussian noise during training in both modalities for better generalization to real-world data; the snippet below illustrates the idea.
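
A minimal sketch of this augmentation (not the repository's exact implementation; the noise level here is a hypothetical choice):

import torch

def add_gaussian_noise(frames: torch.Tensor, max_sigma: float = 0.02) -> torch.Tensor:
    # frames: (T, C, H, W) video clip with values in [0, 1]
    sigma = torch.rand(1).item() * max_sigma            # draw a random noise level per clip
    noisy = frames + sigma * torch.randn_like(frames)   # add zero-mean Gaussian noise
    return noisy.clamp(0.0, 1.0)                        # keep pixel values in range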

Our pre-trained models are provided in code/model_zoo.

🚀 Performance Evaluation

Dynamic scene model on the ATSyn_dynamic dataset:

python test_MambaTM_dynamic.py --data_path ${your_testing_data_path} --info_path ${the_associated_test_info.json} -result ${path_for_stored_output} -mp ${testing_model_path}
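
For example, with illustrative paths and checkpoint name:

python test_MambaTM_dynamic.py --data_path /data/ATSyn_dynamic/test --info_path /data/ATSyn_dynamic/test_info.json -result ./results/dynamic -mp ./code/model_zoo/MambaTM_dynamic.pth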

Static scene model on ATSyn_static dataset:

python test_MambaTM_static.py --val_path ${your_testing_data_path} -result ${path_for_stored_output} -f ${testing_model_path} 

For inference on the Turbulence Text dataset, we generate the central 4 frames for text recognition:

python inference_MambaTM_text.py -f ${testing_static_scene_model_path} --n_frames 60 --resize 360

🔗 Related Works

Restoration:

Zhang, Xingguang, Zhiyuan Mao, Nicholas Chimitt, and Stanley H. Chan. "Imaging Through the Atmosphere Using Turbulence Mitigation Transformer." IEEE Transactions on Computational Imaging (2024).

Jaiswal, Ajay*, Xingguang Zhang*, Stanley H. Chan, and Zhangyang Wang. "Physics-Driven Turbulence Image Restoration with Stochastic Refinement." Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 12170-12181.

Zhang, Xingguang, Nicholas Chimitt, Yiheng Chi, Zhiyuan Mao, and Stanley H. Chan. "Spatio-Temporal Turbulence Mitigation: A Translational Perspective." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 2889-2899.

Jin, Darui, Ying Chen, Yi Lu, Junzhang Chen, Peng Wang, Zichao Liu, Sheng Guo, and Xiangzhi Bai. "Neutralizing the Impact of Atmospheric Turbulence on Complex Scene Imaging via Deep Learning." Nature Machine Intelligence 3, no. 10 (2021): 876-884.

Datasets:

OTIS dataset | TSRWGAN data | Turbulence Text | Heat Chamber | TMT dataset | ATSyn dataset | BRIAR (Not public yet)

📘 Citation

If you find our work helpful, please consider citing it as follows:
@InProceedings{zhang2025MambaTM,
    author={Zhang, Xingguang and Chimitt, Nicholas and Wang, Xijun and Yuan, Yu and Chan, Stanley H.},
    title={Learning Phase Distortion with Selective State Space Models for Video Turbulence Mitigation},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month={June},
    year={2025}
}

🎫 License

This project is released under the MIT license. Please refer to the acknowledged repositories for their licenses.
