ControlFusion: A Controllable Image Fusion Framework with Language-Vision Degradation Prompts [NeurIPS 2025]

Linfeng Tang¹*, Yeda Wang¹*, Zhanchuan Cai², Junjun Jiang³, Jiayi Ma¹†

¹Wuhan University ²Macau University of Science and Technology ³Harbin Institute of Technology
*Equal Contribution †Corresponding Author

✨ News:

[2026-02-21] Our paper VideoFusion: A Spatio-Temporal Collaborative Network for Multi-modal Video Fusion has been officially accepted by The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)! [Paper] [Code]
[2025-09-18] Our paper ControlFusion: A Controllable Image Fusion Framework with Language-Vision Degradation Prompts has been officially accepted by Advances in Neural Information Processing Systems (NeurIPS 2025)! [Paper] [Code]
[2025-09-10] Our paper Mask-DiFuser: A Masked Diffusion Model for Unified Unsupervised Image Fusion has been officially accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI)! [Paper] [Code]
[2025-03-15] Our paper C2RF: Bridging Multi-modal Image Registration and Fusion via Commonality Mining and Contrastive Learning has been officially accepted by the International Journal of Computer Vision (IJCV)! [Paper] [Code]
[2025-02-11] We released a large-scale dataset for infrared and visible video fusion: M3SVD: Multi-Modal Multi-Scene Video Dataset.

🔎 Method Overview

Motivation

Framework

Frequency Domain Comparison

🔧 Environment Setup

Clone this repository:

git clone https://github.com/Linfeng-Tang/ControlFusion.git
cd ControlFusion

Create a Conda environment (recommended):

conda create -n controlfusion python=3.8 -y
conda activate controlfusion

Install dependency packages:
```
pip install -r requirements.txt
```

📂 Dataset Construction

Please refer to `genDateset` for how the dataset is constructed.

📂 Dataset Download

Google Drive

📥 Pre-trained Weights

Download the pre-trained Mask-DiFuser model from Baidu Drive and place the weights in `pretrained_weights/`.
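Before running inference, it can help to confirm that the weights actually landed in the expected folder. The sketch below is not part of the repository: `find_weight_files` and the list of file extensions are assumptions for illustration, since the checkpoint file name is not stated here.

```python
from pathlib import Path

# Hypothetical helper, not part of the ControlFusion code base.
# These extensions are common for PyTorch checkpoints; the actual file
# name of the released weights is not specified in this README.
CHECKPOINT_SUFFIXES = {".pth", ".pt", ".ckpt", ".bin", ".safetensors"}


def find_weight_files(weights_dir="pretrained_weights"):
    """Return checkpoint-like files found directly in weights_dir, sorted by path."""
    root = Path(weights_dir)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in CHECKPOINT_SUFFIXES
    )


if __name__ == "__main__":
    found = find_weight_files()
    if found:
        for path in found:
            print(f"found: {path} ({path.stat().st_size / 1e6:.1f} MB)")
    else:
        print("No checkpoint files found in pretrained_weights/; download them first.")
```

Running this from the repository root lists whatever checkpoint-like files are present, which is a quick way to catch a download that was saved to the wrong location.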

🧪 Inference

Use the provided `test.py` script to fuse pairs of images. Make sure the pre-trained weights are downloaded first. To choose between text control and automatic control, edit `ControlFusion.py` and keep the assignment to `text_features` that matches the mode you want:

text_features = self.get_text_feature(text.expand(b, -1)).to(inp_img_A.dtype)
text_features = imgfeature
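One way to avoid editing the source each time is to route the choice through a single flag. The sketch below is an assumption about how the two assignments could be wrapped, not the repository's implementation: `select_prompt_features`, the `mode` argument, and the stand-in callables are hypothetical, and it assumes the first assignment corresponds to text control and the second to automatic (image-derived) control.

```python
def select_prompt_features(mode, text_feature_fn, text, img_feature):
    """Pick the degradation prompt features used for fusion.

    mode="text": encode the user-supplied prompt with text_feature_fn
                 (stands in for self.get_text_feature in ControlFusion.py).
    mode="auto": reuse the feature derived from the input images
                 (stands in for imgfeature in ControlFusion.py).
    """
    if mode == "text":
        return text_feature_fn(text)
    if mode == "auto":
        return img_feature
    raise ValueError(f"mode must be 'text' or 'auto', got {mode!r}")


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without the model or PyTorch.
    def fake_text_encoder(prompt):
        return [float(len(prompt))]

    fake_img_feature = [0.5]

    print(select_prompt_features("text", fake_text_encoder, "low light", fake_img_feature))
    print(select_prompt_features("auto", fake_text_encoder, "low light", fake_img_feature))
```

Raising on an unknown mode, rather than silently falling back to one branch, makes a mistyped flag visible immediately.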

🚂 Train

Use the provided `train.py` script to train the model. Make sure your training dataset is organized correctly before starting.
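The expected dataset layout is not described here, so the following is only a sketch of one common sanity check for paired fusion data: verifying that every image in one modality has a partner with the same file name in the other. It assumes infrared/visible pairs; the directory names `ir` and `vi` and the helper `find_unpaired` are assumptions for illustration and may not match the layout `train.py` expects.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}


def list_image_names(folder):
    """Return the set of image file names found directly in folder."""
    root = Path(folder)
    if not root.is_dir():
        return set()
    return {
        p.name for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    }


def find_unpaired(ir_dir, vi_dir):
    """Return (names only in ir_dir, names only in vi_dir), each sorted."""
    ir_names = list_image_names(ir_dir)
    vi_names = list_image_names(vi_dir)
    return sorted(ir_names - vi_names), sorted(vi_names - ir_names)


if __name__ == "__main__":
    # Hypothetical layout: dataset/train/ir and dataset/train/vi.
    only_ir, only_vi = find_unpaired("dataset/train/ir", "dataset/train/vi")
    print(f"{len(only_ir)} infrared images without a visible partner")
    print(f"{len(only_vi)} visible images without an infrared partner")
```

If both counts are zero, every file has a same-named counterpart; otherwise the returned lists name the files to fix.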

📷 Results

Visualization of fusion results in different degraded scenarios

Generalization results in the real world

🕵️‍♂️ Detection

🎓 Citations

If our work is useful for your research, please consider citing it and giving us a star ⭐:

@inproceedings{Tang2025ControlFusion,
  author={Linfeng Tang and Yeda Wang and Zhanchuan Cai and Junjun Jiang and Jiayi Ma},
  title={ControlFusion: A Controllable Image Fusion Framework with Language-Vision Degradation Prompts},
  booktitle={Advances in Neural Information Processing Systems},
  year={2025}
}

🤝 Contact

Please feel free to contact us at linfeng0419@gmail.com or wangyeda@whu.edu.cn. We are happy to discuss the work and will maintain this repository as time permits.
