- [2026-02-21] Our paper VideoFusion: A Spatio-Temporal Collaborative Network for Multi-modal Video Fusion has been officially accepted by the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)! [Paper] [Code]
- [2025-09-18] Our paper ControlFusion: A Controllable Image Fusion Framework with Language-Vision Degradation Prompts has been officially accepted by Advances in Neural Information Processing Systems (NeurIPS 2025)! [Paper] [Code]
- [2025-09-10] Our paper Mask-DiFuser: A Masked Diffusion Model for Unified Unsupervised Image Fusion has been officially accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI)! [Paper] [Code]
- [2025-03-15] Our paper C2RF: Bridging Multi-modal Image Registration and Fusion via Commonality Mining and Contrastive Learning has been officially accepted by the International Journal of Computer Vision (IJCV)! [Paper] [Code]
- [2025-02-11] We released M2VD: Multi-modal Multi-scene Video Dataset, a large-scale dataset for infrared and visible video fusion.
- [2024-11-28] SeAFusion won the Information Fusion Best Paper Award 2024!
This is the official PyTorch implementation of "Image fusion in the loop of high-level vision tasks: A semantic-aware real-time infrared and visible image fusion network".
Welcome to follow our follow-up work to SeAFusion: Rethinking the necessity of image fusion in high-level vision tasks: A practical infrared and visible image fusion network based on progressive semantic injection and scene fidelity [Paper], [Code].
The overall framework of the proposed semantic-aware infrared and visible image fusion algorithm.
The architecture of the real-time infrared and visible image fusion network based on gradient residual dense block.
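As a rough illustration of the gradient residual dense block idea, the sketch below combines a small dense convolution path with a fixed Sobel gradient shortcut. The layer names, channel widths, and the 1x1 projection are our assumptions for illustration only, not the exact layers of the released network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Sobelxy(nn.Module):
    """Fixed-kernel depthwise Sobel operator used as the gradient branch (illustrative)."""
    def __init__(self, channels):
        super().__init__()
        kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
        # frozen Sobel kernels, one per channel (depthwise convolution)
        self.register_buffer('weight_x', kx.expand(channels, 1, 3, 3).clone())
        self.register_buffer('weight_y', kx.t().expand(channels, 1, 3, 3).clone())
        self.channels = channels

    def forward(self, x):
        gx = F.conv2d(x, self.weight_x, padding=1, groups=self.channels)
        gy = F.conv2d(x, self.weight_y, padding=1, groups=self.channels)
        return gx.abs() + gy.abs()

class GradientResidualDenseBlock(nn.Module):
    """Sketch of a gradient residual dense block: a densely connected conv path
    plus a Sobel gradient shortcut added back onto the output."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(2 * channels, channels, 3, padding=1)
        self.sobel = Sobelxy(channels)
        self.grad_proj = nn.Conv2d(channels, channels, 1)  # 1x1 projection (assumed)

    def forward(self, x):
        d1 = F.leaky_relu(self.conv1(x))
        d2 = F.leaky_relu(self.conv2(torch.cat([x, d1], dim=1)))  # dense reuse of x
        g = self.grad_proj(self.sobel(x))                         # gradient residual
        return F.leaky_relu(d2 + g)

x = torch.randn(1, 16, 64, 64)
y = GradientResidualDenseBlock(16)(x)
print(y.shape)  # torch.Size([1, 16, 64, 64])
```

The gradient shortcut keeps fine edge information flowing to the output even when the dense path smooths it away, which is what makes the block suitable for texture-preserving fusion.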
Run **CUDA_VISIBLE_DEVICES=0 python train.py** to train your model.
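The training couples fusion quality with segmentation performance. A much-simplified sketch of such a semantic-aware objective is below; the element-wise-max intensity target, the finite-difference gradient term, and the weight `lam` are illustrative assumptions, not the exact losses used in this repository.

```python
import torch
import torch.nn.functional as F

def dxdy(x):
    # forward finite differences as a cheap stand-in for a gradient operator
    return x[..., :, 1:] - x[..., :, :-1], x[..., 1:, :] - x[..., :-1, :]

def content_loss(fused, ir, vis):
    # intensity term: stay close to the element-wise maximum of the sources
    pix = F.l1_loss(fused, torch.max(ir, vis))
    fx, fy = dxdy(fused)
    ix, iy = dxdy(ir)
    vx, vy = dxdy(vis)
    # gradient term: preserve the stronger edge from either source
    grad = F.l1_loss(fx.abs(), torch.max(ix.abs(), vx.abs())) \
         + F.l1_loss(fy.abs(), torch.max(iy.abs(), vy.abs()))
    return pix + grad

def train_step(fusion_net, seg_net, ir, vis, label, lam=0.1):
    # fuse, segment the fused result, and combine both losses so that
    # semantic errors also back-propagate into the fusion network
    fused = fusion_net(ir, vis)
    semantic = F.cross_entropy(seg_net(fused), label)
    return content_loss(fused, ir, vis) + lam * semantic
```

The key design point is that the segmentation loss is computed on the fused image, so the fusion network is pushed toward outputs that are not only visually faithful but also easy for the downstream task.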
The training data are selected from the MFNet dataset. For convenience, the training set can be downloaded from here (extraction code: bvfl).
The MFNet dataset can be downloaded via the following link: https://drive.google.com/drive/folders/18BQFWRfhXzSuMloUmtiBRFrr6NSrf8Fw.
The MFNet project address is: https://www.mi.t.u-tokyo.ac.jp/static/projects/mil_multispectral/.
Run **CUDA_VISIBLE_DEVICES=0 python test.py** to test the model.
For quantitative assessment, please follow the instructions to modify and run ./Evaluation/test_evaluation.m.
- torch 1.7.1
- torchvision 0.8.2
- numpy 1.19.2
- pillow 8.0.1
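Assuming a pip-based environment, the pinned versions above can be installed in one step (package names are the standard PyPI ones; torch 1.7.1 wheels may require an older Python version and a matching CUDA build):

```shell
pip install torch==1.7.1 torchvision==0.8.2 numpy==1.19.2 pillow==8.0.1
```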
Qualitative comparison of SeAFusion with 9 state-of-the-art methods on the 00633D image from the MFNet dataset.
Segmentation results for infrared, visible, and fused images from the MFNet dataset. The segmentation models are re-trained on the infrared, visible, and fused image sets. Every two rows correspond to one scene.
Segmentation results for infrared, visible, and fused images from the MFNet dataset. The segmentation model is DeepLabv3+, pre-trained on the Cityscapes dataset. Every two rows correspond to one scene.
Object detection results for infrared, visible, and fused images from the MFNet dataset. A YOLOv5 detector, pre-trained on the COCO dataset, is deployed to perform object detection.
@article{Tang2024Mask-DiFuser,
author={Tang, Linfeng and Li, Chunyu and Ma, Jiayi},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
title={Mask-DiFuser: A Masked Diffusion Model for Unified Unsupervised Image Fusion},
year={2025},
volume={},
number={},
pages={1--18},
}
@article{Tang2024C2RF,
title={C2RF: Bridging Multi-modal Image Registration and Fusion via Commonality Mining and Contrastive Learning},
author={Tang, Linfeng and Yan, Qinglong and Xiang, Xinyu and Fang, Leyuan and Ma, Jiayi},
journal={International Journal of Computer Vision},
pages={5262--5280},
volume={133},
year={2025},
}
@article{TANG202228SeAFusion,
title = {Image fusion in the loop of high-level vision tasks: A semantic-aware real-time infrared and visible image fusion network},
author = {Tang, Linfeng and Yuan, Jiteng and Ma, Jiayi},
journal = {Information Fusion},
volume = {82},
pages = {28--42},
year = {2022},
issn = {1566-2535},
}