- [2026-02-21] Our paper VideoFusion: A Spatio-Temporal Collaborative Network for Multi-modal Video Fusion has been officially accepted by the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)! [Paper] [Code]
- [2025-09-18] Our paper ControlFusion: A Controllable Image Fusion Framework with Language-Vision Degradation Prompts has been officially accepted by Advances in Neural Information Processing Systems (NeurIPS 2025)! [Paper] [Code]
- [2025-09-10] Our paper Mask-DiFuser: A Masked Diffusion Model for Unified Unsupervised Image Fusion has been officially accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI)! [Paper] [Code]
- [2025-03-15] Our paper C2RF: Bridging Multi-modal Image Registration and Fusion via Commonality Mining and Contrastive Learning has been officially accepted by the International Journal of Computer Vision (IJCV)! [Paper] [Code]
- [2025-02-11] We released M2VD: Multi-modal Multi-scene Video Dataset, a large-scale dataset for infrared and visible video fusion.
- [2024-11-28] SeAFusion won the Information Fusion Best Paper Award 2024!
This is the official PyTorch implementation of "Image fusion in the loop of high-level vision tasks: A semantic-aware real-time infrared and visible image fusion network".
Welcome to follow our follow-up work to SeAFusion: Rethinking the necessity of image fusion in high-level vision tasks: A practical infrared and visible image fusion network based on progressive semantic injection and scene fidelity [Paper], [Code].
The overall framework of the proposed semantic-aware infrared and visible image fusion algorithm.
The architecture of the real-time infrared and visible image fusion network based on gradient residual dense block.
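As a rough illustration of the gradient residual dense block idea, the sketch below combines a small dense convolution path with a fixed Sobel gradient shortcut. The layer names, channel widths, and the 1x1 projection are our assumptions for illustration only, not the exact layers of the released network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Sobelxy(nn.Module):
    """Fixed-kernel depthwise Sobel operator used as the gradient branch (illustrative)."""
    def __init__(self, channels):
        super().__init__()
        kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
        # frozen Sobel kernels, one per channel (depthwise convolution)
        self.register_buffer('weight_x', kx.expand(channels, 1, 3, 3).clone())
        self.register_buffer('weight_y', kx.t().expand(channels, 1, 3, 3).clone())
        self.channels = channels

    def forward(self, x):
        gx = F.conv2d(x, self.weight_x, padding=1, groups=self.channels)
        gy = F.conv2d(x, self.weight_y, padding=1, groups=self.channels)
        return gx.abs() + gy.abs()

class GradientResidualDenseBlock(nn.Module):
    """Sketch of a gradient residual dense block: a densely connected conv path
    plus a Sobel gradient shortcut added back onto the output."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(2 * channels, channels, 3, padding=1)
        self.sobel = Sobelxy(channels)
        self.grad_proj = nn.Conv2d(channels, channels, 1)  # 1x1 projection (assumed)

    def forward(self, x):
        d1 = F.leaky_relu(self.conv1(x))
        d2 = F.leaky_relu(self.conv2(torch.cat([x, d1], dim=1)))  # dense reuse of x
        g = self.grad_proj(self.sobel(x))                         # gradient residual
        return F.leaky_relu(d2 + g)

x = torch.randn(1, 16, 64, 64)
y = GradientResidualDenseBlock(16)(x)
print(y.shape)  # torch.Size([1, 16, 64, 64])
```

The gradient shortcut keeps fine edge information flowing to the output even when the dense path smooths it away, which is what makes the block suitable for texture-preserving fusion.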
Run **CUDA_VISIBLE_DEVICES=0 python train.py** to train your model.
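The training couples fusion quality with segmentation performance. A much-simplified sketch of such a semantic-aware objective is below; the element-wise-max intensity target, the finite-difference gradient term, and the weight `lam` are illustrative assumptions, not the exact losses used in this repository.

```python
import torch
import torch.nn.functional as F

def dxdy(x):
    # forward finite differences as a cheap stand-in for a gradient operator
    return x[..., :, 1:] - x[..., :, :-1], x[..., 1:, :] - x[..., :-1, :]

def content_loss(fused, ir, vis):
    # intensity term: stay close to the element-wise maximum of the sources
    pix = F.l1_loss(fused, torch.max(ir, vis))
    fx, fy = dxdy(fused)
    ix, iy = dxdy(ir)
    vx, vy = dxdy(vis)
    # gradient term: preserve the stronger edge from either source
    grad = F.l1_loss(fx.abs(), torch.max(ix.abs(), vx.abs())) \
         + F.l1_loss(fy.abs(), torch.max(iy.abs(), vy.abs()))
    return pix + grad

def train_step(fusion_net, seg_net, ir, vis, label, lam=0.1):
    # fuse, segment the fused result, and combine both losses so that
    # semantic errors also back-propagate into the fusion network
    fused = fusion_net(ir, vis)
    semantic = F.cross_entropy(seg_net(fused), label)
    return content_loss(fused, ir, vis) + lam * semantic
```

The key design point is that the segmentation loss is computed on the fused image, so the fusion network is pushed toward outputs that are not only visually faithful but also easy for the downstream task.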
The training data are selected from the MFNet dataset. For convenience, the training set can be downloaded from here (extraction code: bvfl).
The MFNet dataset can be downloaded via the following link: https://drive.google.com/drive/folders/18BQFWRfhXzSuMloUmtiBRFrr6NSrf8Fw.
The MFNet project address is: https://www.mi.t.u-tokyo.ac.jp/static/projects/mil_multispectral/.
Run **CUDA_VISIBLE_DEVICES=0 python test.py** to test the model.
For quantitative assessment, please follow the instructions to modify and run ./Evaluation/test_evaluation.m.
- torch 1.7.1
- torchvision 0.8.2
- numpy 1.19.2
- pillow 8.0.1
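Assuming a pip-based environment, the pinned versions above can be installed in one step (package names are the standard PyPI ones; torch 1.7.1 wheels may require an older Python version and a matching CUDA build):

```shell
pip install torch==1.7.1 torchvision==0.8.2 numpy==1.19.2 pillow==8.0.1
```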
Qualitative comparison of SeAFusion with 9 state-of-the-art methods on the 00633D image from the MFNet dataset.
Segmentation results for infrared, visible, and fused images from the MFNet dataset. The segmentation models are re-trained on the infrared, visible, and fused image sets. Every two rows correspond to one scene.
Segmentation results for infrared, visible, and fused images from the MFNet dataset. The segmentation model is DeepLabv3+, pre-trained on the Cityscapes dataset. Every two rows correspond to one scene.
Object detection results for infrared, visible, and fused images from the MFNet dataset. A YOLOv5 detector, pre-trained on the COCO dataset, is deployed to perform object detection.
@article{Tang2024Mask-DiFuser,
author={Tang, Linfeng and Li, Chunyu and Ma, Jiayi},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
title={Mask-DiFuser: A Masked Diffusion Model for Unified Unsupervised Image Fusion},
year={2025},
volume={},
number={},
pages={1--18},
}
@article{Tang2024C2RF,
title={C2RF: Bridging Multi-modal Image Registration and Fusion via Commonality Mining and Contrastive Learning},
author={Tang, Linfeng and Yan, Qinglong and Xiang, Xinyu and Fang, Leyuan and Ma, Jiayi},
journal={International Journal of Computer Vision},
pages={5262--5280},
volume={133},
year={2025},
}
@article{TANG202228SeAFusion,
title = {Image fusion in the loop of high-level vision tasks: A semantic-aware real-time infrared and visible image fusion network},
author = {Tang, Linfeng and Yuan, Jiteng and Ma, Jiayi},
journal = {Information Fusion},
volume = {82},
pages = {28--42},
year = {2022},
issn = {1566-2535},
}