VoyageWang/SAM2LOVE


SAM2-LOVE: Segment Anything Model 2 in Language-aided Audio-Visual Scenes



¹Tsinghua University, ²ZJU

📖 Overview

  1. We propose SAM2-LOVE, a novel framework that is the first to leverage SAM2 for pixel-wise understanding in language-aided audio-visual scenes (LAVS), by designing a multimodal fusion module.

  2. We develop token propagation and accumulation strategies that improve the spatio-temporal comprehension of the promptable token.

  3. Extensive experiments on the Ref-AVS dataset demonstrate the superiority of our method, with ablation studies highlighting the simplicity and effectiveness of its modules.


🌹 Acknowledgement

Our work is primarily based on EVF-SAM, SAM2, and Ref-AVS. We are sincerely grateful for their excellent work.

📚 Citation

If you find our paper and code helpful for your research, please consider starring our repository ⭐ and citing our work ✏️.

@inproceedings{wang2025sam2,
  title={SAM2-LOVE: Segment Anything Model 2 in Language-aided Audio-Visual Scenes},
  author={Wang, Yuji and Xu, Haoran and Liu, Yong and Li, Jiaze and Tang, Yansong},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={28932--28941},
  year={2025}
}

About

[CVPR2025] The repository of SAM2-LOVE paper
