This is the official PyTorch implementation of our paper:
Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation, ICCVW 2025
Suhwan Cho*, Seunghoon Lee*, Minhyeok Lee, Jungho Lee, Sangyoun Lee
Link: [ICCVW] [arXiv]
You can also explore other related works at awesome-video-object-segmentation.
Existing referring VOS methods typically fuse visual and textual features in a highly entangled manner, processing multi-modal information jointly. However, this entanglement often leads to challenges in resolving ambiguous target identification and maintaining consistent mask propagation across frames. To address these issues, we propose a decoupled framework that explicitly separates object identification from mask propagation. The key frame is adaptively selected based on segmentation confidence and vision-text alignment, establishing a reliable anchor for propagation.
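The adaptive key-frame selection described above can be sketched as a simple per-frame scoring rule. This is an illustrative, simplified sketch, not the repository's actual implementation: the function name, the linear combination, and the `alpha` weight are assumptions; the paper's scoring may differ.

```python
def select_key_frame(seg_confidences, alignment_scores, alpha=0.5):
    """Pick the anchor frame for mask propagation (illustrative sketch).

    seg_confidences:  per-frame segmentation confidence in [0, 1]
    alignment_scores: per-frame vision-text alignment (e.g., CLIP-style similarity)
    alpha:            hypothetical weight balancing the two cues
    """
    scores = [alpha * s + (1.0 - alpha) * a
              for s, a in zip(seg_confidences, alignment_scores)]
    # The frame with the highest combined score becomes the propagation anchor.
    return max(range(len(scores)), key=scores.__getitem__)
```

For example, a frame with both a confident mask and a strong text match wins over a frame that is strong in only one cue.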
1. Download the datasets: Ref-YouTube-VOS, Ref-DAVIS17, MeViS.
2. Download the Alpha-CLIP weights and place them in the weights/ directory.
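After the two steps above, the project tree might look like the sketch below. The dataset folder names and the checkpoint filename are assumptions for illustration; match them to the paths expected by the repo's configs.

```
FindTrack/
├── weights/
│   └── alpha_clip.pth        # hypothetical filename for the Alpha-CLIP checkpoint
├── data/
│   ├── ref-youtube-vos/
│   ├── ref-davis17/
│   └── mevis/
├── train_ytvos.py
├── train_mevis.py
├── run_ytvos.py
└── run_mevis.py
```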
FindTrack works well in a training-free manner, but fine-tuning on specific datasets can improve performance further.
To fine-tune on the Ref-YouTube-VOS dataset:
deepspeed --num_gpus 4 train_ytvos.py
To fine-tune on the MeViS dataset:
deepspeed --num_gpus 4 train_mevis.py
To run inference on the Ref-YouTube-VOS dataset:
python run_ytvos.py
To run inference on the MeViS dataset:
python run_mevis.py
Verify the following before running:
✅ Testing dataset selection
✅ GPU availability and configuration
✅ Pre-trained model path
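The checklist above can be automated with a small pre-flight helper. This is a hypothetical convenience script, not part of the FindTrack codebase; the `weights_path` argument stands in for whatever checkpoint path your config uses.

```python
import os

def preflight_check(weights_path, require_gpu=True):
    """Return a list of problems found before launching inference.

    Hypothetical helper: checks that the pre-trained checkpoint exists
    and (optionally) that PyTorch can see a CUDA device.
    """
    problems = []
    if not os.path.isfile(weights_path):
        problems.append(f"checkpoint not found: {weights_path}")
    if require_gpu:
        try:
            import torch  # imported lazily so the path check works without it
            if not torch.cuda.is_available():
                problems.append("no CUDA device visible to PyTorch")
        except ImportError:
            problems.append("PyTorch is not installed")
    return problems

# An empty list means all checks passed and inference can be launched.
```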
You can use the web demo with your own video!
Run the Gradio demo with:
python demo.py
Code and models are only available for non-commercial research purposes.
For questions or inquiries, feel free to contact:
E-mail: suhwanx@gmail.com