The official repository for Unified Domain Adaptive Semantic Segmentation (TPAMI 2025).
| Resource | Link |
|---|---|
| 📄 IEEE Xplore | IEEE Xplore: 10972076 |
| 📄 Supplemental Video and PDF | IEEE Xplore |
| 🎬 Video Demo | Demo Video (Google Drive) |
| 🎬 Video Demo | Demo Video (Bilibili) |
| 📄 arXiv | arXiv link |
| 📄 Other link | Other link |
| 💡 Chinese Write-up | 中文解读 |
- **Image UDA-SS**: source code is in the `/image_udass` directory; see `image_udass/README.md` for setup and training instructions.
- **Video UDA-SS**: source code is in the `/video_udass` directory; see `video_udass/README.md` for setup and training instructions.
- 🎉 2025.07.12: We have released the complete source code! Feel free to contact us if you have any questions; we are happy to discuss!
- 🎉 2025.04.25: Our paper has been accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025!
Unsupervised Domain Adaptive Semantic Segmentation (UDA-SS) aims to transfer supervision from a labeled source domain to an unlabeled target domain. Most existing UDA-SS works focus on images, while recent studies have extended the task to video by modeling the temporal dimension. Despite sharing the core challenge of overcoming domain distribution shift, the two research lines have largely developed in isolation. This separation introduces several limitations:
- Insights remain fragmented, lacking a unified understanding of the problem and potential solutions.
- Unified methods, techniques, and best practices cannot be established, causing redundant efforts and missed opportunities.
- Advances in one domain (image or video) cannot effectively transfer to the other, leading to suboptimal performance.
Our motivation: We advocate for unifying the study of UDA-SS across image and video settings, enabling comprehensive understanding, synergistic advances, and efficient knowledge sharing.
To this end, we introduce a general data augmentation perspective as a unifying conceptual framework. Specifically, we propose Quad-directional Mixup (QuadMix), which performs intra- and inter-domain mixing in feature space through four directional paths. To address temporal shifts in video, we incorporate optical flow-guided spatio-temporal aggregation for fine-grained domain alignment.
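For intuition, here is a minimal PyTorch sketch of both ideas. It is illustrative only: it assumes ClassMix-style class masks, that features and labels share spatial resolution, and that target pseudo-labels are available. All function names (`class_mix_mask`, `quad_mix`, `flow_warp`) are hypothetical and are not the repository's API; see `image_udass/` and `video_udass/` for the actual implementation.

```python
import torch
import torch.nn.functional as F


def class_mix_mask(labels: torch.Tensor, num_classes_to_paste: int = 2) -> torch.Tensor:
    """ClassMix-style binary mask: 1 where a randomly chosen subset of the
    classes present in `labels` (B, H, W) appears. Returns (B, 1, H, W)."""
    masks = []
    for lbl in labels:
        present = torch.unique(lbl)
        chosen = present[torch.randperm(len(present))[:num_classes_to_paste]]
        masks.append(torch.isin(lbl, chosen))
    return torch.stack(masks).unsqueeze(1).float()


def paste(feat_from: torch.Tensor, feat_to: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Paste the masked region of `feat_from` onto `feat_to` (both (B, C, H, W))."""
    return mask * feat_from + (1.0 - mask) * feat_to


def quad_mix(f_src, y_src, f_tgt, y_tgt_pseudo):
    """Four directional mixing paths between source and target features.
    The exact path definitions follow the paper; this sketch only illustrates
    inter-domain (src <-> tgt) and intra-domain (shuffled-batch) mixing."""
    m_src = class_mix_mask(y_src)
    m_tgt = class_mix_mask(y_tgt_pseudo)
    perm = torch.randperm(f_src.size(0))
    return (
        paste(f_src, f_tgt, m_src),              # source -> target
        paste(f_tgt, f_src, m_tgt),              # target -> source
        paste(f_src[perm], f_src, m_src[perm]),  # intra-source
        paste(f_tgt[perm], f_tgt, m_tgt[perm]),  # intra-target
    )


def flow_warp(feat: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp the previous frame's features to the current frame with optical
    flow, for spatio-temporal aggregation.
    feat: (B, C, H, W); flow: (B, 2, H, W) as (dx, dy) pixel offsets."""
    B, _, H, W = feat.shape
    ys, xs = torch.meshgrid(
        torch.arange(H, device=feat.device),
        torch.arange(W, device=feat.device),
        indexing="ij",
    )
    base = torch.stack((xs, ys)).float()   # (2, H, W), (x, y) order
    coords = base.unsqueeze(0) + flow      # absolute sampling positions
    # Normalize to [-1, 1] as required by grid_sample.
    gx = 2.0 * coords[:, 0] / (W - 1) - 1.0
    gy = 2.0 * coords[:, 1] / (H - 1) - 1.0
    grid = torch.stack((gx, gy), dim=-1)   # (B, H, W, 2)
    return F.grid_sample(feat, grid, align_corners=True)
```

Mixing in feature space lets all four paths share a single segmentation head, and warping the previous frame's features with optical flow aligns them with the current frame before aggregation, which is what enables the fine-grained temporal alignment described above.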
Extensive experiments on four challenging UDA-SS benchmarks show that our method outperforms state-of-the-art approaches by a large margin.
Keywords: Unified domain adaptation, semantic segmentation, QuadMix, flow-guided spatio-temporal aggregation.
You can also find the demo video on Google Drive and Bilibili (see the links in the table above).
💡 Please select HD (1080p) for clearer visualizations.
If you find UDASS helpful in your research, please consider giving us a ⭐ on GitHub and citing our work in your publications!
```bibtex
@ARTICLE{10972076,
  author={Zhang, Zhe and Wu, Gaochang and Zhang, Jing and Zhu, Xiatian and Tao, Dacheng and Chai, Tianyou},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  title={Unified Domain Adaptive Semantic Segmentation},
  year={2025},
  volume={47},
  number={8},
  pages={6731-6748},
  keywords={Videos;Semantics;Optical flow;Training;Adaptation models;Transformers;Optical mixing;Artificial intelligence;Semantic segmentation;Minimization;Unsupervised domain adaptation;semantic segmentation;unified adaptation;domain mixup},
  doi={10.1109/TPAMI.2025.3562999}
}
```