E-Ro Nguyen*, Yichi Zhang*, Kanchana Ranasinghe, Xiang Li, Michael S Ryoo
Stony Brook University
*Equal contribution
DAWN is a unified diffusion-based framework for language-conditioned robotic manipulation that bridges high-level motion intent and low-level robot action via a structured pixel motion representation.
- [Feb 2026] Our paper has been accepted to CVPR 2026!
- [Sep 2025] Initial arXiv release.
Code & pretrained checkpoints will be released soon!
If you find our work useful, please consider citing:
@article{nguyen2025dawn,
title = {Pixel Motion Diffusion is What We Need for Robot Control},
author = {Nguyen, E-Ro and Zhang, Yichi and Ranasinghe, Kanchana and Li, Xiang and Ryoo, Michael S},
journal = {arXiv preprint arXiv:2509.22652},
year = {2025}
}