Joint-Aligned Latent Action: Towards Scalable VLA Pretraining in the Wild

Hao Luo1,3, Ye Wang2,3, Wanpeng Zhang1,3, Haoqi Yuan1,3, Yicheng Feng1,3, Haiweng Xu3,
Sipeng Zheng3, Zongqing Lu1,3†

1Peking University   2Renmin University of China   3BeingBeyond

Website arXiv License

JALA is a Transformer-based VLA pretraining framework that turns large-scale human manipulation videos into action-centric supervision without pixel-level reconstruction, bridging lab-annotated motion data and in-the-wild diversity via Joint Alignment.

News

  • [2026-02-28]: JALA accepted to CVPR 2026. Project page is live.

Citation

If you find our work useful, please consider citing our paper and giving our repository a star! 🌟

@inproceedings{luo2026jointalignedlatentaction,
  title={Joint-Aligned Latent Action: Towards Scalable VLA Pretraining in the Wild},
  author={Hao Luo and Ye Wang and Wanpeng Zhang and Haoqi Yuan and Yicheng Feng and Haiweng Xu and Sipeng Zheng and Zongqing Lu},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2026}
}
