Yechi Ma1,2 · Yanan Li2 · Wei Hua2,1 · Shu Kong3,4,*
1Zhejiang University 2Zhejiang Lab 3University of Macau 4Institute of Collaborative Innovation
Pro3D is a vision-based roadside monocular 3D object detector that establishes new state-of-the-art performance. On the DAIR-V2X-I benchmark, Pro3D outperforms BEVSpread by 6.4% (vehicle), 9.8% (cyclist), and 9.3% (pedestrian).
- [2025/11/25] : arXiv paper released.
- [2025/11/22] : Pro3D is accepted to WACV 2026.
- Training Code
- Checkpoints
- Inference Code
- Jupyter Notebook Demo
- Initialization
Before proceeding with full pipeline implementation, we strongly recommend exploring our pre-configured demonstration notebook:
📚
This interactive notebook provides:
- End-to-end inference pipeline visualization
- Sample detection results with 3D bounding boxes
- Core feature demonstrations
- Environment validation checks
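As a rough illustration of what an environment validation check might look like before opening the notebook, the sketch below uses only the Python standard library. It is not the notebook's actual code: the function name `check_environment`, the minimum Python version, and the assumption that the project depends on `torch` are all hypothetical and should be adjusted to the real requirements.

```python
import importlib.util
import shutil
import sys


def check_environment(min_python=(3, 8), packages=("torch",)):
    """Return a dict of simple pass/fail environment checks.

    The default version and package list are assumptions for illustration,
    not requirements stated by the Pro3D repository.
    """
    report = {
        # Compare the running interpreter against the assumed minimum version.
        "python_ok": sys.version_info >= min_python,
        # nvidia-smi on PATH is a cheap hint that an NVIDIA driver is installed.
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }
    for name in packages:
        # find_spec returns None when a top-level package is not importable.
        report[f"{name}_installed"] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'OK' if ok else 'MISSING'}")
```

A check like this only confirms that prerequisites are visible to the interpreter; it does not verify that CUDA and the deep learning framework versions are compatible.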
⚠️ Note: The complete production codebase is currently undergoing active development. While the demo reflects current capabilities, the full implementation will receive significant architectural improvements and expanded functionality in upcoming releases.
Contents
- Installation Guide (GPU environment setup)
- Dataset Preparation (DAIR-V2X-I/Rope3D conversion)
Generate the scene prior:
`python scripts/gen_scene_prior.py`
Train:
`python [EXP_PATH] --gpus 8 -b 32`
Evaluate a checkpoint:
`python [EXP_PATH] --ckpt_path [CKPT_PATH] --gpus 1 -e`
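The three commands above can also be assembled from a small helper, which is convenient when the experiment file and checkpoint change between runs. The following is a minimal sketch, not part of the repository: `build_commands` and `run_all` are hypothetical names, and the flags are copied verbatim from the commands above without assuming anything about their meaning beyond what the commands show. `[EXP_PATH]` and `[CKPT_PATH]` remain placeholders that must be supplied by the user.

```python
import shlex
import subprocess


def build_commands(exp_path, ckpt_path, train_gpus=8, b=32):
    """Build the scene-prior, training, and evaluation commands as argument lists.

    The flag names and default values mirror the README commands; the
    function itself is a hypothetical convenience wrapper.
    """
    prior = ["python", "scripts/gen_scene_prior.py"]
    train = ["python", exp_path, "--gpus", str(train_gpus), "-b", str(b)]
    evaluate = ["python", exp_path, "--ckpt_path", ckpt_path, "--gpus", "1", "-e"]
    return [prior, train, evaluate]


def run_all(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the pipeline as soon as one step fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run with the README placeholders; replace them with real paths.
    run_all(build_commands("[EXP_PATH]", "[CKPT_PATH]"))
```

Keeping `dry_run=True` as the default makes it possible to inspect the exact command lines before committing GPU time to them.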
This project builds on the following repositories:
| Project | Purpose | Link |
|---|---|---|
| BEVSpread | Voxel pooling innovation | GitHub |
| BEVHeight | Height-aware feature learning | GitHub |
| BEVDepth | Reliable depth estimation | GitHub |
| DAIR-V2X | Real-world roadside dataset | GitHub |
| Rope3D | Challenging 3D detection dataset | GitHub |
Development Status: The codebase is actively evolving. Major architecture improvements and additional features will be released in subsequent versions. Current implementations reflect our validated research baseline.
If you use Pro3D in your research, please cite our work:
@inproceedings{ma2025pro3d,
title={Roadside Monocular 3D Detection Prompted by 2D Detection},
author={Yechi Ma and Yanan Li and Wei Hua and Shu Kong},
booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
year={2026}
}
- Performance comparisons based on DAIR-V2X-I benchmark (CVPR 2024)
- Each project listed above has its own citation requirements; please cite them as well where applicable
