MonoASRH: Efficient Feature Aggregation and Scale-Aware Regression for Monocular 3-D Object Detection
This repository hosts the official implementation of Efficient Feature Aggregation and Scale-Aware Regression for Monocular 3-D Object Detection.
The official results in the paper on KITTI Val Set:
| Models | Val, AP3D\|R40 Easy | Val, AP3D\|R40 Mod. | Val, AP3D\|R40 Hard |
| --- | --- | --- | --- |
| MonoASRH | 28.35% | 20.75% | 17.56% |
This repo results on KITTI Val Set:
| Models | Val, AP3D\|R40 Easy | Val, AP3D\|R40 Mod. | Val, AP3D\|R40 Hard | Checkpoint | HF Ckpt |
| --- | --- | --- | --- | --- | --- |
| MonoASRH | 28.28% | 21.04% | 17.76% | ckpt | hf ckpt |
| | 28.29% | 21.11% | 17.84% | ckpt | hf ckpt |
- Clone this project and create a conda environment:

      git clone https://github.com/WYFDUT/MonoASRH.git
      cd MonoASRH
      conda create -n monoasrh python=3.9
      conda activate monoasrh

- Install PyTorch and torchvision matching your CUDA version:

      # For example, we adopt torch 1.11.0+cu113
      pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113

- Install requirements:

      pip install -r requirements.txt

- Download KITTI datasets and prepare the directory structure as:

      │MonoASRH/
      ├──...
      │data/kitti/
      ├──ImageSets/
      ├──training/
      │   ├──image_2
      │   ├──label_2
      │   ├──calib
      ├──testing/
      │   ├──image_2
      │   ├──calib

  You can also change the data path at "dataset/root_dir" in lib/kitti.yaml.
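
Before training, it can help to confirm that the dataset folders match the layout listed above. The following is a minimal sketch and not part of this repository: the helper name `missing_kitti_dirs` and the default root `data/kitti` are assumptions made for illustration, and it uses only the Python standard library.

```python
from pathlib import Path

# Sub-directories named in the layout above, relative to the KITTI root.
EXPECTED_DIRS = [
    "ImageSets",
    "training/image_2",
    "training/label_2",
    "training/calib",
    "testing/image_2",
    "testing/calib",
]


def missing_kitti_dirs(root):
    """Return the expected sub-directories that do not exist under `root`.

    Hypothetical helper for illustration; it is not shipped with the repo.
    """
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # "data/kitti" is an assumed location. If you changed "dataset/root_dir"
    # in lib/kitti.yaml, pass that path here instead.
    missing = missing_kitti_dirs("data/kitti")
    if missing:
        print("Missing directories:")
        for d in missing:
            print("  " + d)
    else:
        print("KITTI directory layout looks complete.")
```

The check only looks for the directories themselves; it does not validate the number or contents of the image, label, or calibration files.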
You can modify the settings of models and training in lib/kitti.yaml:

    python tools/train_val.py

    python tools/train_val.py -e

The best checkpoint will be evaluated as default. You can change it at "tester/resume_model" in lib/kitti.yaml:

    python tools/train_val.py -t

If you use this code in your research, please cite:

@ARTICLE{11395320,
author={Wang, Yifan and Yang, Xiaochen and Pu, Fanqi and Liao, Qingmin and Yang, Wenming},
journal={IEEE Transactions on Intelligent Transportation Systems},
title={Efficient Feature Aggregation and Scale-Aware Regression for Monocular 3-D Object Detection},
year={2026},
volume={},
number={},
pages={1-14},
keywords={3D object detection;monocular;scale-aware;scene understanding;autonomous driving},
doi={10.1109/TITS.2026.3659175}}

This repo benefits from the excellent work of GUPNet and MonoLSS.
