RTF-YOLO: an efficient object detection framework for autonomous vehicles

  • Original Paper
Signal, Image and Video Processing

Abstract

The perception system is a core component of self-driving cars, and its reliability is a key condition for the safe and reliable operation of autonomous vehicles in complex situations. To improve object detection performance in complex scenes, this study proposes a lightweight and efficient detection framework that integrates three key components. First, Receptive Field Attention Convolution (RFAConv) is introduced to optimize spatial feature representation within the receptive field, enhancing the model’s ability to focus on target regions and improving detection accuracy for small and cluttered objects. Second, the Top-K Sparse Kernel Attention (TKSA) mechanism in the backbone selectively attends to the most relevant feature areas, improving long-range perception while maintaining computational efficiency. Finally, Focaler-IoU is adopted as the bounding box regression loss to dynamically adjust the contribution of samples based on their IoU, improving stability and accelerating training convergence. Together, these enhancements boost detection accuracy and robustness, particularly in challenging environments. On the KITTI dataset, the proposed model achieves an mAP@0.5 of 83% and an mAP@0.5:0.95 of 55.4%. Compared with the baseline model YOLOv11n, these figures represent improvements of 5.2% and 3.9%, respectively.
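
To make the role of the regression loss more concrete, the following is a minimal sketch of the interval-mapping idea behind Focaler-IoU as it is usually described: the raw IoU is linearly rescaled from an interval [d, u] to [0, 1] and clamped outside it, and the loss is one minus the rescaled value. The function names (`box_iou`, `focaler_iou`, `focaler_iou_loss`) and the thresholds `d=0.2`, `u=0.8` are illustrative assumptions and are not taken from the article, which may use different settings or combine the mapping with another IoU variant.

```python
def box_iou(a, b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def focaler_iou(iou, d=0.2, u=0.8):
    """Linearly remap IoU from [d, u] onto [0, 1], clamping outside the interval.

    d and u are hypothetical thresholds chosen only for illustration.
    """
    if not 0.0 <= d < u <= 1.0:
        raise ValueError("expected 0 <= d < u <= 1")
    return min(1.0, max(0.0, (iou - d) / (u - d)))


def focaler_iou_loss(pred, target, d=0.2, u=0.8):
    """Regression loss: 1 minus the remapped IoU of the two boxes."""
    return 1.0 - focaler_iou(box_iou(pred, target), d, u)


if __name__ == "__main__":
    pred = (0.0, 0.0, 2.0, 2.0)
    target = (1.0, 0.0, 3.0, 2.0)
    print("IoU:", box_iou(pred, target))
    print("Focaler-IoU loss:", focaler_iou_loss(pred, target))
```

Under this sketch, samples whose IoU falls below `d` all receive the maximum loss and samples above `u` receive zero loss, so the gradient signal is concentrated on samples inside the interval. Shifting the interval toward low or high IoU values changes which samples dominate training.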

Data availability

No datasets were generated or analysed during the current study.

Author information

Contributions

Author 1 was responsible for the project design, manuscript review, and editing; Author 2 contributed key conceptual ideas, methodology, analysis of experimental data, and manuscript writing; Author 3 handled data acquisition and the review of relevant literature; Author 4 was in charge of software environment configuration and the review and validation of experimental results; Author 5 organized the experimental data and created Tables 1 to 6; Author 6 organized the experimental data and prepared Figs. 1 to 4. All authors reviewed and approved the manuscript.

Corresponding author

Correspondence to Mengyao Zhen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sun, Y., Zhen, M., Ma, Y. et al. RTF-YOLO: an efficient object detection framework for autonomous vehicles. SIViP 19, 993 (2025). https://doi.org/10.1007/s11760-025-04585-8

  • DOI: https://doi.org/10.1007/s11760-025-04585-8
