Ashish Patel 🇮🇳’s Post

𝗗𝗮𝘆-𝟮𝟯𝟯 𝗖𝗼𝗺𝗽𝘂𝘁𝗲𝗿 𝗩𝗶𝘀𝗶𝗼𝗻 𝗟𝗲𝗮𝗿𝗻𝗶𝗻𝗴 𝗙𝗮𝘀𝘁 𝗖𝗼𝗻𝘃𝗲𝗿𝗴𝗲𝗻𝗰𝗲 𝗼𝗳 𝗗𝗘𝗧𝗥 𝘄𝗶𝘁𝗵 𝗦𝗽𝗮𝘁𝗶𝗮𝗹𝗹𝘆 𝗠𝗼𝗱𝘂𝗹𝗮𝘁𝗲𝗱 𝗖𝗼-𝗔𝘁𝘁𝗲𝗻𝘁𝗶𝗼𝗻 by SenseTime 商汤科技 Research

Follow me for similar posts: 🇮🇳 Ashish Patel

Interesting Facts:
🔸 This paper was published at ICCV 2021 and has 8 citations.
-------------------------------------------------------------------
𝗔𝗺𝗮𝘇𝗶𝗻𝗴 𝗥𝗲𝘀𝗲𝗮𝗿𝗰𝗵 : https://lnkd.in/eYANMiiA
Code : https://lnkd.in/eZCXxP-j
-------------------------------------------------------------------
𝗜𝗠𝗣𝗢𝗥𝗧𝗔𝗡𝗖𝗘
🚀 The recently proposed Detection Transformer (DETR) successfully applies Transformers to object detection and achieves performance comparable to two-stage detection frameworks such as Faster R-CNN.
🚀 However, DETR suffers from slow convergence: training DETR from scratch needs 500 epochs to reach high accuracy.
🚀 To accelerate convergence, the authors propose a simple yet effective improvement to the DETR framework: the Spatially Modulated Co-Attention (SMCA) mechanism.
🚀 The core idea of SMCA is regression-aware co-attention: co-attention responses are constrained to be high near the initially estimated bounding-box locations. SMCA speeds up DETR's convergence by replacing only the original co-attention mechanism in the decoder, keeping all other operations in DETR unchanged.
🚀 Furthermore, by integrating multi-head and scale-selection attention designs into SMCA, the fully-fledged SMCA achieves better performance than DETR with a dilated convolution-based backbone (45.6 mAP at 108 epochs vs. 43.3 mAP at 500 epochs). Extensive ablation studies on the COCO dataset validate the effectiveness of the proposed SMCA.

#computervision #artificialintelligence #deeplearning #india
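The core idea above (a Gaussian-like spatial prior, centered on each query's initially predicted box, added to the decoder's co-attention logits) can be sketched roughly as follows. This is a minimal single-head NumPy illustration of the mechanism, not the authors' implementation: the function names, the normalized-coordinate convention, and the per-axis scale parameters are all assumptions for illustration.

```python
import numpy as np

def gaussian_spatial_prior(center, scale, h, w):
    """Log-space Gaussian weight map centered on a predicted box center.

    center: (cx, cy) in [0, 1] normalized image coordinates (assumed convention)
    scale:  (sw, sh) controlling the Gaussian spread per axis (hypothetical)
    Returns an (h*w,) array of log-weights to add to attention logits.
    """
    ys, xs = np.mgrid[0:h, 0:w]
    xs = (xs + 0.5) / w          # pixel centers in normalized coordinates
    ys = (ys + 0.5) / h
    cx, cy = center
    sw, sh = scale
    # Negative scaled squared distance: highest (0) at the box center.
    log_g = -((xs - cx) ** 2 / (2 * sw ** 2) + (ys - cy) ** 2 / (2 * sh ** 2))
    return log_g.reshape(-1)

def smca_co_attention(query, keys, values, center, scale, h, w):
    """Single-head co-attention modulated by the spatial prior (sketch).

    query: (d,) object-query feature; keys/values: (h*w, d) encoder features.
    """
    d = query.shape[-1]
    logits = keys @ query / np.sqrt(d)                      # content logits, (h*w,)
    logits = logits + gaussian_spatial_prior(center, scale, h, w)
    weights = np.exp(logits - logits.max())                 # stable softmax
    weights /= weights.sum()
    return weights @ values                                 # attended feature, (d,)
```

The only change versus plain co-attention is the additive log-Gaussian term, which is why the rest of the DETR pipeline can stay untouched; the full SMCA additionally makes this modulation multi-head and scale-selective.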
