Ashish Patel 🇮🇳’s Post

𝗗𝗮𝘆-𝟮𝟬𝟬 𝗖𝗼𝗺𝗽𝘂𝘁𝗲𝗿 𝗩𝗶𝘀𝗶𝗼𝗻 𝗟𝗲𝗮𝗿𝗻𝗶𝗻𝗴

𝗜𝗻𝘃𝗼𝗹𝘂𝘁𝗶𝗼𝗻: Inverting the Inherence of Convolution for Visual Recognition, by The Hong Kong University of Science and Technology, ByteDance, and Beijing University of Posts and Telecommunications.

Follow me for similar posts: 🇮🇳 Ashish Patel

Interesting Facts:
🔸 This is a #CVPR2021 paper with over 9 citations.
🔸 It outperforms convolution-based baselines while reducing computational cost.
-------------------------------------------------------------------
𝗔𝗺𝗮𝘇𝗶𝗻𝗴 𝗥𝗲𝘀𝗲𝗮𝗿𝗰𝗵: https://lnkd.in/ekYnAJB
Code: https://lnkd.in/ehHTDcc
-------------------------------------------------------------------
𝗜𝗠𝗣𝗢𝗥𝗧𝗔𝗡𝗖𝗘
🔸 Convolution has been the core ingredient of modern neural networks, triggering the surge of deep learning in vision. In this work, the authors rethink the inherent principles of standard convolution for vision tasks: it is spatial-agnostic and channel-specific.
🔸 Instead, they present a novel atomic operation for deep neural networks, coined involution, built by inverting these design principles. They additionally demystify the recently popular self-attention operator and subsume it into the involution family as an over-complicated instantiation.
🔸 The proposed involution operator can serve as a fundamental brick for building a new generation of neural networks for visual recognition, powering different deep learning models on several prevalent benchmarks: ImageNet classification, COCO detection and segmentation, and Cityscapes segmentation.
🔸 Involution-based models improve on convolutional ResNet-50 baselines by up to 1.6% top-1 accuracy, 2.5% and 2.4% bounding-box AP, and 4.7% mean IoU absolute, while compressing the computational cost to 66%, 65%, 72%, and 57% on the above benchmarks, respectively.

#computervision #artificialintelligence #data
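To make the "inverted" design concrete, here is a minimal PyTorch sketch of a 2D involution layer: a small bottleneck generates a K×K kernel at every spatial location (spatial-specific), and that kernel is shared across the channels of each group (channel-agnostic). The class name `Involution2d` and the hyperparameter defaults are illustrative assumptions, not the official implementation (see the code link above for that).

```python
import torch
import torch.nn as nn

class Involution2d(nn.Module):
    """Minimal sketch of the involution operator (Li et al., CVPR 2021).

    Kernels are generated per spatial location (spatial-specific) and
    shared across channels within each group (channel-agnostic).
    """

    def __init__(self, channels, kernel_size=3, groups=4, reduction=4, stride=1):
        super().__init__()
        self.k, self.g, self.s = kernel_size, groups, stride
        # Kernel-generation function: a channel bottleneck that emits
        # K*K weights per group for every output position.
        self.reduce = nn.Conv2d(channels, channels // reduction, 1)
        self.span = nn.Conv2d(channels // reduction, kernel_size * kernel_size * groups, 1)
        self.pool = nn.AvgPool2d(stride, stride) if stride > 1 else nn.Identity()
        # Extract K*K neighborhoods of the input for the multiply-accumulate.
        self.unfold = nn.Unfold(kernel_size, padding=kernel_size // 2, stride=stride)

    def forward(self, x):
        b, c, h, w = x.shape
        h_out, w_out = h // self.s, w // self.s
        # One K*K kernel per output location per group: (B, K*K*G, H', W')
        weight = self.span(self.reduce(self.pool(x)))
        weight = weight.view(b, self.g, 1, self.k * self.k, h_out, w_out)
        # Input patches, channels split into groups: (B, G, C/G, K*K, H', W')
        patches = self.unfold(x).view(b, self.g, c // self.g, self.k * self.k, h_out, w_out)
        # Weighted sum over the kernel window; the kernel broadcasts
        # over the C/G channels of its group (channel-agnostic sharing).
        out = (weight * patches).sum(dim=3)
        return out.view(b, c, h_out, w_out)
```

Note the contrast with convolution: a conv layer learns one fixed kernel per output channel and slides it over all positions, whereas here the kernel is a function of the input feature at each position, so the parameter count no longer grows with kernel size times channel count.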


