Ashish Patel 🇮🇳’s Post

𝗗𝗮𝘆-𝟰𝟭𝟲 𝗖𝗼𝗺𝗽𝘂𝘁𝗲𝗿 𝗩𝗶𝘀𝗶𝗼𝗻 𝗟𝗲𝗮𝗿𝗻𝗶𝗻𝗴: Visual Attention Network by Tsinghua University
Follow me for similar posts: Ashish Patel
-------------------------------------------------------------------
𝗜𝗻𝘁𝗲𝗿𝗲𝘀𝘁𝗶𝗻𝗴 𝗙𝗮𝗰𝘁𝘀:
🔸 This paper was published on arXiv in 2022.
✔️ GitHub: https://lnkd.in/gXrYhctf
👉 The paper proposes a novel visual attention mechanism, Large Kernel Attention (LKA), which combines the advantages of convolution and self-attention. Based on LKA, the authors build a vision backbone, the Visual Attention Network (VAN), that achieves state-of-the-art performance on several visual tasks, including image classification, object detection, and semantic segmentation.
-------------------------------------------------------------------
𝗜𝗠𝗣𝗢𝗥𝗧𝗔𝗡𝗖𝗘
✔️ Although originally designed for natural language processing (NLP) tasks, the self-attention mechanism has recently taken various computer vision areas by storm. However, the 2D nature of images poses three challenges for applying self-attention in computer vision:
✔️ (1) Treating images as 1D sequences neglects their 2D structure.
✔️ (2) The quadratic complexity is too expensive for high-resolution images.
✔️ (3) It captures only spatial adaptability and ignores channel adaptability.
✔️ In this paper, we propose a novel large kernel attention (LKA) module that enables the self-adaptive, long-range correlations of self-attention while avoiding the issues above.
✔️ We further introduce a novel neural network based on LKA, namely the Visual Attention Network (VAN).
✔️ While extremely simple and efficient, VAN outperforms state-of-the-art vision transformers and convolutional neural networks by a large margin in extensive experiments, including image classification, object detection, semantic segmentation, and instance segmentation.
#computervision #artificialintelligence #data
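To make the LKA idea concrete: the module approximates a large-kernel convolution by stacking a depth-wise conv (local context), a depth-wise dilated conv (long-range context), and a 1x1 conv (channel mixing), then multiplies the result element-wise with the input as an attention map, avoiding the quadratic cost of self-attention. The kernel sizes below (5x5 depth-wise, 7x7 depth-wise with dilation 3, 1x1 point-wise) follow the paper's reference design; this is a minimal PyTorch sketch, not the official implementation:

```python
import torch
import torch.nn as nn


class LKA(nn.Module):
    """Large Kernel Attention sketch: decomposes a large-kernel convolution
    into a 5x5 depth-wise conv, a 7x7 depth-wise dilated conv (dilation 3,
    effective receptive field 19x19), and a 1x1 point-wise conv, and uses
    the result as a multiplicative attention map over the input."""

    def __init__(self, dim: int):
        super().__init__()
        # Local spatial context: depth-wise 5x5
        self.conv0 = nn.Conv2d(dim, dim, 5, padding=2, groups=dim)
        # Long-range spatial context: depth-wise dilated 7x7
        self.conv_spatial = nn.Conv2d(dim, dim, 7, padding=9, groups=dim, dilation=3)
        # Channel mixing: point-wise 1x1 (adds channel adaptability)
        self.conv1 = nn.Conv2d(dim, dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = self.conv1(self.conv_spatial(self.conv0(x)))
        # Element-wise gating: linear in H*W, unlike quadratic self-attention
        return x * attn


# Usage: the attention map has the same shape as the input, so LKA is a
# drop-in block inside a convolutional backbone stage.
x = torch.randn(1, 32, 56, 56)
y = LKA(32)(x)
print(tuple(y.shape))  # (1, 32, 56, 56)
```

Because every convolution here is depth-wise or 1x1, the parameter count stays small while the effective receptive field reaches 21x21, which is how LKA keeps long-range modeling cheap.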

Pooja Jain

Wavicle Data Solutions · 193K followers

4y

Interesting👍 Thanks for sharing!!
