Ashish Patel 🇮🇳’s Post

𝗗𝗮𝘆-𝟰𝟭𝟲 𝗖𝗼𝗺𝗽𝘂𝘁𝗲𝗿 𝗩𝗶𝘀𝗶𝗼𝗻 𝗟𝗲𝗮𝗿𝗻𝗶𝗻𝗴: Visual Attention Network by Tsinghua University
Follow me for similar posts: Ashish Patel
-------------------------------------------------------------------
𝗜𝗻𝘁𝗲𝗿𝗲𝘀𝘁𝗶𝗻𝗴 𝗙𝗮𝗰𝘁𝘀:
🔸 This paper was published on arXiv in 2022.
✔️ GitHub: https://lnkd.in/gXrYhctf
👉 The paper proposes a novel visual attention mechanism, Large Kernel Attention (LKA), which combines the advantages of convolution and self-attention. Based on LKA, the authors build a vision backbone, the Visual Attention Network (VAN), that achieves state-of-the-art performance on several visual tasks, including image classification, object detection, and semantic segmentation.
-------------------------------------------------------------------
𝗜𝗠𝗣𝗢𝗥𝗧𝗔𝗡𝗖𝗘
✔️ Although originally designed for natural language processing (NLP) tasks, the self-attention mechanism has recently taken various computer vision areas by storm. However, the 2D nature of images poses three challenges for applying self-attention in computer vision:
✔️ (1) Treating images as 1D sequences neglects their 2D structure.
✔️ (2) The quadratic complexity is too expensive for high-resolution images.
✔️ (3) It captures only spatial adaptability and ignores channel adaptability.
✔️ In this paper, we propose a novel large kernel attention (LKA) module that enables the self-adaptive, long-range correlations of self-attention while avoiding the issues above.
✔️ We further introduce a novel neural network based on LKA, namely the Visual Attention Network (VAN).
✔️ While extremely simple and efficient, VAN outperforms state-of-the-art vision transformers and convolutional neural networks by a large margin in extensive experiments, including image classification, object detection, semantic segmentation, and instance segmentation.
#computervision #artificialintelligence #data
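To make the LKA idea concrete: the module approximates a large-kernel convolution by stacking a depth-wise conv (local context), a depth-wise dilated conv (long-range context), and a 1x1 conv (channel mixing), then multiplies the result element-wise with the input as an attention map, avoiding the quadratic cost of self-attention. The kernel sizes below (5x5 depth-wise, 7x7 depth-wise with dilation 3, 1x1 point-wise) follow the paper's reference design; this is a minimal PyTorch sketch, not the official implementation:

```python
import torch
import torch.nn as nn


class LKA(nn.Module):
    """Large Kernel Attention sketch: decomposes a large-kernel convolution
    into a 5x5 depth-wise conv, a 7x7 depth-wise dilated conv (dilation 3,
    effective receptive field 19x19), and a 1x1 point-wise conv, and uses
    the result as a multiplicative attention map over the input."""

    def __init__(self, dim: int):
        super().__init__()
        # Local spatial context: depth-wise 5x5
        self.conv0 = nn.Conv2d(dim, dim, 5, padding=2, groups=dim)
        # Long-range spatial context: depth-wise dilated 7x7
        self.conv_spatial = nn.Conv2d(dim, dim, 7, padding=9, groups=dim, dilation=3)
        # Channel mixing: point-wise 1x1 (adds channel adaptability)
        self.conv1 = nn.Conv2d(dim, dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = self.conv1(self.conv_spatial(self.conv0(x)))
        # Element-wise gating: linear in H*W, unlike quadratic self-attention
        return x * attn


# Usage: the attention map has the same shape as the input, so LKA is a
# drop-in block inside a convolutional backbone stage.
x = torch.randn(1, 32, 56, 56)
y = LKA(32)(x)
print(tuple(y.shape))  # (1, 32, 56, 56)
```

Because every convolution here is depth-wise or 1x1, the parameter count stays small while the effective receptive field reaches 21x21, which is how LKA keeps long-range modeling cheap.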

Pooja Jain

Wavicle Data Solutions · 193K followers

4y

Interesting👍 Thanks for sharing!!
