Ashish Patel 🇮🇳’s Post

𝗗𝗮𝘆-𝟰𝟳𝟬 𝗖𝗼𝗺𝗽𝘂𝘁𝗲𝗿 𝗩𝗶𝘀𝗶𝗼𝗻 𝗟𝗲𝗮𝗿𝗻𝗶𝗻𝗴
Neighborhood Attention Transformer, by SHI Lab, Picsart AI Research, and Facebook (Meta) AI Research. Follow me for similar posts: Ashish Patel
-------------------------------------------------------------------
𝗜𝗻𝘁𝗲𝗿𝗲𝘀𝘁𝗶𝗻𝗴 𝗙𝗮𝗰𝘁𝘀:
🔸 This paper was published on arXiv in 2022.
🔸 Official: https://lnkd.in/gcMzukN2
-------------------------------------------------------------------
𝗜𝗠𝗣𝗢𝗥𝗧𝗔𝗡𝗖𝗘
🌻 The authors present the Neighborhood Attention Transformer (NAT), an efficient, accurate, and scalable hierarchical transformer that works well on both image classification and downstream vision tasks.
🌷 It is built on Neighborhood Attention (NA), a simple and flexible attention mechanism that localizes each query's receptive field to its nearest neighboring pixels.
🌹 NA is a localized form of self-attention, and it approaches full self-attention as the receptive field size increases. At the same receptive field size, it matches the Swin Transformer's shifted-window attention in FLOPs and memory usage, while being less constrained.
🌺 Furthermore, NA includes local inductive biases, which eliminate the need for extra operations such as pixel shifts.
☘️ Experimental results for NAT are competitive: NAT-Tiny reaches 83.2% top-1 accuracy on ImageNet with only 4.3 GFLOPs and 28M parameters, 51.4% mAP on MS-COCO, and 48.4% mIoU on ADE20K.

#computervision #artificialintelligence #Transformers
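The core NA idea described above can be sketched in plain NumPy: each query attends only to a kernel_size × kernel_size window of keys, with the window clamped at image borders so corner queries still see a full window. This is only an illustrative single-head sketch under my own assumptions (no learned projections, no relative positional biases, no multi-head logic), not the authors' official CUDA/NATTEN implementation:

```python
import numpy as np

def neighborhood_attention_2d(q, k, v, kernel_size=3):
    """Minimal single-head 2D Neighborhood Attention sketch.

    q, k, v: arrays of shape (H, W, D). Each query at (i, j) attends
    only to a kernel_size x kernel_size window of keys/values, clamped
    so the window stays inside the feature map (so border queries still
    attend to exactly kernel_size**2 neighbors).
    """
    H, W, D = q.shape
    r = kernel_size // 2
    out = np.zeros_like(q, dtype=float)
    for i in range(H):
        for j in range(W):
            # Clamp the window origin to the image bounds.
            i0 = min(max(i - r, 0), H - kernel_size)
            j0 = min(max(j - r, 0), W - kernel_size)
            keys = k[i0:i0 + kernel_size, j0:j0 + kernel_size].reshape(-1, D)
            vals = v[i0:i0 + kernel_size, j0:j0 + kernel_size].reshape(-1, D)
            # Scaled dot-product attention restricted to the window.
            scores = keys @ q[i, j] / np.sqrt(D)
            weights = np.exp(scores - scores.max())
            weights /= weights.sum()
            out[i, j] = weights @ vals
    return out
```

Note how this also illustrates the claim that NA approaches self-attention as the receptive field grows: if kernel_size covers the whole feature map, the clamped window is the entire map and the result is exactly global self-attention.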
