Ashish Patel 🇮🇳’s Post

𝗗𝗮𝘆-𝟮𝟭𝟲 𝗖𝗼𝗺𝗽𝘂𝘁𝗲𝗿 𝗩𝗶𝘀𝗶𝗼𝗻 𝗟𝗲𝗮𝗿𝗻𝗶𝗻𝗴

𝗖𝗕𝗡𝗲𝘁𝗩𝟮: A Novel Composite Backbone Network Architecture for Object Detection, by Stony Brook University, USA

Follow me for similar posts: 🇮🇳 Ashish Patel

Interesting facts:
🔸 This is an ICML 2021 paper with over 23 citations.
-------------------------------------------------------------------
𝗔𝗺𝗮𝘇𝗶𝗻𝗴 𝗥𝗲𝘀𝗲𝗮𝗿𝗰𝗵: https://lnkd.in/gFkZ9z7g
Code: https://lnkd.in/gbcDug6f
-------------------------------------------------------------------
𝗜𝗠𝗣𝗢𝗥𝗧𝗔𝗡𝗖𝗘
🔸 Modern top-performing object detectors depend heavily on backbone networks, whose advances bring consistent performance gains through more effective network structures.
🔸 The paper proposes a novel and flexible backbone framework, CBNetV2, to construct high-performance detectors from existing open-source pre-trained backbones under the pre-training/fine-tuning paradigm.
🔸 In particular, the CBNetV2 architecture groups multiple identical backbones, connected through composite connections. Specifically, it integrates the high- and low-level features of multiple backbone networks and gradually expands the receptive field to perform object detection more efficiently.
🔸 The authors also propose a better training strategy with assistant supervision for CBNet-based detectors. Without additional pre-training of the composite backbone, CBNetV2 can be adapted to various backbones (CNN-based and Transformer-based) and to the head designs of most mainstream detectors (one-stage and two-stage, anchor-based and anchor-free).
🔸 Experiments provide strong evidence that, compared with simply increasing the depth and width of a network, CBNetV2 offers a more efficient, effective, and resource-friendly way to build high-performance backbone networks.
🔸 In particular, their Dual-Swin-L achieves 59.4% box AP and 51.6% mask AP on COCO test-dev under the single-model, single-scale testing protocol, significantly better than the state-of-the-art result (57.7% box AP and 50.2% mask AP) achieved by Swin-L, while the training schedule is reduced by 6×.
🔸 With multi-scale testing, it pushes the best single-model result to a new record of 60.1% box AP and 52.3% mask AP without using extra training data.

#computervision #artificialintelligence #deeplearning
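The composite-connection idea above can be illustrated with a toy sketch. This is not the paper's implementation (CBNetV2 uses real conv/Transformer stages, with 1×1 convolutions and upsampling inside the composite connections); here each "stage" is just a scalar multiply, and the assisting backbone's higher-level outputs are summed and injected into the lead backbone's stage inputs, in the spirit of the paper's Dense Higher-Level Composition:

```python
def stage(x, w):
    # Toy "stage": a scalar multiply stands in for a block of conv layers.
    return x * w

def run_backbone(x, weights, injected=None):
    """Run a toy multi-stage backbone.

    If `injected` is given, add the corresponding composite feature to
    each stage's input (this models the composite connections from the
    assisting backbone into the lead backbone).
    """
    feats = []
    for i, w in enumerate(weights):
        if injected is not None:
            x = x + injected[i]
        x = stage(x, w)
        feats.append(x)
    return feats

weights = [2.0, 2.0, 2.0, 2.0]   # identical stages in both backbones
x = 1.0                          # toy input "image"

# 1) Run the assisting backbone on its own.
assist = run_backbone(x, weights)            # [2.0, 4.0, 8.0, 16.0]

# 2) Dense Higher-Level Composition (toy version): stage i of the lead
#    backbone receives the sum of the assisting backbone's outputs from
#    stages i and above (the "higher-level" features).
composite = [sum(assist[i:]) for i in range(len(assist))]

# 3) Run the lead backbone with the composite features injected; its
#    stage outputs would feed the detection head (e.g. an FPN).
lead = run_backbone(x, weights, injected=composite)
print(lead)                                  # [62.0, 180.0, 408.0, 848.0]
```

The point of the sketch is the data flow: both backbones share the same stage structure (and, in the paper, the same pre-trained weights), and the lead backbone sees progressively enriched inputs rather than the raw image alone.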


Hello Ashish, thanks for sharing the post. I was trying to find information about the authors. Would you mind sharing where you got the information that they are from SBU? I have searched but could not find it.


