Ashish Patel 🇮🇳’s Post

𝗗𝗮𝘆-𝟮𝟵𝟬 𝗖𝗼𝗺𝗽𝘂𝘁𝗲𝗿 𝗩𝗶𝘀𝗶𝗼𝗻 𝗟𝗲𝗮𝗿𝗻𝗶𝗻𝗴

𝗔𝗻𝘁𝗶𝗰𝗶𝗽𝗮𝘁𝗶𝘃𝗲 𝗩𝗶𝗱𝗲𝗼 𝗧𝗿𝗮𝗻𝘀𝗳𝗼𝗿𝗺𝗲𝗿 by Facebook AI

Follow me for similar posts: 🇮🇳 Ashish Patel

Interesting Facts:
🔸 This paper was published at ICCV 2021 and ranked #1 in the CVPR'21 EPIC-Kitchens-100 Action Anticipation challenge.

-------------------------------------------------------------------
𝗔𝗺𝗮𝘇𝗶𝗻𝗴 𝗥𝗲𝘀𝗲𝗮𝗿𝗰𝗵: https://lnkd.in/eaHKKB6d
Code: https://lnkd.in/eWqDiPNZ
-------------------------------------------------------------------

𝗜𝗠𝗣𝗢𝗥𝗧𝗔𝗡𝗖𝗘
🔸 We propose Anticipative Video Transformer (AVT), an end-to-end attention-based video modeling architecture that attends to the previously observed video in order to anticipate future actions.
🔸 We train the model jointly to predict the next action in a video sequence, while also learning frame feature encoders that are predictive of successive future frames' features.
🔸 Compared to existing temporal aggregation strategies, AVT has the advantage of maintaining the sequential progression of observed actions while still capturing long-range dependencies, both of which are critical for the anticipation task.
🔸 Through extensive experiments, we show that AVT obtains the best reported performance on four popular action anticipation benchmarks: EPIC-Kitchens-55, EPIC-Kitchens-100, EGTEA Gaze+, and 50-Salads, including outperforming all submissions to the EPIC-Kitchens-100 CVPR'21 challenge.

#computervision #artificialintelligence #innovation
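
To make the idea concrete, here is a rough, dependency-free sketch of the two ingredients described in this post: attention restricted to previously observed frames, and a joint objective that combines next-action classification with future-frame feature prediction. This is not the authors' implementation. The function names (`causal_attention`, `joint_loss`), the identity query/key/value projections, the externally supplied `action_logits`, and the `feat_weight` parameter are all assumptions made for illustration only.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(frames):
    """Scaled dot-product attention where frame t sees only frames 0..t.

    Assumption: queries, keys, and values are the raw frame features
    (no learned projections), to keep the sketch short.
    """
    dim = len(frames[0])
    outputs = []
    for t, query in enumerate(frames):
        past = frames[: t + 1]  # never look at future frames
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in past
        ]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, past)) for d in range(dim)]
        )
    return outputs


def joint_loss(frames, action_logits, next_action, feat_weight=1.0):
    """Next-action cross-entropy plus future-feature regression.

    Assumptions: `action_logits` come from some hypothetical classification
    head, and the attention output at step t is treated as the prediction
    of the frame feature at step t + 1 (mean squared error).
    """
    predicted = causal_attention(frames)
    pairs = list(zip(predicted[:-1], frames[1:]))
    if pairs:
        feat_loss = sum(
            sum((p - f) ** 2 for p, f in zip(pred, target)) / len(target)
            for pred, target in pairs
        ) / len(pairs)
    else:
        feat_loss = 0.0  # a single frame has no future feature to predict
    action_loss = -math.log(softmax(action_logits)[next_action])
    return action_loss + feat_weight * feat_loss
```

Because each position only attends to earlier frames, changing a later frame leaves the outputs for earlier frames untouched, which is the property that preserves the sequential progression of the observed video.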
