𝗗𝗮𝘆-𝟯𝟲𝟰 𝗖𝗼𝗺𝗽𝘂𝘁𝗲𝗿 𝗩𝗶𝘀𝗶𝗼𝗻 𝗟𝗲𝗮𝗿𝗻𝗶𝗻𝗴
𝗦𝘁𝘆𝗹𝗲𝗚𝗔𝗡-𝗩: 𝗔 𝗖𝗼𝗻𝘁𝗶𝗻𝘂𝗼𝘂𝘀 𝗩𝗶𝗱𝗲𝗼 𝗚𝗲𝗻𝗲𝗿𝗮𝘁𝗼𝗿 𝘄𝗶𝘁𝗵 𝘁𝗵𝗲 𝗣𝗿𝗶𝗰𝗲, 𝗜𝗺𝗮𝗴𝗲 𝗤𝘂𝗮𝗹𝗶𝘁𝘆 𝗮𝗻𝗱 𝗣𝗲𝗿𝗸𝘀 𝗼𝗳 𝗦𝘁𝘆𝗹𝗲𝗚𝗔𝗡2 by KAUST (King Abdullah University of Science and Technology)
Follow me for similar posts: 🇮🇳 Ashish Patel
-------------------------------------------------------------------
𝗜𝗻𝘁𝗲𝗿𝗲𝘀𝘁𝗶𝗻𝗴 𝗙𝗮𝗰𝘁𝘀:
🔸 Paper: 𝗦𝘁𝘆𝗹𝗲𝗚𝗔𝗡-𝗩: 𝗔 𝗖𝗼𝗻𝘁𝗶𝗻𝘂𝗼𝘂𝘀 𝗩𝗶𝗱𝗲𝗼 𝗚𝗲𝗻𝗲𝗿𝗮𝘁𝗼𝗿 𝘄𝗶𝘁𝗵 𝘁𝗵𝗲 𝗣𝗿𝗶𝗰𝗲, 𝗜𝗺𝗮𝗴𝗲 𝗤𝘂𝗮𝗹𝗶𝘁𝘆 𝗮𝗻𝗱 𝗣𝗲𝗿𝗸𝘀 𝗼𝗳 𝗦𝘁𝘆𝗹𝗲𝗚𝗔𝗡2
🔸 The paper was published on arXiv in 2021.
🔸 It offers a different perspective on time for video synthesis and builds a continuous video generator using the paradigm of neural representations. For this, the researchers developed motion representations through the lens of positional embeddings, explored sparse training of video generators, and redesigned the typical dual structure of a video discriminator. The model is built on top of StyleGAN2 and inherits many of its perks, such as efficient training, good image quality, and an editable latent space.
-------------------------------------------------------------------
𝗜𝗠𝗣𝗢𝗥𝗧𝗔𝗡𝗖𝗘
🔸 Videos show continuous events, yet most - if not all - video synthesis frameworks treat them discretely in time.
🔸 In this work, we treat videos as what they really are - time-continuous signals - and extend the paradigm of neural representations to build a continuous-time video generator.
🔸 For this, we first design continuous motion representations through the lens of positional embeddings.
🔸 Then, we explore the question of training on very sparse videos and demonstrate that a good generator can be learned using as few as 2 frames per clip.
🔸 After that, we rethink the traditional image-and-video discriminator pair and propose to replace it with a single hypernetwork-based discriminator.
🔸 This decreases the training cost and provides a richer learning signal to the generator, making it possible to train directly on 1024² videos for the first time.
🔸 We build our model on top of StyleGAN2, and it is just ~5% more expensive to train at the same resolution while achieving almost the same image quality.
🔸 Moreover, our latent space features similar properties, enabling spatial manipulations that our method can propagate in time.
🔸 We can generate arbitrarily long videos at an arbitrarily high frame rate, while prior work struggles to generate even 64 frames at a fixed rate. Our model achieves state-of-the-art results on four modern 256² video synthesis benchmarks and one 1024² benchmark.
#computervision #artificialintelligence #innovation
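The key idea behind the continuous motion representation is that a frame's timestamp is a real number rather than a discrete index, so it can be encoded with sinusoidal positional embeddings and evaluated at any time. Below is a minimal, generic sketch of that idea (not the paper's exact acyclic positional encoding); the function name, frequency spacing, and defaults are illustrative assumptions:

```python
import numpy as np

def time_embedding(t, num_freqs=8, max_period=100.0):
    """Map continuous timestamps t (any real values) to fixed-size
    sinusoidal embeddings at geometrically spaced periods.

    Generic sketch of continuous positional embeddings, not the
    paper's exact motion encoder; names/defaults are illustrative.
    """
    t = np.atleast_1d(np.asarray(t, dtype=np.float64))
    # Periods spaced geometrically between 1 and max_period.
    periods = max_period ** (np.arange(num_freqs) / max(num_freqs - 1, 1))
    angles = 2.0 * np.pi * t[:, None] / periods[None, :]
    # Concatenate sine and cosine parts -> shape (len(t), 2 * num_freqs).
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)

# Because t is continuous, frames can be requested at arbitrary timestamps,
# which is what lets such a generator render any frame rate or clip length.
emb = time_embedding([0.0, 0.5, 1.0])
print(emb.shape)  # (3, 16)
```

This also illustrates why training on as few as 2 frames per clip is plausible: each sampled frame just supplies a (timestamp, image) pair, and the timestamps need not be adjacent or evenly spaced.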