𝗗𝗮𝘆-𝟯𝟲𝟰 𝗖𝗼𝗺𝗽𝘂𝘁𝗲𝗿 𝗩𝗶𝘀𝗶𝗼𝗻 𝗟𝗲𝗮𝗿𝗻𝗶𝗻𝗴
𝗦𝘁𝘆𝗹𝗲𝗚𝗔𝗡-𝗩: 𝗔 𝗖𝗼𝗻𝘁𝗶𝗻𝘂𝗼𝘂𝘀 𝗩𝗶𝗱𝗲𝗼 𝗚𝗲𝗻𝗲𝗿𝗮𝘁𝗼𝗿 𝘄𝗶𝘁𝗵 𝘁𝗵𝗲 𝗣𝗿𝗶𝗰𝗲, 𝗜𝗺𝗮𝗴𝗲 𝗤𝘂𝗮𝗹𝗶𝘁𝘆 𝗮𝗻𝗱 𝗣𝗲𝗿𝗸𝘀 𝗼𝗳 𝗦𝘁𝘆𝗹𝗲𝗚𝗔𝗡2 by KAUST (King Abdullah University of Science and Technology)
Follow me for similar posts: 🇮🇳 Ashish Patel
-------------------------------------------------------------------
𝗜𝗻𝘁𝗲𝗿𝗲𝘀𝘁𝗶𝗻𝗴 𝗙𝗮𝗰𝘁𝘀:
🔸 Paper: 𝗦𝘁𝘆𝗹𝗲𝗚𝗔𝗡-𝗩: 𝗔 𝗖𝗼𝗻𝘁𝗶𝗻𝘂𝗼𝘂𝘀 𝗩𝗶𝗱𝗲𝗼 𝗚𝗲𝗻𝗲𝗿𝗮𝘁𝗼𝗿 𝘄𝗶𝘁𝗵 𝘁𝗵𝗲 𝗣𝗿𝗶𝗰𝗲, 𝗜𝗺𝗮𝗴𝗲 𝗤𝘂𝗮𝗹𝗶𝘁𝘆 𝗮𝗻𝗱 𝗣𝗲𝗿𝗸𝘀 𝗼𝗳 𝗦𝘁𝘆𝗹𝗲𝗚𝗔𝗡2
🔸 The paper was published on arXiv in 2021.
🔸 It offers a different perspective on time for video synthesis and builds a continuous video generator using the paradigm of neural representations. For this, the researchers developed motion representations through the lens of positional embeddings, explored sparse training of video generators, and redesigned the typical dual structure of a video discriminator. The model is built on top of StyleGAN2 and inherits many of its perks, such as efficient training, good image quality, and an editable latent space.
-------------------------------------------------------------------
𝗜𝗠𝗣𝗢𝗥𝗧𝗔𝗡𝗖𝗘
🔸 Videos show continuous events, yet most - if not all - video synthesis frameworks treat them discretely in time.
🔸 In this work, we treat videos as what they really are - time-continuous signals - and extend the paradigm of neural representations to build a continuous-time video generator.
🔸 For this, we first design continuous motion representations through the lens of positional embeddings.
🔸 Then, we explore the question of training on very sparse videos and demonstrate that a good generator can be learned using as few as 2 frames per clip.
🔸 After that, we rethink the traditional image-and-video discriminator pair and propose to replace it with a single hypernetwork-based discriminator.
🔸 This decreases the training cost and provides a richer learning signal to the generator, making it possible to train directly on 1024² videos for the first time.
🔸 We build our model on top of StyleGAN2, and it is just ~5% more expensive to train at the same resolution while achieving almost the same image quality.
🔸 Moreover, our latent space features similar properties, enabling spatial manipulations that our method can propagate in time.
🔸 We can generate arbitrarily long videos at an arbitrarily high frame rate, while prior work struggles to generate even 64 frames at a fixed rate. Our model achieves state-of-the-art results on four modern 256² video synthesis benchmarks and one 1024² benchmark.
#computervision #artificialintelligence #innovation
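The key idea behind the continuous motion representation is that a frame's timestamp is a real number rather than a discrete index, so it can be encoded with sinusoidal positional embeddings and evaluated at any time. Below is a minimal, generic sketch of that idea (not the paper's exact acyclic positional encoding); the function name, frequency spacing, and defaults are illustrative assumptions:

```python
import numpy as np

def time_embedding(t, num_freqs=8, max_period=100.0):
    """Map continuous timestamps t (any real values) to fixed-size
    sinusoidal embeddings at geometrically spaced periods.

    Generic sketch of continuous positional embeddings, not the
    paper's exact motion encoder; names/defaults are illustrative.
    """
    t = np.atleast_1d(np.asarray(t, dtype=np.float64))
    # Periods spaced geometrically between 1 and max_period.
    periods = max_period ** (np.arange(num_freqs) / max(num_freqs - 1, 1))
    angles = 2.0 * np.pi * t[:, None] / periods[None, :]
    # Concatenate sine and cosine parts -> shape (len(t), 2 * num_freqs).
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)

# Because t is continuous, frames can be requested at arbitrary timestamps,
# which is what lets such a generator render any frame rate or clip length.
emb = time_embedding([0.0, 0.5, 1.0])
print(emb.shape)  # (3, 16)
```

This also illustrates why training on as few as 2 frames per clip is plausible: each sampled frame just supplies a (timestamp, image) pair, and the timestamps need not be adjacent or evenly spaced.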