𝗗𝗮𝘆-𝟰𝟴𝟵 𝗖𝗼𝗺𝗽𝘂𝘁𝗲𝗿 𝗩𝗶𝘀𝗶𝗼𝗻 𝗟𝗲𝗮𝗿𝗻𝗶𝗻𝗴

Introducing BlobGAN: Spatially Disentangled Scene Representations, by UC Berkeley and Adobe Research

Follow me for similar posts: Ashish Patel
-------------------------------------------------------------------
𝗜𝗻𝘁𝗲𝗿𝗲𝘀𝘁𝗶𝗻𝗴 𝗙𝗮𝗰𝘁𝘀:
🔸 This paper was published on arXiv in 2022.
🔸 By changing the properties of individual blobs, the discovered entities in scene images can be manipulated.
-------------------------------------------------------------------
𝗜𝗠𝗣𝗢𝗥𝗧𝗔𝗡𝗖𝗘
👉 The authors propose an unsupervised, mid-level representation for a generative model of scenes.
👉 The representation is mid-level in that it is neither per-pixel nor per-image; rather, scenes are modeled as a collection of spatial, depth-ordered "blobs" of features.
👉 Blobs are differentiably placed onto a feature grid that is decoded into an image by a generative adversarial network.
👉 Owing to the spatial uniformity of blobs and the locality inherent to convolution, the network learns to associate different blobs with different entities in a scene and to arrange these blobs to capture scene layout.
👉 Despite training without any supervision, the method enables applications such as easy manipulation of objects within a scene (e.g., moving, removing, and restyling furniture), creation of feasible scenes given constraints (e.g., plausible rooms with drawers at a particular location), and parsing of real-world images into constituent parts.
👉 On a challenging multi-category dataset of indoor scenes, BlobGAN outperforms StyleGAN2 in image quality as measured by FID.

#computervision #artificialintelligence #deeplearning #machinelearning #datascience #data
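The "differentiably placed onto a feature grid" step can be pictured with a toy sketch: each blob is a Gaussian bump with a center, a size, and a feature vector, and every grid cell receives a mix of blob features weighted by the bumps. This is only a minimal NumPy illustration under assumed shapes; the function name `splat_blobs` and all parameters are hypothetical, and the paper's actual model uses learned, depth-ordered alpha compositing rather than the simple normalized mix shown here.

```python
import numpy as np

def splat_blobs(centers, scales, features, grid_size=16):
    """Toy sketch: splat Gaussian 'blobs' onto a feature grid.

    centers:  (k, 2) blob centers in [0, 1]^2
    scales:   (k,)   blob sizes (Gaussian std. dev.)
    features: (k, d) per-blob feature vectors
    Returns a (grid_size, grid_size, d) feature grid.
    """
    ys, xs = np.meshgrid(
        np.linspace(0.0, 1.0, grid_size),
        np.linspace(0.0, 1.0, grid_size),
        indexing="ij",
    )
    coords = np.stack([ys, xs], axis=-1)                          # (H, W, 2)
    # Squared distance from every grid cell to every blob center
    d2 = ((coords[None] - centers[:, None, None]) ** 2).sum(-1)   # (k, H, W)
    # Per-blob opacity falls off with distance (a Gaussian bump)
    alpha = np.exp(-d2 / (2.0 * scales[:, None, None] ** 2))      # (k, H, W)
    # Normalize so overlapping blobs blend into a convex mix
    # (simplification: BlobGAN composites blobs depth-ordered instead)
    weights = alpha / (alpha.sum(axis=0, keepdims=True) + 1e-8)   # (k, H, W)
    # Weighted sum of blob features at each grid cell
    grid = np.einsum("khw,kd->hwd", weights, features)
    return grid

# Two blobs in opposite corners carrying distinct 3-d features
centers = np.array([[0.25, 0.25], [0.75, 0.75]])
scales = np.array([0.15, 0.15])
features = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
grid = splat_blobs(centers, scales, features)
print(grid.shape)
```

Moving a row of `centers` or editing a row of `features` changes only the image region that blob covers, which is the intuition behind the object-manipulation edits described in the post.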


