Related papers: Attribute2Image: Conditional Image Generation from…

Conditional Image Generation with PixelCNN Decoders

This work explores conditional image generation with a new image density model based on the PixelCNN architecture. The model can be conditioned on any vector, including descriptive labels or tags, or latent embeddings created by other…

Computer Vision and Pattern Recognition · Computer Science 2016-06-21 Aaron van den Oord, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, Koray Kavukcuoglu

Attribute-guided image generation from layout

Recent approaches have achieved great success in image generation from structured inputs, e.g., semantic segmentation, scene graph or layout. Although these methods allow specification of objects and their locations at image-level, they…

Computer Vision and Pattern Recognition · Computer Science 2020-08-28 Ke Ma, Bo Zhao, Leonid Sigal

Multi-View Data Generation Without View Supervision

The development of high-dimensional generative models has recently gained a great surge of interest with the introduction of variational auto-encoders and generative adversarial neural networks. Different variants have been proposed where…

Computer Vision and Pattern Recognition · Computer Science 2019-04-18 Mickaël Chen, Ludovic Denoyer, Thierry Artières

Shape-conditioned Image Generation by Learning Latent Appearance Representation from Unpaired Data

Conditional image generation is effective for diverse tasks including training data synthesis for learning-based computer vision. However, despite the recent advances in generative adversarial networks (GANs), it is still a challenging task…

Computer Vision and Pattern Recognition · Computer Science 2018-11-30 Yutaro Miyauchi, Yusuke Sugano, Yasuyuki Matsushita

Latent Constraints: Learning to Generate Conditionally from Unconditional Generative Models

Deep generative neural networks have proven effective at both conditional and unconditional modeling of complex data distributions. Conditional generation enables interactive control, but creating new controls often requires expensive…

Machine Learning · Computer Science 2017-12-25 Jesse Engel, Matthew Hoffman, Adam Roberts

Learning to Generate Images of Outdoor Scenes from Attributes and Semantic Layouts

Automatic image synthesis research has been rapidly growing with deep networks getting more and more expressive. In the last couple of years, we have observed images of digits, indoor scenes, birds, chairs, etc. being automatically…

Computer Vision and Pattern Recognition · Computer Science 2016-12-02 Levent Karacan, Zeynep Akata, Aykut Erdem, Erkut Erdem

3D-aware Image Generation and Editing with Multi-modal Conditions

3D-consistent image generation from a single 2D semantic label is an important and challenging research topic in computer graphics and computer vision. Although some related works have made great progress in this field, most of the existing…

Computer Vision and Pattern Recognition · Computer Science 2024-03-12 Bo Li, Yi-ke Li, Zhi-fen He, Bin Liu, Yun-Kun Lai

Learning to Manipulate Individual Objects in an Image

We describe a method to train a generative model with latent factors that are (approximately) independent and localized. This means that perturbing the latent variables affects only local regions of the synthesized image, corresponding to…

Computer Vision and Pattern Recognition · Computer Science 2020-04-14 Yanchao Yang, Yutong Chen, Stefano Soatto

Text2Layer: Layered Image Generation using Latent Diffusion Model

Layer compositing is one of the most popular image editing workflows among both amateurs and professionals. Motivated by the success of diffusion models, we explore layer compositing from a layered image generation perspective. Instead of…

Computer Vision and Pattern Recognition · Computer Science 2023-07-20 Xinyang Zhang, Wentian Zhao, Xin Lu, Jeff Chien

Disentangling Generative Factors in Natural Language with Discrete Variational Autoencoders

The ability of learning disentangled representations represents a major step for interpretable NLP systems as it allows latent linguistic features to be controlled. Most approaches to disentanglement rely on continuous variables, both for…

Computation and Language · Computer Science 2021-09-16 Giangiacomo Mercatali, André Freitas

Authoring image decompositions with generative models

We show how to extend traditional intrinsic image decompositions to incorporate further layers above albedo and shading. It is hard to obtain data to learn a multi-layer decomposition. Instead, we can learn to decompose an image into layers…

Computer Vision and Pattern Recognition · Computer Science 2016-12-06 Jason Rock, Theerasit Issaranon, Aditya Deshpande, David Forsyth

Efficient inference in occlusion-aware generative models of images

We present a generative model of images based on layering, in which image layers are individually generated, then composited from front to back. We are thus able to factor the appearance of an image into the appearance of individual objects…

Machine Learning · Computer Science 2016-02-17 Jonathan Huang, Kevin Murphy

Unpriortized Autoencoder For Image Generation

In this paper, we treat the image generation task using an autoencoder, a representative latent model. Unlike many studies regularizing the latent variable's distribution by assuming a manually specified prior, we approach the image…

Machine Learning · Computer Science 2021-08-27 Jaeyoung Yoo, Hojun Lee, Nojun Kwak

Visual Object Networks: Image Generation with Disentangled 3D Representation

Recent progress in deep generative models has led to tremendous breakthroughs in image generation. However, while existing models can synthesize photorealistic images, they lack an understanding of our underlying 3D world. We present a new…

Computer Vision and Pattern Recognition · Computer Science 2018-12-07 Jun-Yan Zhu, Zhoutong Zhang, Chengkai Zhang, Jiajun Wu, Antonio Torralba, Joshua B. Tenenbaum, William T. Freeman

Learning Disentangled Representations with Reference-Based Variational Autoencoders

Learning disentangled representations from visual data, where different high-level generative factors are independently encoded, is of importance for many computer vision tasks. Solving this problem, however, typically requires to…

Computer Vision and Pattern Recognition · Computer Science 2019-01-25 Adria Ruiz, Oriol Martinez, Xavier Binefa, Jakob Verbeek

Diverse Image Generation via Self-Conditioned GANs

We introduce a simple but effective unsupervised method for generating realistic and diverse images. We train a class-conditional GAN model without using manually annotated class labels. Instead, our model is conditional on labels…

Computer Vision and Pattern Recognition · Computer Science 2022-02-11 Steven Liu, Tongzhou Wang, David Bau, Jun-Yan Zhu, Antonio Torralba

Img2CAD: Conditioned 3D CAD Model Generation from Single Image with Structured Visual Geometry

In this paper, we propose Img2CAD, the first approach to our knowledge that uses 2D image inputs to generate CAD models with editable parameters. Unlike existing AI methods for 3D model generation using text or image inputs, which often rely on…

Computer Vision and Pattern Recognition · Computer Science 2024-10-07 Tianrun Chen, Chunan Yu, Yuanqi Hu, Jing Li, Tao Xu, Runlong Cao, Lanyun Zhu, Ying Zang, Yong Zhang, Zejian Li, Linyun Sun

Object-Centric Relational Representations for Image Generation

Conditioning image generation on specific features of the desired output is a key ingredient of modern generative models. However, existing approaches lack a general and unified way of representing structural and semantic conditioning at…

Computer Vision and Pattern Recognition · Computer Science 2024-07-08 Luca Butera, Andrea Cini, Alberto Ferrante, Cesare Alippi

A Variational U-Net for Conditional Appearance and Shape Generation

Deep generative models have demonstrated great performance in image synthesis. However, results deteriorate in case of spatial deformations, since they generate images of objects directly, rather than modeling the intricate interplay of…

Computer Vision and Pattern Recognition · Computer Science 2018-04-16 Patrick Esser, Ekaterina Sutter, Björn Ommer

Sound2Vision: Generating Diverse Visuals from Audio through Cross-Modal Latent Alignment

How does audio describe the world around us? In this work, we propose a method for generating images of visual scenes from diverse in-the-wild sounds. This cross-modal generation task is challenging due to the significant information gap…

Computer Vision and Pattern Recognition · Computer Science 2024-12-10 Kim Sung-Bin, Arda Senocak, Hyunwoo Ha, Tae-Hyun Oh