Related papers: Compositional Transformers for Scene Generation

Generative Adversarial Transformers

We introduce the GANformer, a novel and efficient type of transformer, and explore it for the task of visual generative modeling. The network employs a bipartite structure that enables long-range interactions across the image, while…

Computer Vision and Pattern Recognition · Computer Science 2022-03-30 Drew A. Hudson, C. Lawrence Zitnick

A Layer-Based Sequential Framework for Scene Generation with GANs

The visual world we sense, interpret and interact with every day is a complex composition of interleaved physical entities. Therefore, it is a very challenging task to generate vivid scenes of similar complexity using computers. In this work, we…

Computer Vision and Pattern Recognition · Computer Science 2019-02-05 Mehmet Ozgur Turkoglu, William Thong, Luuk Spreeuwers, Berkay Kicanaoglu

Text2Scene: Generating Compositional Scenes from Textual Descriptions

In this paper, we propose Text2Scene, a model that generates various forms of compositional scene representations from natural language descriptions. Unlike recent works, our method does NOT use Generative Adversarial Networks (GANs).…

Computer Vision and Pattern Recognition · Computer Science 2019-06-11 Fuwen Tan, Song Feng, Vicente Ordonez

GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields

Deep generative models allow for photorealistic image synthesis at high resolutions. But for many applications, this is not enough: content creation also needs to be controllable. While several recent works investigate how to disentangle…

Computer Vision and Pattern Recognition · Computer Science 2021-04-30 Michael Niemeyer, Andreas Geiger

gCoRF: Generative Compositional Radiance Fields

3D generative models of objects enable photorealistic image synthesis with 3D control. Existing methods model the scene as a global scene representation, ignoring the compositional aspect of the scene. Compositional reasoning can enable a…

Graphics · Computer Science 2022-11-01 Mallikarjun BR, Ayush Tewari, Xingang Pan, Mohamed Elgharib, Christian Theobalt

Compositional GAN: Learning Image-Conditional Binary Composition

Generative Adversarial Networks (GANs) can produce images of remarkable complexity and realism but are generally structured to sample from a single latent source ignoring the explicit spatial interaction between multiple entities that could…

Computer Vision and Pattern Recognition · Computer Science 2019-04-01 Samaneh Azadi, Deepak Pathak, Sayna Ebrahimi, Trevor Darrell

Compositional Scene Understanding through Inverse Generative Modeling

Generative models have demonstrated remarkable abilities in generating high-fidelity visual content. In this work, we explore how generative models can further be used not only to synthesize visual content but also to understand the…

Computer Vision and Pattern Recognition · Computer Science 2025-06-25 Yanbo Wang, Justin Dauwels, Yilun Du

Investigating Object Compositionality in Generative Adversarial Networks

Deep generative models seek to recover the process with which the observed data was generated. They may be used to synthesize new samples or to subsequently extract representations. Successful approaches in the domain of images are driven…

Computer Vision and Pattern Recognition · Computer Science 2020-07-27 Sjoerd van Steenkiste, Karol Kurach, Jürgen Schmidhuber, Sylvain Gelly

Iterative Scene Graph Generation with Generative Transformers

Scene graphs provide a rich, structured representation of a scene by encoding the entities (objects) and their spatial relationships in a graphical format. This representation has proven useful in several tasks, such as question answering,…

Computer Vision and Pattern Recognition · Computer Science 2022-12-01 Sanjoy Kundu, Sathyanarayanan N. Aakur

Learning to navigate image manifolds induced by generative adversarial networks for unsupervised video generation

In this work, we introduce a two-step framework for generative modeling of temporal data. Specifically, the generative adversarial networks (GANs) setting is employed to generate synthetic scenes of moving objects. To do so, we propose a…

Computer Vision and Pattern Recognition · Computer Science 2019-02-01 Isabela Albuquerque, João Monteiro, Tiago H. Falk

GENESIS: Generative Scene Inference and Sampling with Object-Centric Latent Representations

Generative latent-variable models are emerging as promising tools in robotics and reinforcement learning. Yet, even though tasks in these domains typically involve distinct objects, most state-of-the-art generative models do not explicitly…

Machine Learning · Computer Science 2020-11-24 Martin Engelcke, Adam R. Kosiorek, Oiwi Parker Jones, Ingmar Posner

Transforming Image Generation from Scene Graphs

Generating images from semantic visual knowledge is a challenging task that can be useful to condition the synthesis process in complex, subtle, and unambiguous ways, compared to alternatives such as class labels or text descriptions.…

Computer Vision and Pattern Recognition · Computer Science 2022-07-04 Renato Sortino, Simone Palazzo, Concetto Spampinato

Transformer-based Image Generation from Scene Graphs

Graph-structured scene descriptions can be efficiently used in generative models to control the composition of the generated image. Previous approaches are based on the combination of graph convolutional networks and adversarial methods for…

Computer Vision and Pattern Recognition · Computer Science 2023-03-09 Renato Sortino, Simone Palazzo, Concetto Spampinato

CC3D: Layout-Conditioned Generation of Compositional 3D Scenes

In this work, we introduce CC3D, a conditional generative model that synthesizes complex 3D scenes conditioned on 2D semantic scene layouts, trained using single-view images. Different from most existing 3D GANs that limit their…

Computer Vision and Pattern Recognition · Computer Science 2023-09-12 Sherwin Bahmani, Jeong Joon Park, Despoina Paschalidou, Xingguang Yan, Gordon Wetzstein, Leonidas Guibas, Andrea Tagliasacchi

A Generative Model for Volume Rendering

We present a technique to synthesize and analyze volume-rendered images using generative models. We use the Generative Adversarial Network (GAN) framework to compute a model from a large collection of volume renderings, conditioned on (1)…

Graphics · Computer Science 2019-07-18 Matthew Berger, Jixian Li, Joshua A. Levine

High-Resolution Complex Scene Synthesis with Transformers

The use of coarse-grained layouts for controllable synthesis of complex scene images via deep generative models has recently gained popularity. However, results of current approaches still fall short of their promise of high-resolution…

Computer Vision and Pattern Recognition · Computer Science 2021-05-14 Manuel Jahn, Robin Rombach, Björn Ommer

Styleformer: Transformer based Generative Adversarial Networks with Style Vector

We propose Styleformer, which is a style-based generator for GAN architecture, but a convolution-free transformer-based generator. In our paper, we explain how a transformer can generate high-quality images, overcoming the disadvantage that…

Computer Vision and Pattern Recognition · Computer Science 2022-04-06 Jeeseung Park, Younggeun Kim

Using latent space regression to analyze and leverage compositionality in GANs

In recent years, Generative Adversarial Networks have become ubiquitous in both research and public perception, but how GANs convert an unstructured latent code to a high quality output is still an open question. In this work, we…

Computer Vision and Pattern Recognition · Computer Science 2021-06-07 Lucy Chai, Jonas Wulff, Phillip Isola

Composing People Together: Iterative Pose-Image Generation for Multi-Person Interaction Scenes

Despite recent progress, text-to-image models still struggle to generate semantically diverse and compositionally accurate multi-person interaction scenes, often collapsing to repetitive layouts, stereotypical poses, and poorly grounded…

Computer Vision and Pattern Recognition · Computer Science 2026-05-25 Wenxuan Peng, Bharath Hariharan, Hadar Averbuch-Elor

Deep Generative Modeling for Scene Synthesis via Hybrid Representations

We present a deep generative scene modeling technique for indoor environments. Our goal is to train a generative model using a feed-forward neural network that maps a prior distribution (e.g., a normal distribution) to the distribution of…

Computer Vision and Pattern Recognition · Computer Science 2018-08-08 Zaiwei Zhang, Zhenpei Yang, Chongyang Ma, Linjie Luo, Alexander Huth, Etienne Vouga, Qixing Huang