Related papers: Attribute-guided image generation from layout

AttrLostGAN: Attribute Controlled Image Synthesis from Reconfigurable Layout and Style

Conditional image synthesis from layout has recently attracted much interest. Previous approaches condition the generator on object locations as well as class labels but lack fine-grained control over the diverse appearance aspects of…

Computer Vision and Pattern Recognition · Computer Science 2021-08-27 Stanislav Frolov, Avneesh Sharma, Jörn Hees, Tushar Karayil, Federico Raue, Andreas Dengel

InstanceGen: Image Generation with Instance-level Instructions

Despite rapid advancements in the capabilities of generative models, pretrained text-to-image models still struggle in capturing the semantics conveyed by complex prompts that compound multiple objects and instance-level attributes.…

Computer Vision and Pattern Recognition · Computer Science 2025-05-20 Etai Sella, Yanir Kleiman, Hadar Averbuch-Elor

Image Generation from Layout

Despite significant recent progress on generative models, controlled generation of images depicting multiple and complex object layouts is still a difficult problem. Among the core challenges are the diversity of appearance a given object…

Computer Vision and Pattern Recognition · Computer Science 2019-10-16 Bo Zhao, Lili Meng, Weidong Yin, Leonid Sigal

Attribute2Image: Conditional Image Generation from Visual Attributes

This paper investigates a novel problem of generating images from visual attributes. We model the image as a composite of foreground and background and develop a layered generative model with disentangled latent variables that can be…

Machine Learning · Computer Science 2016-10-11 Xinchen Yan, Jimei Yang, Kihyuk Sohn, Honglak Lee

Specifying Object Attributes and Relations in Interactive Scene Generation

We introduce a method for the generation of images from an input scene graph. The method separates between a layout embedding and an appearance embedding. The dual embedding leads to generated images that better match the scene graph, have…

Computer Vision and Pattern Recognition · Computer Science 2019-11-19 Oron Ashual, Lior Wolf

Generating Multiple Objects at Spatially Distinct Locations

Recent improvements to Generative Adversarial Networks (GANs) have made it possible to generate realistic images in high resolution based on natural language descriptions such as image captions. Furthermore, conditional GANs allow us to…

Computer Vision and Pattern Recognition · Computer Science 2019-01-04 Tobias Hinz, Stefan Heinrich, Stefan Wermter

Object-Centric Relational Representations for Image Generation

Conditioning image generation on specific features of the desired output is a key ingredient of modern generative models. However, existing approaches lack a general and unified way of representing structural and semantic conditioning at…

Computer Vision and Pattern Recognition · Computer Science 2024-07-08 Luca Butera, Andrea Cini, Alberto Ferrante, Cesare Alippi

Controlling Style and Semantics in Weakly-Supervised Image Generation

We propose a weakly-supervised approach for conditional image generation of complex scenes where a user has fine control over objects appearing in the scene. We exploit sparse semantic maps to control object shapes and classes, as well as…

Computer Vision and Pattern Recognition · Computer Science 2020-11-23 Dario Pavllo, Aurelien Lucchi, Thomas Hofmann

Exploiting Relationship for Complex-scene Image Generation

The significant progress on Generative Adversarial Networks (GANs) has facilitated realistic single-object image generation based on language input. However, complex-scene generation (with various interactions among multiple objects) still…

Computer Vision and Pattern Recognition · Computer Science 2021-04-02 Tianyu Hua, Hongdong Zheng, Yalong Bai, Wei Zhang, Xiao-Ping Zhang, Tao Mei

Attribute Alignment: Controlling Text Generation from Pre-trained Language Models

Large language models benefit from training with a large amount of unlabeled text, which gives them increasingly fluent and diverse generation capabilities. However, using these models for text generation that takes into account target…

Computation and Language · Computer Science 2021-09-16 Dian Yu, Zhou Yu, Kenji Sagae

Image Generation from Scene Graphs

To truly understand the visual world our models should be able not only to recognize images but also generate them. To this end, there has been exciting recent progress on generating images from natural language descriptions. These methods…

Computer Vision and Pattern Recognition · Computer Science 2018-04-06 Justin Johnson, Agrim Gupta, Li Fei-Fei

Shape-conditioned Image Generation by Learning Latent Appearance Representation from Unpaired Data

Conditional image generation is effective for diverse tasks including training data synthesis for learning-based computer vision. However, despite the recent advances in generative adversarial networks (GANs), it is still a challenging task…

Computer Vision and Pattern Recognition · Computer Science 2018-11-30 Yutaro Miyauchi, Yusuke Sugano, Yasuyuki Matsushita

Generating Compositional Scenes via Text-to-image RGBA Instance Generation

Text-to-image diffusion generative models can generate high quality images at the cost of tedious prompt engineering. Controllability can be improved by introducing layout conditioning, however existing methods lack layout editing ability…

Computer Vision and Pattern Recognition · Computer Science 2024-11-19 Alessandro Fontanella, Petru-Daniel Tudosiu, Yongxin Yang, Shifeng Zhang, Sarah Parisot

Sketch-Guided Scene Image Generation

Text-to-image models are showcasing the impressive ability to create high-quality and diverse generative images. Nevertheless, the transition from freehand sketches to complex scene images remains challenging using diffusion models. In this…

Computer Vision and Pattern Recognition · Computer Science 2024-07-10 Tianyu Zhang, Xiaoxuan Xie, Xusheng Du, Haoran Xie

Cluster-guided Image Synthesis with Unconditional Models

Generative Adversarial Networks (GANs) are the driving force behind the state-of-the-art in image generation. Despite their ability to synthesize high-resolution photo-realistic images, generating content with on-demand conditioning of…

Computer Vision and Pattern Recognition · Computer Science 2021-12-28 Markos Georgopoulos, James Oldfield, Grigorios G Chrysos, Yannis Panagakis

Semantic Palette: Guiding Scene Generation with Class Proportions

Despite the recent progress of generative adversarial networks (GANs) at synthesizing photo-realistic images, producing complex urban scenes remains a challenging problem. Previous works break down scene generation into two consecutive…

Computer Vision and Pattern Recognition · Computer Science 2021-06-04 Guillaume Le Moing, Tuan-Hung Vu, Himalaya Jain, Patrick Pérez, Matthieu Cord

Pose Guided Image Generation from Misaligned Sources via Residual Flow Based Correction

Generating new images with desired properties (e.g. new view/poses) from source images has been enthusiastically pursued recently, due to its wide range of potential applications. One way to ensure high-quality generation is to use multiple…

Computer Vision and Pattern Recognition · Computer Science 2022-02-03 Jiawei Lu, He Wang, Tianjia Shao, Yin Yang, Kun Zhou

Text2Street: Controllable Text-to-image Generation for Street Views

Text-to-image generation has made remarkable progress with the emergence of diffusion models. However, it is still a difficult task to generate images for street views based on text, mainly because the road topology of street scenes is…

Computer Vision and Pattern Recognition · Computer Science 2024-02-08 Jinming Su, Songen Gu, Yiting Duan, Xingyue Chen, Junfeng Luo

Camera Control for Text-to-Image Generation via Learning Viewpoint Tokens

Current text-to-image models struggle to provide precise camera control using natural language alone. In this work, we present a framework for precise camera control with global scene understanding in text-to-image generation by learning…

Computer Vision and Pattern Recognition · Computer Science 2026-04-23 Xinxuan Lu, Charless Fowlkes, Alexander C. Berg

Griffin: Generative Reference and Layout Guided Image Composition

Text-to-image models have achieved a level of realism that enables the generation of highly convincing images. However, text-based control can be a limiting factor when more explicit guidance is needed. Defining both the content and its…

Computer Vision and Pattern Recognition · Computer Science 2025-10-01 Aryan Mikaeili, Amirhossein Alimohammadi, Negar Hassanpour, Ali Mahdavi-Amiri, Andrea Tagliasacchi