Related papers: Text2Layer: Layered Image Generation using Latent …

LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model

Despite the success of generating high-quality images given any text prompts by diffusion-based generative models, prior works directly generate the entire images, but cannot provide object-wise manipulation capability. To support wider…

Computer Vision and Pattern Recognition · Computer Science 2024-03-19 Runhui Huang, Kaixin Cai, Jianhua Han, Xiaodan Liang, Renjing Pei, Guansong Lu, Songcen Xu, Wei Zhang, Hang Xu

From Inpainting to Layer Decomposition: Repurposing Generative Inpainting Models for Image Layer Decomposition

Images can be viewed as layered compositions, foreground objects over background, with potential occlusions. This layered representation enables independent editing of elements, offering greater flexibility for content creation. Despite the…

Computer Vision and Pattern Recognition · Computer Science 2026-03-25 Jingxi Chen, Yixiao Zhang, Xiaoye Qian, Zongxia Li, Cornelia Fermuller, Caren Chen, Yiannis Aloimonos

LayerFusion: Harmonized Multi-Layer Text-to-Image Generation with Generative Priors

Large-scale diffusion models have achieved remarkable success in generating high-quality images from textual descriptions, gaining popularity across various applications. However, the generation of layered content, such as transparent…

Computer Vision and Pattern Recognition · Computer Science 2024-12-06 Yusuf Dalva, Yijun Li, Qing Liu, Nanxuan Zhao, Jianming Zhang, Zhe Lin, Pinar Yanardag

LayerDiffusion: Layered Controlled Image Editing with Diffusion Models

Text-guided image editing has recently experienced rapid development. However, simultaneously performing multiple editing actions on a single image, such as background replacement and specific subject attribute changes, while maintaining…

Computer Vision and Pattern Recognition · Computer Science 2024-04-09 Pengzhi Li, Qinxuan Huang, Yikang Ding, Zhiheng Li

Generative Image Layer Decomposition with Visual Effects

Recent advancements in large generative models, particularly diffusion-based methods, have significantly enhanced the capabilities of image editing. However, achieving precise control over image composition tasks remains a challenge.…

Computer Vision and Pattern Recognition · Computer Science 2024-11-28 Jinrui Yang, Qing Liu, Yijun Li, Soo Ye Kim, Daniil Pakhomov, Mengwei Ren, Jianming Zhang, Zhe Lin, Cihang Xie, Yuyin Zhou

Transparent Image Layer Diffusion using Latent Transparency

We present LayerDiffuse, an approach enabling large-scale pretrained latent diffusion models to generate transparent images. The method allows generation of single transparent images or of multiple transparent layers. The method learns a…

Computer Vision and Pattern Recognition · Computer Science 2024-06-25 Lvmin Zhang, Maneesh Agrawala

LayeringDiff: Layered Image Synthesis via Generation, then Disassembly with Generative Knowledge

Layers have become indispensable tools for professional artists, allowing them to build a hierarchical structure that enables independent control over individual visual elements. In this paper, we propose LayeringDiff, a novel pipeline for…

Computer Vision and Pattern Recognition · Computer Science 2025-01-03 Kyoungkook Kang, Gyujin Sim, Geonung Kim, Donguk Kim, Seungho Nam, Sunghyun Cho

DreamLayer: Simultaneous Multi-Layer Generation via Diffusion Mode

Text-driven image generation using diffusion models has recently gained significant attention. To enable more flexible image manipulation and editing, recent research has expanded from single image generation to transparent layer generation…

Computer Vision and Pattern Recognition · Computer Science 2025-03-18 Junjia Huang, Pengxiang Yan, Jinhang Cai, Jiyang Liu, Zhao Wang, Yitong Wang, Xinglong Wu, Guanbin Li

PSDiffusion: Harmonized Multi-Layer Image Generation via Layout and Appearance Alignment

Transparent image layer generation plays a significant role in digital art and design workflows. Existing methods typically decompose transparent layers from a single RGB image using a set of tools or generate multiple transparent layers…

Computer Vision and Pattern Recognition · Computer Science 2025-11-11 Dingbang Huang, Wenbo Li, Yifei Zhao, Xinyu Pan, Chun Wang, Yanhong Zeng, Bo Dai

Interactive Fashion Content Generation Using LLMs and Latent Diffusion Models

Fashionable image generation aims to synthesize images of diverse fashion prevalent around the globe, helping fashion designers in real-time visualization by giving them a basic customized structure of how a specific design preference would…

Computer Vision and Pattern Recognition · Computer Science 2023-06-14 Krishna Sri Ipsit Mantri, Nevasini Sasikumar

Boosting Generative Image Modeling via Joint Image-Feature Synthesis

Latent diffusion models (LDMs) dominate high-quality image generation, yet integrating representation learning with generative modeling remains a challenge. We introduce a novel generative image modeling framework that seamlessly bridges…

Computer Vision and Pattern Recognition · Computer Science 2026-01-23 Theodoros Kouzelis, Efstathios Karypidis, Ioannis Kakogeorgiou, Spyros Gidaris, Nikos Komodakis

Blended Latent Diffusion

The tremendous progress in neural image generation, coupled with the emergence of seemingly omnipotent vision-language models has finally enabled text-based interfaces for creating and editing images. Handling generic images requires a…

Computer Vision and Pattern Recognition · Computer Science 2023-07-27 Omri Avrahami, Ohad Fried, Dani Lischinski

High-Resolution Image Synthesis with Latent Diffusion Models

By decomposing the image formation process into a sequential application of denoising autoencoders, diffusion models (DMs) achieve state-of-the-art synthesis results on image data and beyond. Additionally, their formulation allows for a…

Computer Vision and Pattern Recognition · Computer Science 2022-04-14 Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer

Text-driven Visual Synthesis with Latent Diffusion Prior

There has been tremendous progress in large-scale text-to-image synthesis driven by diffusion models enabling versatile downstream applications such as 3D object synthesis from texts, image editing, and customized generation. We present a…

Computer Vision and Pattern Recognition · Computer Science 2023-04-05 Ting-Hsuan Liao, Songwei Ge, Yiran Xu, Yao-Chih Lee, Badour AlBahar, Jia-Bin Huang

LaMD: Latent Motion Diffusion for Image-Conditional Video Generation

The video generation field has witnessed rapid improvements with the introduction of recent diffusion models. While these models have successfully enhanced appearance quality, they still face challenges in generating coherent and natural…

Computer Vision and Pattern Recognition · Computer Science 2025-04-21 Yaosi Hu, Zhenzhong Chen, Chong Luo

NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models

Automatically generating high-quality real world 3D scenes is of enormous interest for applications such as virtual reality and robotics simulation. Towards this goal, we introduce NeuralField-LDM, a generative model capable of synthesizing…

Computer Vision and Pattern Recognition · Computer Science 2023-04-20 Seung Wook Kim, Bradley Brown, Kangxue Yin, Karsten Kreis, Katja Schwarz, Daiqing Li, Robin Rombach, Antonio Torralba, Sanja Fidler

3D-LDM: Neural Implicit 3D Shape Generation with Latent Diffusion Models

Diffusion models have shown great promise for image generation, beating GANs in terms of generation diversity, with comparable image quality. However, their application to 3D shapes has been limited to point or voxel representations that…

Computer Vision and Pattern Recognition · Computer Science 2022-12-16 Gimin Nam, Mariem Khlifi, Andrew Rodriguez, Alberto Tono, Linqi Zhou, Paul Guerrero

Move Anything with Layered Scene Diffusion

Diffusion models generate images with an unprecedented level of quality, but how can we freely rearrange image layouts? Recent works generate controllable scenes via learning spatially disentangled latent codes, but these methods do not…

Computer Vision and Pattern Recognition · Computer Science 2024-04-11 Jiawei Ren, Mengmeng Xu, Jui-Chieh Wu, Ziwei Liu, Tao Xiang, Antoine Toisoul

Thinking Outside the BBox: Unconstrained Generative Object Compositing

Compositing an object into an image involves multiple non-trivial sub-tasks such as object placement and scaling, color/lighting harmonization, viewpoint/geometry adjustment, and shadow/reflection generation. Recent generative image…

Computer Vision and Pattern Recognition · Computer Science 2024-09-12 Gemma Canet Tarrés, Zhe Lin, Zhifei Zhang, Jianming Zhang, Yizhi Song, Dan Ruta, Andrew Gilbert, John Collomosse, Soo Ye Kim

Text-to-image Diffusion Models in Generative AI: A Survey

This survey reviews the progress of diffusion models in generating images from text, i.e., text-to-image diffusion models. As a self-contained work, this survey starts with a brief introduction of how diffusion models work for…

Computer Vision and Pattern Recognition · Computer Science 2024-11-11 Chenshuang Zhang, Chaoning Zhang, Mengchun Zhang, In So Kweon, Junmo Kim