Related papers: LayerDiff: Exploring Text-guided Multi-layered Com…

LayeringDiff: Layered Image Synthesis via Generation, then Disassembly with Generative Knowledge

Layers have become indispensable tools for professional artists, allowing them to build a hierarchical structure that enables independent control over individual visual elements. In this paper, we propose LayeringDiff, a novel pipeline for…

Computer Vision and Pattern Recognition · Computer Science 2025-01-03 Kyoungkook Kang, Gyujin Sim, Geonung Kim, Donguk Kim, Seungho Nam, Sunghyun Cho

Text2Layer: Layered Image Generation using Latent Diffusion Model

Layer compositing is one of the most popular image editing workflows among both amateurs and professionals. Motivated by the success of diffusion models, we explore layer compositing from a layered image generation perspective. Instead of…

Computer Vision and Pattern Recognition · Computer Science 2023-07-20 Xinyang Zhang, Wentian Zhao, Xin Lu, Jeff Chien

DreamLayer: Simultaneous Multi-Layer Generation via Diffusion Model

Text-driven image generation using diffusion models has recently gained significant attention. To enable more flexible image manipulation and editing, recent research has expanded from single image generation to transparent layer generation…

Computer Vision and Pattern Recognition · Computer Science 2025-03-18 Junjia Huang, Pengxiang Yan, Jinhang Cai, Jiyang Liu, Zhao Wang, Yitong Wang, Xinglong Wu, Guanbin Li

LayerDiffusion: Layered Controlled Image Editing with Diffusion Models

Text-guided image editing has recently experienced rapid development. However, simultaneously performing multiple editing actions on a single image, such as background replacement and specific subject attribute changes, while maintaining…

Computer Vision and Pattern Recognition · Computer Science 2024-04-09 Pengzhi Li, Qinxuan Huang, Yikang Ding, Zhiheng Li

PSDiffusion: Harmonized Multi-Layer Image Generation via Layout and Appearance Alignment

Transparent image layer generation plays a significant role in digital art and design workflows. Existing methods typically decompose transparent layers from a single RGB image using a set of tools or generate multiple transparent layers…

Computer Vision and Pattern Recognition · Computer Science 2025-11-11 Dingbang Huang, Wenbo Li, Yifei Zhao, Xinyu Pan, Chun Wang, Yanhong Zeng, Bo Dai

Generative Image Layer Decomposition with Visual Effects

Recent advancements in large generative models, particularly diffusion-based methods, have significantly enhanced the capabilities of image editing. However, achieving precise control over image composition tasks remains a challenge.…

Computer Vision and Pattern Recognition · Computer Science 2024-11-28 Jinrui Yang, Qing Liu, Yijun Li, Soo Ye Kim, Daniil Pakhomov, Mengwei Ren, Jianming Zhang, Zhe Lin, Cihang Xie, Yuyin Zhou

eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers

Large-scale diffusion-based generative models have led to breakthroughs in text-conditioned high-resolution image synthesis. Starting from random noise, such text-to-image diffusion models gradually synthesize images in an iterative fashion…

Computer Vision and Pattern Recognition · Computer Science 2023-03-15 Yogesh Balaji, Seungjun Nah, Xun Huang, Arash Vahdat, Jiaming Song, Qinsheng Zhang, Karsten Kreis, Miika Aittala, Timo Aila, Samuli Laine, Bryan Catanzaro, Tero Karras, Ming-Yu Liu

LayerFusion: Harmonized Multi-Layer Text-to-Image Generation with Generative Priors

Large-scale diffusion models have achieved remarkable success in generating high-quality images from textual descriptions, gaining popularity across various applications. However, the generation of layered content, such as transparent…

Computer Vision and Pattern Recognition · Computer Science 2024-12-06 Yusuf Dalva, Yijun Li, Qing Liu, Nanxuan Zhao, Jianming Zhang, Zhe Lin, Pinar Yanardag

Transparent Image Layer Diffusion using Latent Transparency

We present LayerDiffuse, an approach enabling large-scale pretrained latent diffusion models to generate transparent images. The method allows generation of single transparent images or of multiple transparent layers. The method learns a…

Computer Vision and Pattern Recognition · Computer Science 2024-06-25 Lvmin Zhang, Maneesh Agrawala

Layered Rendering Diffusion Model for Controllable Zero-Shot Image Synthesis

This paper introduces innovative solutions to enhance spatial controllability in diffusion models reliant on text queries. We first introduce vision guidance as a foundational spatial cue within the perturbed distribution. This…

Computer Vision and Pattern Recognition · Computer Science 2024-12-02 Zipeng Qi, Guoxi Huang, Chenyang Liu, Fei Ye

MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation

Recent advances in text-to-image generation with diffusion models present transformative capabilities in image quality. However, user controllability of the generated image, and fast adaptation to new tasks still remains an open challenge,…

Computer Vision and Pattern Recognition · Computer Science 2023-02-17 Omer Bar-Tal, Lior Yariv, Yaron Lipman, Tali Dekel

ChangeDiff: A Multi-Temporal Change Detection Data Generator with Flexible Text Prompts via Diffusion Model

Data-driven deep learning models have enabled tremendous progress in change detection (CD) with the support of pixel-level annotations. However, collecting diverse data and manually annotating them is costly, laborious, and…

Computer Vision and Pattern Recognition · Computer Science 2024-12-23 Qi Zang, Jiayi Yang, Shuang Wang, Dong Zhao, Wenjun Yi, Zhun Zhong

DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability

Recently, large-scale diffusion models, e.g., Stable diffusion and DallE2, have shown remarkable results on image synthesis. On the other hand, large-scale cross-modal pre-trained models (e.g., CLIP, ALIGN, and FILIP) are competent for…

Computer Vision and Pattern Recognition · Computer Science 2023-08-21 Runhui Huang, Jianhua Han, Guansong Lu, Xiaodan Liang, Yihan Zeng, Wei Zhang, Hang Xu

PathDiff: Histopathology Image Synthesis with Unpaired Text and Mask Conditions

Diffusion-based generative models have shown promise in synthesizing histopathology images to address data scarcity caused by privacy constraints. Diagnostic text reports provide high-level semantic descriptions, and masks offer…

Computer Vision and Pattern Recognition · Computer Science 2025-07-01 Mahesh Bhosale, Abdul Wasi, Yuanhao Zhai, Yunjie Tian, Samuel Border, Nan Xi, Pinaki Sarder, Junsong Yuan, David Doermann, Xuan Gong

DiLaDiff: Distilled Latent-Augmented Diffusion for Language Modeling

Diffusion language models intrinsically fail to capture correlations between decoded tokens, which leads to a harsh trade-off between sampling quality and throughput. To solve this issue, we propose DiLaDiff, a variant of masked diffusion…

Machine Learning · Computer Science 2026-05-25 Jean-Marie Lemercier, Tomas Geffner, Karsten Kreis, Morteza Mardani, Arash Vahdat, Ante Jukić

Boosting Generative Image Modeling via Joint Image-Feature Synthesis

Latent diffusion models (LDMs) dominate high-quality image generation, yet integrating representation learning with generative modeling remains a challenge. We introduce a novel generative image modeling framework that seamlessly bridges…

Computer Vision and Pattern Recognition · Computer Science 2026-01-23 Theodoros Kouzelis, Efstathios Karypidis, Ioannis Kakogeorgiou, Spyros Gidaris, Nikos Komodakis

Draw Your Art Dream: Diverse Digital Art Synthesis with Multimodal Guided Diffusion

Digital art synthesis is receiving increasing attention in the multimedia community because of engaging the public with art effectively. Current digital art synthesis methods usually use single-modality inputs as guidance, thereby limiting…

Computer Vision and Pattern Recognition · Computer Science 2022-09-29 Nisha Huang, Fan Tang, Weiming Dong, Changsheng Xu

Interactive Fashion Content Generation Using LLMs and Latent Diffusion Models

Fashionable image generation aims to synthesize images of diverse fashion prevalent around the globe, helping fashion designers in real-time visualization by giving them a basic customized structure of how a specific design preference would…

Computer Vision and Pattern Recognition · Computer Science 2023-06-14 Krishna Sri Ipsit Mantri, Nevasini Sasikumar

Collaborative Diffusion for Multi-Modal Face Generation and Editing

Diffusion models arise as a powerful generative tool recently. Despite the great progress, existing diffusion models mainly focus on uni-modal control, i.e., the diffusion process is driven by only one modality of condition. To further…

Computer Vision and Pattern Recognition · Computer Science 2023-04-21 Ziqi Huang, Kelvin C. K. Chan, Yuming Jiang, Ziwei Liu

DiffColor: Toward High Fidelity Text-Guided Image Colorization with Diffusion Models

Recent data-driven image colorization methods have enabled automatic or reference-based colorization, while still suffering from unsatisfactory and inaccurate object-level color control. To address these issues, we propose a new method…

Computer Vision and Pattern Recognition · Computer Science 2023-08-04 Jianxin Lin, Peng Xiao, Yijun Wang, Rongju Zhang, Xiangxiang Zeng