Related papers: Referring Layer Decomposition

Controllable Layer Decomposition for Reversible Multi-Layer Image Generation

This work presents Controllable Layer Decomposition (CLD), a method for achieving fine-grained and controllable multi-layer separation of raster images. In practical workflows, designers typically generate and edit each RGBA layer…

Graphics · Computer Science 2025-11-26 Zihao Liu, Zunnan Xu, Shi Shu, Jun Zhou, Ruicheng Zhang, Zhenchao Tang, Xiu Li

RevealLayer: Disentangling Hidden and Visible Layers via Occlusion-Aware Image Decomposition

Recent diffusion-based approaches have made substantial progress in image layer decomposition. However, accurately decomposing complex natural images remains challenging due to difficulties in occlusion completion, robust layer…

Computer Vision and Pattern Recognition · Computer Science 2026-05-13 Binhao Wang, Shihao Zhao, Bo Cheng, Qiuyu Ji, Yuhang Ma, Liebucha Wu, Shanyuan Liu, Dawei Leng, Yuhui Yin

Object-Driven Multi-Layer Scene Decomposition From a Single Image

We present a method that tackles the challenge of predicting color and depth behind the visible content of an image. Our approach aims at building up a Layered Depth Image (LDI) from a single RGB input, which is an efficient representation…

Computer Vision and Pattern Recognition · Computer Science 2019-08-27 Helisa Dhamo, Nassir Navab, Federico Tombari

Generative Image Layer Decomposition with Visual Effects

Recent advancements in large generative models, particularly diffusion-based methods, have significantly enhanced the capabilities of image editing. However, achieving precise control over image composition tasks remains a challenge.…

Computer Vision and Pattern Recognition · Computer Science 2024-11-28 Jinrui Yang, Qing Liu, Yijun Li, Soo Ye Kim, Daniil Pakhomov, Mengwei Ren, Jianming Zhang, Zhe Lin, Cihang Xie, Yuyin Zhou

LayerD: Decomposing Raster Graphic Designs into Layers

Designers craft and edit graphic designs in a layer representation, but layer-based editing becomes impossible once composited into a raster image. In this work, we propose LayerD, a method to decompose raster graphic designs into layers…

Graphics · Computer Science 2025-09-30 Tomoyuki Suzuki, Kang-Jun Liu, Naoto Inoue, Kota Yamaguchi

Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition

Recent visual generative models often struggle with consistency during image editing due to the entangled nature of raster images, where all visual content is fused into a single canvas. In contrast, professional design tools employ layered…

Computer Vision and Pattern Recognition · Computer Science 2025-12-18 Shengming Yin, Zekai Zhang, Zecheng Tang, Kaiyuan Gao, Xiao Xu, Kun Yan, Jiahao Li, Yilei Chen, Yuxiang Chen, Heung-Yeung Shum, Lionel M. Ni, Jingren Zhou, Junyang Lin, Chenfei Wu

LEAD: Layer-wise Expert-aligned Decoding for Faithful Radiology Report Generation

Radiology Report Generation (RRG) aims to produce accurate and coherent diagnostics from medical images. Although large vision language models (LVLM) improve report fluency and accuracy, they exhibit hallucinations, generating plausible yet…

Computation and Language · Computer Science 2026-02-05 Ruixiao Yang, Yuanhe Tian, Xu Yang, Huiqi Li, Yan Song

LGD: Leveraging Generative Descriptions for Zero-Shot Referring Image Segmentation

Zero-shot referring image segmentation aims to locate and segment the target region based on a referring expression, with the primary challenge of aligning and matching semantics across visual and textual modalities without training.…

Computer Vision and Pattern Recognition · Computer Science 2025-05-02 Jiachen Li, Qing Xie, Renshu Gu, Jinyu Xu, Yongjian Liu, Xiaohan Yu

Zero-shot Visual Relation Detection via Composite Visual Cues from Large Language Models

Pretrained vision-language models, such as CLIP, have demonstrated strong generalization capabilities, making them promising tools in the realm of zero-shot visual recognition. Visual relation detection (VRD) is a typical task that…

Computer Vision and Pattern Recognition · Computer Science 2023-11-29 Lin Li, Jun Xiao, Guikun Chen, Jian Shao, Yueting Zhuang, Long Chen

HyperNVD: Accelerating Neural Video Decomposition via Hypernetworks

Decomposing a video into a layer-based representation is crucial for easy video editing for the creative industries, as it enables independent editing of specific layers. Existing video-layer decomposition models rely on implicit neural…

Computer Vision and Pattern Recognition · Computer Science 2025-03-24 Maria Pilligua, Danna Xue, Javier Vazquez-Corral

Recursive Decomposition with Dependencies for Generic Divide-and-Conquer Reasoning

Reasoning tasks are crucial in many domains, especially in science and engineering. Although large language models (LLMs) have made progress in reasoning tasks using techniques such as chain-of-thought and least-to-most prompting, these…

Artificial Intelligence · Computer Science 2025-05-06 Sergio Hernández-Gutiérrez, Minttu Alakuijala, Alexander V. Nikitin, Pekka Marttinen

DPL: Decoupled Prompt Learning for Vision-Language Models

Prompt learning has emerged as an efficient and effective approach for transferring foundational Vision-Language Models (e.g., CLIP) to downstream tasks. However, current methods tend to overfit to seen categories, thereby limiting their…

Computer Vision and Pattern Recognition · Computer Science 2023-08-22 Chen Xu, Yuhan Zhu, Guozhen Zhang, Haocheng Shen, Yixuan Liao, Xiaoxin Chen, Gangshan Wu, Limin Wang

NeRD: Neural Reflectance Decomposition from Image Collections

Decomposing a scene into its shape, reflectance, and illumination is a challenging but important problem in computer vision and graphics. This problem is inherently more challenging when the illumination is not a single light source under…

Computer Vision and Pattern Recognition · Computer Science 2021-08-27 Mark Boss, Raphael Braun, Varun Jampani, Jonathan T. Barron, Ce Liu, Hendrik P. A. Lensch

DiffDecompose: Layer-Wise Decomposition of Alpha-Composited Images via Diffusion Transformers

Diffusion models have recently motivated great success in many generation tasks like object removal. Nevertheless, existing image decomposition methods struggle to disentangle semi-transparent or transparent layer occlusions due to mask…

Computer Vision and Pattern Recognition · Computer Science 2025-09-03 Zitong Wang, Hang Zhao, Qianyu Zhou, Xuequan Lu, Xiangtai Li, Yiren Song

DepR: Depth Guided Single-view Scene Reconstruction with Instance-level Diffusion

We propose DepR, a depth-guided single-view scene reconstruction framework that integrates instance-level diffusion within a compositional paradigm. Instead of reconstructing the entire scene holistically, DepR generates individual objects…

Computer Vision and Pattern Recognition · Computer Science 2025-07-31 Qingcheng Zhao, Xiang Zhang, Haiyang Xu, Zeyuan Chen, Jianwen Xie, Yuan Gao, Zhuowen Tu

Stable-Layers: Fine-Tuning Image Layer Decomposition Models with VLM-Scored Reinforcement Learning

We present Stable-Layers, a reinforcement learning framework that eliminates the need for paired supervision by fine-tuning a pretrained layer decomposition model using only feedback from a vision-language model (VLM). Starting from…

Computer Vision and Pattern Recognition · Computer Science 2026-05-29 Ciara Rowles, Reshinth Adithyan, Nikhil Pinnaparaju, Vikram Voleti, Mark Boss

Referring Change Detection in Remote Sensing Imagery

Change detection in remote sensing imagery is essential for applications such as urban planning, environmental monitoring, and disaster management. Traditional change detection methods typically identify all changes between two temporal…

Computer Vision and Pattern Recognition · Computer Science 2025-12-15 Yilmaz Korkmaz, Jay N. Paranjape, Celso M. de Melo, Vishal M. Patel

RepLDM: Reprogramming Pretrained Latent Diffusion Models for High-Quality, High-Efficiency, High-Resolution Image Generation

While latent diffusion models (LDMs), such as Stable Diffusion, are designed for high-resolution (HR) image generation, they often struggle with significant structural distortions when generating images at resolutions higher than their…

Computer Vision and Pattern Recognition · Computer Science 2025-12-09 Boyuan Cao, Jiaxin Ye, Yujie Wei, Hongming Shan

Retrieval-augmented Decoding for Improving Truthfulness in Open-ended Generation

Ensuring truthfulness in large language models (LLMs) remains a critical challenge for reliable text generation. While supervised fine-tuning and reinforcement learning with human feedback have shown promise, they require a substantial…

Machine Learning · Computer Science 2026-03-17 Manh Nguyen, Sunil Gupta, Hung Le

Diff-Restorer: Unleashing Visual Prompts for Diffusion-based Universal Image Restoration

Image restoration is a classic low-level problem aimed at recovering high-quality images from low-quality images with various degradations such as blur, noise, rain, haze, etc. However, due to the inherent complexity and non-uniqueness of…

Computer Vision and Pattern Recognition · Computer Science 2024-07-08 Yuhong Zhang, Hengsheng Zhang, Xinning Chai, Zhengxue Cheng, Rong Xie, Li Song, Wenjun Zhang