Related papers: DiffPop: Plausibility-Guided Object Placement Diff…

OPA: Object Placement Assessment Dataset

Image composition aims to generate a realistic composite image by inserting an object from one image into another background image, where the placement (e.g., location, size, occlusion) of the inserted object may be unreasonable, which would…

Computer Vision and Pattern Recognition · Computer Science 2022-06-22 Liu Liu, Zhenchen Liu, Bo Zhang, Jiangtong Li, Li Niu, Qingyang Liu, Liqing Zhang

DiffPose: SpatioTemporal Diffusion Model for Video-Based Human Pose Estimation

Denoising diffusion probabilistic models that were initially proposed for realistic image generation have recently shown success in various perception tasks (e.g., object detection and image segmentation) and are increasingly gaining…

Computer Vision and Pattern Recognition · Computer Science 2023-08-08 Runyang Feng, Yixing Gao, Tze Ho Elden Tse, Xueqing Ma, Hyung Jin Chang

Diffusion Model for Camouflaged Object Detection

Camouflaged object detection is a challenging task that aims to identify objects that are highly similar to their background. Motivated by the powerful noise-to-image denoising capability of denoising diffusion models, we propose a…

Computer Vision and Pattern Recognition · Computer Science 2023-08-08 Zhennan Chen, Rongrong Gao, Tian-Zhu Xiang, Fan Lin

Flexible Geometric Guidance for Probabilistic Human Pose Estimation with Diffusion Models

3D human pose estimation from 2D images is a challenging problem due to depth ambiguity and occlusion. Because of these challenges, the task is underdetermined: there exist multiple (possibly infinitely many) poses that are plausible…

Computer Vision and Pattern Recognition · Computer Science 2026-02-04 Francis Snelgar, Ming Xu, Stephen Gould, Liang Zheng, Akshay Asthana

HiddenObjects: Scalable Diffusion-Distilled Spatial Priors for Object Placement

We propose a method to learn explicit, class-conditioned spatial priors for object placement in natural scenes by distilling the implicit placement knowledge encoded in text-conditioned diffusion models. Prior work relies either on manually…

Computer Vision and Pattern Recognition · Computer Science 2026-04-14 Marco Schouten, Ioannis Siglidis, Serge Belongie, Dim P. Papadopoulos

Person Image Synthesis via Denoising Diffusion Model

The pose-guided person image generation task requires synthesizing photorealistic images of humans in arbitrary poses. The existing approaches use generative adversarial networks that do not necessarily maintain realistic textures or need…

Computer Vision and Pattern Recognition · Computer Science 2023-03-02 Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Jorma Laaksonen, Mubarak Shah, Fahad Shahbaz Khan

pOps: Photo-Inspired Diffusion Operators

Text-guided image generation enables the creation of visual content from textual descriptions. However, certain visual concepts cannot be effectively conveyed through language alone. This has sparked a renewed interest in utilizing the CLIP…

Computer Vision and Pattern Recognition · Computer Science 2024-06-04 Elad Richardson, Yuval Alaluf, Ali Mahdavi-Amiri, Daniel Cohen-Or

Advancing Image Classification with Discrete Diffusion Classification Modeling

Image classification is a well-studied task in computer vision, and yet it remains challenging under high-uncertainty conditions, such as when input images are corrupted or training data are limited. Conventional classification approaches…

Computer Vision and Pattern Recognition · Computer Science 2025-11-26 Omer Belhasin, Shelly Golan, Ran El-Yaniv, Michael Elad

DAG: Depth-Aware Guidance with Denoising Diffusion Probabilistic Models

Generative models have recently undergone significant advancement due to diffusion models. The success of these models can often be attributed to their use of guidance techniques, such as classifier or classifier-free guidance, which…

Computer Vision and Pattern Recognition · Computer Science 2023-01-31 Gyeongnyeon Kim, Wooseok Jang, Gyuseong Lee, Susung Hong, Junyoung Seo, Seungryong Kim

DiffuSAM: Diffusion Guided Zero-Shot Object Grounding for Remote Sensing Imagery

Diffusion models have emerged as powerful tools for a wide range of vision tasks, including text-guided image generation and editing. In this work, we explore their potential for object grounding in remote sensing imagery. We propose a…

Computer Vision and Pattern Recognition · Computer Science 2026-04-21 Geet Sethi, Panav Shah, Ashutosh Gandhe, Soumitra Darshan Nayak

DiffUHaul: A Training-Free Method for Object Dragging in Images

Text-to-image diffusion models have proven effective for solving many image editing tasks. However, the seemingly straightforward task of seamlessly relocating objects within a scene remains surprisingly challenging. Existing methods…

Computer Vision and Pattern Recognition · Computer Science 2024-12-17 Omri Avrahami, Rinon Gal, Gal Chechik, Ohad Fried, Dani Lischinski, Arash Vahdat, Weili Nie

DDP: Diffusion Model for Dense Visual Prediction

We propose a simple, efficient, yet powerful framework for dense visual predictions based on the conditional diffusion pipeline. Our approach follows a "noise-to-map" generative paradigm for prediction by progressively removing noise from a…

Computer Vision and Pattern Recognition · Computer Science 2023-05-16 Yuanfeng Ji, Zhe Chen, Enze Xie, Lanqing Hong, Xihui Liu, Zhaoqiang Liu, Tong Lu, Zhenguo Li, Ping Luo

Imagining the Unseen: Generative Location Modeling for Object Placement

Location modeling, or determining where non-existing objects could feasibly appear in a scene, has the potential to benefit numerous computer vision tasks, from automatic object insertion to scene creation in virtual reality. Yet, this…

Computer Vision and Pattern Recognition · Computer Science 2025-10-08 Jooyeol Yun, Davide Abati, Mohamed Omran, Jaegul Choo, Amirhossein Habibian, Auke Wiggers

RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models

Diffusion models have achieved remarkable advancements in text-to-image generation. However, existing models still have many difficulties when faced with multiple-object compositional generation. In this paper, we propose RealCompo, a new…

Computer Vision and Pattern Recognition · Computer Science 2024-10-15 Xinchen Zhang, Ling Yang, Yaqi Cai, Zhaochen Yu, Kai-Ni Wang, Jiake Xie, Ye Tian, Minkai Xu, Yong Tang, Yujiu Yang, Bin Cui

6D-Diff: A Keypoint Diffusion Framework for 6D Object Pose Estimation

Estimating the 6D object pose from a single RGB image often involves noise and indeterminacy due to challenges such as occlusions and cluttered backgrounds. Meanwhile, diffusion models have shown appealing performance in generating…

Computer Vision and Pattern Recognition · Computer Science 2024-03-25 Li Xu, Haoxuan Qu, Yujun Cai, Jun Liu

DiffPlace: Street View Generation via Place-Controllable Diffusion Model Enhancing Place Recognition

Generative models have advanced significantly in realistic image synthesis, with diffusion models excelling in quality and stability. Recent multi-view diffusion models improve 3D-aware street view generation, but they struggle to produce…

Computer Vision and Pattern Recognition · Computer Science 2026-02-13 Ji Li, Zhiwei Li, Shihao Li, Zhenjiang Yu, Boyang Wang, Haiou Liu

ObjectComposer: Consistent Generation of Multiple Objects Without Fine-tuning

Recent text-to-image generative models can generate high-fidelity images from text prompts. However, these models struggle to consistently generate the same objects in different contexts with the same appearance. Consistent object…

Computer Vision and Pattern Recognition · Computer Science 2023-10-12 Alec Helbling, Evan Montoya, Duen Horng Chau

Denoising Functional Maps: Diffusion Models for Shape Correspondence

Estimating correspondences between pairs of deformable shapes remains a challenging problem. Despite substantial progress, existing methods lack broad generalization capabilities and require category-specific training data. To address these…

Computer Vision and Pattern Recognition · Computer Science 2025-04-03 Aleksei Zhuravlev, Zorah Lähner, Vladislav Golyanik

Object Pose Estimation via the Aggregation of Diffusion Features

Estimating the pose of objects from images is a crucial task of 3D scene understanding, and recent approaches have shown promising results on very large benchmarks. However, these methods experience a significant performance drop when…

Computer Vision and Pattern Recognition · Computer Science 2024-10-21 Tianfu Wang, Guosheng Hu, Hongguang Wang

DiffMOD: Progressive Diffusion Point Denoising for Moving Object Detection in Remote Sensing

Moving object detection (MOD) in remote sensing is significantly challenged by low resolution, extremely small object sizes, and complex noise interference. Current deep learning-based MOD methods rely on probability density estimation,…

Computer Vision and Pattern Recognition · Computer Science 2025-04-15 Jinyue Zhang, Xiangrong Zhang, Zhongjian Huang, Tianyang Zhang, Yifei Jiang, Licheng Jiao