Related papers: A Diffusion-Based Framework for Occluded Object Mo…

Diffusion Model for Camouflaged Object Detection

Camouflaged object detection is a challenging task that aims to identify objects that are highly similar to their background. Due to the powerful noise-to-image denoising capability of denoising diffusion models, in this paper, we propose a…

Computer Vision and Pattern Recognition · Computer Science 2023-08-08 Zhennan Chen, Rongrong Gao, Tian-Zhu Xiang, Fan Lin

D-Feat Occlusions: Diffusion Features for Robustness to Partial Visual Occlusions in Object Recognition

Applications of diffusion models for visual tasks have been quite noteworthy. This paper targets making classification models more robust to occlusions for the task of object recognition by proposing a pipeline that utilizes a frozen…

Computer Vision and Pattern Recognition · Computer Science 2025-04-14 Rupayan Mallick, Sibo Dong, Nataniel Ruiz, Sarah Adel Bargal

O$^2$-Recon: Completing 3D Reconstruction of Occluded Objects in the Scene with a Pre-trained 2D Diffusion Model

Occlusion is a common issue in 3D reconstruction from RGB-D videos, often blocking the complete reconstruction of objects and presenting an ongoing problem. In this paper, we propose a novel framework, empowered by a 2D diffusion-based…

Computer Vision and Pattern Recognition · Computer Science 2024-03-20 Yubin Hu, Sheng Ye, Wang Zhao, Matthieu Lin, Yuze He, Yu-Hui Wen, Ying He, Yong-Jin Liu

ReorientDiff: Diffusion Model based Reorientation for Object Manipulation

The ability to manipulate objects in desired configurations is a fundamental requirement for robots to complete various practical applications. While certain goals can be achieved by picking and placing the objects of interest directly,…

Robotics · Computer Science 2023-09-18 Utkarsh A. Mishra, Yongxin Chen

Motion Guidance: Diffusion-Based Image Editing with Differentiable Motion Estimators

Diffusion models are capable of generating impressive images conditioned on text descriptions, and extensions of these models allow users to edit images at a relatively coarse scale. However, the ability to precisely edit the layout,…

Computer Vision and Pattern Recognition · Computer Science 2024-02-01 Daniel Geng, Andrew Owens

Using Diffusion Priors for Video Amodal Segmentation

Object permanence in humans is a fundamental cue that helps in understanding persistence of objects, even when they are fully occluded in the scene. Present day methods in object segmentation do not account for this amodal nature of the…

Computer Vision and Pattern Recognition · Computer Science 2024-12-09 Kaihua Chen, Deva Ramanan, Tarasha Khurana

DiffUHaul: A Training-Free Method for Object Dragging in Images

Text-to-image diffusion models have proven effective for solving many image editing tasks. However, the seemingly straightforward task of seamlessly relocating objects within a scene remains surprisingly challenging. Existing methods…

Computer Vision and Pattern Recognition · Computer Science 2024-12-17 Omri Avrahami, Rinon Gal, Gal Chechik, Ohad Fried, Dani Lischinski, Arash Vahdat, Weili Nie

DiffusionTrack: Diffusion Model For Multi-Object Tracking

Multi-object tracking (MOT) is a challenging vision task that aims to detect individual objects within a single frame and associate them across multiple frames. Recent MOT approaches can be categorized into two-stage tracking-by-detection…

Computer Vision and Pattern Recognition · Computer Science 2024-02-22 Run Luo, Zikai Song, Lintao Ma, Jinlin Wei, Wei Yang, Min Yang

Sequential Amodal Segmentation via Cumulative Occlusion Learning

To fully understand the 3D context of a single image, a visual system must be able to segment both the visible and occluded regions of objects, while discerning their occlusion order. Ideally, the system should be able to handle any object…

Computer Vision and Pattern Recognition · Computer Science 2024-05-10 Jiayang Ao, Qiuhong Ke, Krista A. Ehinger

Move Anything with Layered Scene Diffusion

Diffusion models generate images with an unprecedented level of quality, but how can we freely rearrange image layouts? Recent works generate controllable scenes via learning spatially disentangled latent codes, but these methods do not…

Computer Vision and Pattern Recognition · Computer Science 2024-04-11 Jiawei Ren, Mengmeng Xu, Jui-Chieh Wu, Ziwei Liu, Tao Xiang, Antoine Toisoul

Object-Centric Diffusion for Efficient Video Editing

Diffusion-based video editing has reached impressive quality and can transform the global style, local structure, and attributes of given video inputs, following textual edit prompts. However, such solutions typically incur heavy…

Computer Vision and Pattern Recognition · Computer Science 2024-09-02 Kumara Kahatapitiya, Adil Karjauv, Davide Abati, Fatih Porikli, Yuki M. Asano, Amirhossein Habibian

Bi-CamoDiffusion: A Boundary-informed Diffusion Approach for Camouflaged Object Detection

Bi-CamoDiffusion is introduced, an evolution of the CamoDiffusion framework for camouflaged object detection. It integrates edge priors into early-stage embeddings via a parameter-free injection process, which enhances boundary sharpness…

Computer Vision and Pattern Recognition · Computer Science 2026-03-17 Patricia L. Suarez, Leo Thomas Ramos, Angel D. Sappa

PAIR-Diffusion: A Comprehensive Multimodal Object-Level Image Editor

Generative image editing has recently witnessed extremely fast-paced growth. Some works use high-level conditioning such as text, while others use low-level conditioning. Nevertheless, most of them lack fine-grained control over the…

Computer Vision and Pattern Recognition · Computer Science 2024-04-10 Vidit Goel, Elia Peruzzo, Yifan Jiang, Dejia Xu, Xingqian Xu, Nicu Sebe, Trevor Darrell, Zhangyang Wang, Humphrey Shi

Diffusion Handles: Enabling 3D Edits for Diffusion Models by Lifting Activations to 3D

Diffusion Handles is a novel approach to enabling 3D object edits on diffusion images. We accomplish these edits using existing pre-trained diffusion models, and 2D image depth estimation, without any fine-tuning or 3D object retrieval. The…

Computer Vision and Pattern Recognition · Computer Science 2023-12-08 Karran Pandey, Paul Guerrero, Matheus Gadelha, Yannick Hold-Geoffroy, Karan Singh, Niloy Mitra

MirrorVerse: Pushing Diffusion Models to Realistically Reflect the World

Diffusion models have become central to various image editing tasks, yet they often fail to fully adhere to physical laws, particularly with effects like shadows, reflections, and occlusions. In this work, we address the challenge of…

Computer Vision and Pattern Recognition · Computer Science 2025-04-23 Ankit Dhiman, Manan Shah, R Venkatesh Babu

DiffCap: Diffusion-based Real-time Human Motion Capture using Sparse IMUs and a Monocular Camera

Combining sparse IMUs and a monocular camera is a new promising setting to perform real-time human motion capture. This paper proposes a diffusion-based solution to learn human motion priors and fuse the two modalities of signals together…

Computer Vision and Pattern Recognition · Computer Science 2025-08-11 Shaohua Pan, Xinyu Yi, Yan Zhou, Weihua Jian, Yuan Zhang, Pengfei Wan, Feng Xu

DiffusionDet: Diffusion Model for Object Detection

We propose DiffusionDet, a new framework that formulates object detection as a denoising diffusion process from noisy boxes to object boxes. During the training stage, object boxes diffuse from ground-truth boxes to random distribution, and…

Computer Vision and Pattern Recognition · Computer Science 2023-08-22 Shoufa Chen, Peize Sun, Yibing Song, Ping Luo

Image Motion Blur Removal in the Temporal Dimension with Video Diffusion Models

Most motion deblurring algorithms rely on spatial-domain convolution models, which struggle with the complex, non-linear blur arising from camera shake and object motion. In contrast, we propose a novel single-image deblurring approach that…

Image and Video Processing · Electrical Eng. & Systems 2025-01-23 Wang Pang, Zhihao Zhan, Xiang Zhu, Yechao Bai

TrackDiffusion: Tracklet-Conditioned Video Generation via Diffusion Models

Despite remarkable achievements in video synthesis, achieving granular control over complex dynamics, such as nuanced movement among multiple interacting objects, still presents a significant hurdle for dynamic world modeling, compounded by…

Computer Vision and Pattern Recognition · Computer Science 2024-03-21 Pengxiang Li, Kai Chen, Zhili Liu, Ruiyuan Gao, Lanqing Hong, Guo Zhou, Hua Yao, Dit-Yan Yeung, Huchuan Lu, Xu Jia

DiffMOT: A Real-time Diffusion-based Multiple Object Tracker with Non-linear Prediction

In Multiple Object Tracking, objects often exhibit non-linear motion of acceleration and deceleration, with irregular direction changes. Tracking-by-detection (TBD) trackers with Kalman Filter motion prediction work well in…

Computer Vision and Pattern Recognition · Computer Science 2024-03-21 Weiyi Lv, Yuhang Huang, Ning Zhang, Ruei-Sung Lin, Mei Han, Dan Zeng