Related papers: TransNormal: Dense Visual Semantics for Diffusion-…

StableNormal: Reducing Diffusion Variance for Stable and Sharp Normal

This work addresses the challenge of high-quality surface normal estimation from monocular colored inputs (i.e., images and videos), a field which has recently been revolutionized by repurposing diffusion priors. However, previous attempts…

Computer Vision and Pattern Recognition · Computer Science 2024-06-25 Chongjie Ye, Lingteng Qiu, Xiaodong Gu, Qi Zuo, Yushuang Wu, Zilong Dong, Liefeng Bo, Yuliang Xiu, Xiaoguang Han

TransDiff: Diffusion-Based Method for Manipulating Transparent Objects Using a Single RGB-D Image

Manipulating transparent objects presents significant challenges due to the complexities introduced by their reflection and refraction properties, which considerably hinder the accurate estimation of their 3D shapes. To address these…

Computer Vision and Pattern Recognition · Computer Science 2025-03-18 Haoxiao Wang, Kaichen Zhou, Binrui Gu, Zhiyuan Feng, Weijie Wang, Peilin Sun, Yicheng Xiao, Jianhua Zhang, Hao Dong

FakeInversion: Learning to Detect Images from Unseen Text-to-Image Models by Inverting Stable Diffusion

Due to the high potential for abuse of GenAI systems, the task of detecting synthetic images has recently become of great interest to the research community. Unfortunately, existing image-space detectors quickly become obsolete as new…

Computer Vision and Pattern Recognition · Computer Science 2024-06-14 George Cazenavette, Avneesh Sud, Thomas Leung, Ben Usman

Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation

Transparent objects remain notoriously hard for perception systems: refraction, reflection and transmission break the assumptions behind stereo, ToF and purely discriminative monocular depth, causing holes and temporally unstable estimates.…

Computer Vision and Pattern Recognition · Computer Science 2025-12-30 Shaocong Xu, Songlin Wei, Qizhe Wei, Zheng Geng, Hong Li, Licheng Shen, Qianpu Sun, Shu Han, Bin Ma, Bohan Li, Chongjie Ye, Yuhang Zheng, Nan Wang, Saining Zhang, Hao Zhao

DidSee: Diffusion-Based Depth Completion for Material-Agnostic Robotic Perception and Manipulation

Commercial RGB-D cameras often produce noisy, incomplete depth maps for non-Lambertian objects. Traditional depth completion methods struggle to generalize due to the limited diversity and scale of training data. Recent advances exploit…

Computer Vision and Pattern Recognition · Computer Science 2025-06-30 Wenzhou Lyu, Jialing Lin, Wenqi Ren, Ruihao Xia, Feng Qian, Yang Tang

DiffusionAD: Norm-guided One-step Denoising Diffusion for Anomaly Detection

Anomaly detection has garnered extensive applications in real industrial manufacturing due to its remarkable effectiveness and efficiency. However, previous generative-based models have been limited by suboptimal reconstruction quality,…

Computer Vision and Pattern Recognition · Computer Science 2025-05-20 Hui Zhang, Zheng Wang, Dan Zeng, Zuxuan Wu, Yu-Gang Jiang

D3RoMa: Disparity Diffusion-based Depth Sensing for Material-Agnostic Robotic Manipulation

Depth sensing is an important problem for 3D vision-based robotics. Yet, a real-world active stereo or ToF depth camera often produces noisy and incomplete depth, which bottlenecks robot performance. In this work, we propose D3RoMa, a…

Robotics · Computer Science 2024-09-26 Songlin Wei, Haoran Geng, Jiayi Chen, Congyue Deng, Wenbo Cui, Chengyang Zhao, Xiaomeng Fang, Leonidas Guibas, He Wang

TransFusion: Transcribing Speech with Multinomial Diffusion

Diffusion models have shown exceptional scaling properties in the image synthesis domain, and initial attempts have shown similar benefits for applying diffusion to unconditional text synthesis. Denoising diffusion models attempt to…

Audio and Speech Processing · Electrical Eng. & Systems 2022-10-17 Matthew Baas, Kevin Eloff, Herman Kamper

Conditional Denoising Diffusion Model-Based Robust MR Image Reconstruction from Highly Undersampled Data

Magnetic Resonance Imaging (MRI) is a critical tool in modern medical diagnostics, yet its prolonged acquisition time remains a critical limitation, especially in time-sensitive clinical scenarios. While undersampling strategies can…

Image and Video Processing · Electrical Eng. & Systems 2025-10-09 Mohammed Alsubaie, Wenxi Liu, Linxia Gu, Ovidiu C. Andronesi, Sirani M. Perera, Xianqi Li

DeepInv: A Novel Self-supervised Learning Approach for Fast and Accurate Diffusion Inversion

Diffusion inversion is a task of recovering the noise of an image in a diffusion model, which is vital for controllable diffusion image editing. At present, diffusion inversion still remains a challenging task due to the lack of viable…

Computer Vision and Pattern Recognition · Computer Science 2026-01-06 Ziyue Zhang, Luxi Lin, Xiaolin Hu, Chao Chang, HuaiXi Wang, Yiyi Zhou, Rongrong Ji

DiffusionDepth: Diffusion Denoising Approach for Monocular Depth Estimation

Monocular depth estimation is a challenging task that predicts the pixel-wise depth from a single 2D image. Current methods typically model this problem as a regression or classification task. We propose DiffusionDepth, a new approach that…

Computer Vision and Pattern Recognition · Computer Science 2023-08-30 Yiqun Duan, Xianda Guo, Zheng Zhu

Real-World Denoising via Diffusion Model

Real-world image denoising is an extremely important image processing problem, which aims to recover clean images from noisy images captured in natural environments. In recent years, diffusion models have achieved very promising results in…

Computer Vision and Pattern Recognition · Computer Science 2023-05-09 Cheng Yang, Lijing Liang, Zhixun Su

TODE-Trans: Transparent Object Depth Estimation with Transformer

Transparent objects are widely used in industrial automation and daily life. However, robust visual recognition and perception of transparent objects have always been a major challenge. Currently, most commercial-grade depth cameras are…

Computer Vision and Pattern Recognition · Computer Science 2022-09-20 Kang Chen, Shaochen Wang, Beihao Xia, Dongxu Li, Zhen Kan, Bin Li

ClearDepth: Enhanced Stereo Perception of Transparent Objects for Robotic Manipulation

Transparent object depth perception poses a challenge in everyday life and logistics, primarily due to the inability of standard 3D sensors to accurately capture depth on transparent or reflective surfaces. This limitation significantly…

Robotics · Computer Science 2026-03-10 Kaixin Bai, Huajian Zeng, Lei Zhang, Yiwen Liu, Hongli Xu, Zhaopeng Chen, Jianwei Zhang

Resfusion: Denoising Diffusion Probabilistic Models for Image Restoration Based on Prior Residual Noise

Recently, research on denoising diffusion models has expanded its application to the field of image restoration. Traditional diffusion-based image restoration methods utilize degraded images as conditional input to effectively guide the…

Computer Vision and Pattern Recognition · Computer Science 2024-10-25 Zhenning Shi, Haoshuai Zheng, Chen Xu, Changsheng Dong, Bin Pan, Xueshuo Xie, Along He, Tao Li, Huazhu Fu

PanoNormal: Monocular Indoor 360° Surface Normal Estimation

The presence of spherical distortion in equirectangular projection (ERP) images presents a persistent challenge in dense regression tasks such as surface normal estimation. Although it may appear straightforward to repurpose architectures…

Computer Vision and Pattern Recognition · Computer Science 2026-01-26 Kun Huang, Fanglue Zhang, Neil Dodgson

Stable Diffusion is a Natural Cross-Modal Decoder for Layered AI-generated Image Compression

Recent advances in Artificial Intelligence Generated Content (AIGC) have garnered significant interest, accompanied by an increasing need to transmit and compress the vast number of AI-generated images (AIGIs). However, there is a…

Image and Video Processing · Electrical Eng. & Systems 2024-12-18 Ruijie Chen, Qi Mao, Zhengxue Cheng

Scene-Agnostic Traversability Labeling and Estimation via a Multimodal Self-supervised Framework

Traversability estimation is critical for enabling robots to navigate across diverse terrains and environments. While recent self-supervised learning methods achieve promising results, they often fail to capture the characteristics of…

Robotics · Computer Science 2025-08-26 Zipeng Fang, Yanbo Wang, Lei Zhao, Weidong Chen

RFTrans: Leveraging Refractive Flow of Transparent Objects for Surface Normal Estimation and Manipulation

Transparent objects are widely used in our daily lives, making it important to teach robots to interact with them. However, it's not easy because the reflective and refractive effects can make depth cameras fail to give accurate geometry…

Computer Vision and Pattern Recognition · Computer Science 2024-02-20 Tutian Tang, Jiyu Liu, Jieyi Zhang, Haoyuan Fu, Wenqiang Xu, Cewu Lu

Diffusion Prism: Enhancing Diversity and Morphology Consistency in Mask-to-Image Diffusion

The emergence of generative AI and controllable diffusion has made image-to-image synthesis increasingly practical and efficient. However, when input images exhibit low entropy and sparsity, the inherent characteristics of diffusion models…

Computer Vision and Pattern Recognition · Computer Science 2025-01-14 Hao Wang, Xiwen Chen, Ashish Bastola, Jiayou Qin, Abolfazl Razi