Related papers: DepthMaster: Taming Diffusion Models for Monocular…

200 papers

Monocular depth estimation is a challenging task that predicts the pixel-wise depth from a single 2D image. Current methods typically model this problem as a regression or classification task. We propose DiffusionDepth, a new approach that…

Computer Vision and Pattern Recognition · Computer Science · 2023-08-30 · Yiqun Duan, Xianda Guo, Zheng Zhu

Over the past few years, self-supervised monocular depth estimation that does not depend on ground-truth during the training phase has received widespread attention. Most efforts focus on designing different types of network architectures…

Computer Vision and Pattern Recognition · Computer Science · 2023-11-14 · Shuwei Shao, Zhongcai Pei, Weihai Chen, Dingchi Sun, Peter C. Y. Chen, Zhengguo Li

Recent work showed that large diffusion models can be reused as highly precise monocular depth estimators by casting depth estimation as an image-conditional image generation task. While the proposed model achieved state-of-the-art results,…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-23 · Gonzalo Martin Garcia, Karim Knaebel, Christian Schmidt, Daan de Geus, Alexander Hermans, Bastian Leibe

Unsupervised monocular depth estimation has received widespread attention because of its capability to train without ground truth. In real-world scenarios, the images may be blurry or noisy due to the influence of weather conditions and…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-29 · Runze Liu, Dongchen Zhu, Guanghui Zhang, Yue Xu, Wenjun Shi, Xiaolin Zhang, Lei Wang, Jiamao Li

We formulate monocular depth estimation using denoising diffusion models, inspired by their recent successes in high fidelity image generation. To that end, we introduce innovations to address problems arising due to noisy, incomplete depth…

Computer Vision and Pattern Recognition · Computer Science · 2023-03-01 · Saurabh Saxena, Abhishek Kar, Mohammad Norouzi, David J. Fleet

Monocular Depth Estimation (MDE) is a fundamental computer vision task with important applications in 3D vision. The current mainstream MDE methods employ an encoder-decoder architecture with multi-level/scale feature processing. However,…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-10 · Huibin Bai, Shuai Li, Hanxiao Zhai, Yanbo Gao, Chong Lv, Yibo Wang, Haipeng Ping, Wei Hua, Xingyu Gao

A diffusion model, which is formulated to produce an image using thousands of denoising steps, usually suffers from a slow inference speed. Existing acceleration algorithms simplify the sampling by skipping most steps yet exhibit…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-02 · Mengfei Xia, Yujun Shen, Changsong Lei, Yu Zhou, Ran Yi, Deli Zhao, Wenping Wang, Yong-Jin Liu

Diffusion models generate high-quality images but require dozens of forward passes. We introduce Distribution Matching Distillation (DMD), a procedure to transform a diffusion model into a one-step image generator with minimal impact on…

Computer Vision and Pattern Recognition · Computer Science · 2024-10-08 · Tianwei Yin, Michaël Gharbi, Richard Zhang, Eli Shechtman, Fredo Durand, William T. Freeman, Taesung Park

Current discriminative depth estimation methods often produce blurry artifacts, while generative approaches suffer from slow sampling due to curvatures in the noise-to-depth transport. Our method addresses these challenges by framing depth…

Computer Vision and Pattern Recognition · Computer Science · 2024-12-20 · Ming Gui, Johannes Schusterbauer, Ulrich Prestel, Pingchuan Ma, Dmytro Kotovenko, Olga Grebenkova, Stefan Andreas Baumann, Vincent Tao Hu, Björn Ommer

Monocular depth estimation has seen significant advances through discriminative approaches, yet their performance remains constrained by the limitations of training datasets. While generative approaches have addressed this challenge by…

Computer Vision and Pattern Recognition · Computer Science · 2025-07-01 · Bulat Gabdullin, Nina Konovalova, Nikolay Patakin, Dmitry Senushkin, Anton Konushin

Monocular Depth Estimation (MDE) is a fundamental 3D vision problem with numerous applications such as 3D scene reconstruction, autonomous navigation, and AI content creation. However, robust and generalizable MDE remains challenging due to…

Computer Vision and Pattern Recognition · Computer Science · 2025-03-17 · Yunpeng Bai, Qixing Huang

Monocular depth estimation (MDE) aims to infer per-pixel depth from a single RGB image. While diffusion models have advanced MDE with impressive generalization, they often exhibit limitations in accurately reconstructing far-range regions.…

Computer Vision and Pattern Recognition · Computer Science · 2025-11-18 · Mingxia Zhan, Li Zhang, Yingjie Wang, Xiaomeng Chu, Beibei Wang, Yanyong Zhang

We present a novel approach designed to address the complexities posed by challenging, out-of-distribution data in the single-image depth estimation task. Starting with images that facilitate depth prediction due to the absence of…

Computer Vision and Pattern Recognition · Computer Science · 2024-07-24 · Fabio Tosi, Pierluigi Zama Ramirez, Matteo Poggi

Monocular depth estimation is a fundamental computer vision task. Recovering 3D depth from a single image is geometrically ill-posed and requires scene understanding, so it is not surprising that the rise of deep learning has led to a…

Computer Vision and Pattern Recognition · Computer Science · 2024-04-04 · Bingxin Ke, Anton Obukhov, Shengyu Huang, Nando Metzger, Rodrigo Caye Daudt, Konrad Schindler

The depth completion task is a critical problem in autonomous driving, involving the generation of dense depth maps from sparse depth maps and RGB images. Most existing methods employ a spatial propagation network to iteratively refine the…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-03 · Ming Yuan, Chuang Zhang, Lei He, Qing Xu, Jianqiang Wang

Diffusion models have shown promising results in speech enhancement, using a task-adapted diffusion process for the conditional generation of clean speech given a noisy mixture. However, at test time, the neural network used for score…

Audio and Speech Processing · Electrical Eng. & Systems · 2024-01-17 · Bunlong Lay, Jean-Marie Lemercier, Julius Richter, Timo Gerkmann

Diffusion models have attained remarkable success in the domains of image generation and editing. It is widely recognized that employing larger inversion and denoising steps in diffusion model leads to improved image reconstruction quality.…

Computer Vision and Pattern Recognition · Computer Science · 2024-01-05 · Chen Hou, Guoqiang Wei, Zhibo Chen

We propose DepR, a depth-guided single-view scene reconstruction framework that integrates instance-level diffusion within a compositional paradigm. Instead of reconstructing the entire scene holistically, DepR generates individual objects…

Computer Vision and Pattern Recognition · Computer Science · 2025-07-31 · Qingcheng Zhao, Xiang Zhang, Haiyang Xu, Zeyuan Chen, Jianwen Xie, Yuan Gao, Zhuowen Tu

We introduce NitroFusion, a fundamentally different approach to single-step diffusion that achieves high-quality generation through a dynamic adversarial framework. While one-step methods offer dramatic speed advantages, they typically…

Computer Vision and Pattern Recognition · Computer Science · 2024-12-09 · Dar-Yen Chen, Hmrishav Bandyopadhyay, Kai Zou, Yi-Zhe Song

Self-conditioning has been central to the success of continuous diffusion language models, as it allows models to correct previous errors. Yet its ability degrades precisely in the regime where diffusion is most attractive for deployment:…

Computation and Language · Computer Science · 2026-04-08 · Dat Nguyen-Cong, Tung Kieu, Hoang Thanh-Tung