SharpDepth: Sharpening Metric Depth Predictions Using Diffusion Distillation

Computer Vision and Pattern Recognition 2024-11-28 v1

Abstract

We propose SharpDepth, a novel approach to monocular metric depth estimation that combines the metric accuracy of discriminative depth estimation methods (e.g., Metric3D, UniDepth) with the fine-grained boundary sharpness typically achieved by generative methods (e.g., Marigold, Lotus). Traditional discriminative models, trained on real-world data with sparse ground-truth depth, can accurately predict metric depth but often produce over-smoothed or low-detail depth maps. Generative models, in contrast, are trained on synthetic data with dense ground truth and generate depth maps with sharp boundaries, yet they provide only relative depth with low accuracy. Our approach addresses both limitations by integrating metric accuracy with detailed boundary preservation, resulting in depth predictions that are both metrically precise and visually sharp. Our extensive zero-shot evaluations on standard depth estimation benchmarks confirm SharpDepth's effectiveness, showing that it achieves both high depth accuracy and detailed representation, which makes it well-suited for applications requiring high-quality depth perception across diverse, real-world environments.

Cite

@article{arxiv.2411.18229,
  title  = {SharpDepth: Sharpening Metric Depth Predictions Using Diffusion Distillation},
  author = {Duc-Hai Pham and Tung Do and Phong Nguyen and Binh-Son Hua and Khoi Nguyen and Rang Nguyen},
  journal= {arXiv preprint arXiv:2411.18229},
  year   = {2024}
}

Comments

An uncompressed version is available at https://drive.google.com/file/d/1MG4-d_xDERVBCRfLDolNLnMLLuqd7qRz
