Related papers: Diffusion-Augmented Depth Prediction with Sparse A…
Depth completion is an important vision task, and many efforts have been made to enhance the quality of depth maps from sparse depth measurements. Despite significant advances, training these models to recover dense depth from sparse…
We propose a simple, efficient, yet powerful framework for dense visual predictions based on the conditional diffusion pipeline. Our approach follows a "noise-to-map" generative paradigm for prediction by progressively removing noise from a…
Depth completion, the technique of estimating a dense depth image from sparse depth measurements, has a variety of applications in robotics and autonomous driving. However, depth completion faces 3 main challenges: the irregularly spaced…
Accurate depth estimation plays a critical role in the navigation of endoscopic surgical robots, forming the foundation for 3D reconstruction and safe instrument guidance. Fine-tuning pretrained models heavily relies on endoscopic surgical…
Generative models have recently undergone significant advancement due to the diffusion models. The success of these models can be often attributed to their use of guidance techniques, such as classifier or classifier-free guidance, which…
Learning domain adaptive policies that can generalize to unseen transition dynamics, remains a fundamental challenge in learning-based control. Substantial progress has been made through domain representation learning to capture…
Self-supervised methods have showed promising results on depth estimation task. However, previous methods estimate the target depth map and camera ego-motion simultaneously, underusing multi-frame correlation information and ignoring the…
The depth completion task is a critical problem in autonomous driving, involving the generation of dense depth maps from sparse depth maps and RGB images. Most existing methods employ a spatial propagation network to iteratively refine the…
We propose a novel plug-and-play (PnP) module for improving depth prediction with taking arbitrary patterns of sparse depths as input. Given any pre-trained depth prediction model, our PnP module updates the intermediate feature map such…
In recent years, self-supervised methods for monocular depth estimation has rapidly become an significant branch of depth estimation task, especially for autonomous driving applications. Despite the high overall precision achieved, current…
Sparse annotations fundamentally constrain multimodal remote sensing: even recent state-of-the-art supervised methods such as MSFMamba are limited by the availability of labeled data, restricting their practical deployment despite…
Self-supervised learning has become a central strategy for representation learning, but the majority of architectures used for encoding data have only been validated on regularly-sampled inputs such as images, audios. and videos. In many…
Self-supervised depth estimation draws a lot of attention recently as it can promote the 3D sensing capabilities of self-driving vehicles. However, it intrinsically relies upon the photometric consistency assumption, which hardly holds…
Predicting the behavior of complex systems is critical in many scientific and engineering domains, and hinges on the model's ability to capture their underlying dynamics. Existing methods encode the intrinsic dynamics of high-dimensional…
Detection of the drivable area in all conditions is crucial for autonomous driving and advanced driver assistance systems. However, the amount of labeled data in adverse driving conditions is limited, especially in winter, and supervised…
Recently, diffusion-based blind super-resolution (SR) methods have shown great ability to generate high-resolution images with abundant high-frequency detail, but the detail is often achieved at the expense of fidelity. Meanwhile, another…
We present a method to estimate dense depth by optimizing a sparse set of points such that their diffusion into a depth map minimizes a multi-view reprojection error from RGB supervision. We optimize point positions, depths, and weights…
Point supervision has become a scalable solution to address dense annotation for infrared small target detection, but its performance is limited by two coupled bottlenecks: unstable pseudo-label evolution in cluttered, low-contrast infrared…
Semi-supervised object detection methods are widely used in autonomous driving systems, where only a fraction of objects are labeled. To propagate information from the labeled objects to the unlabeled ones, pseudo-labels for unlabeled…
We propose a method to learn explicit, class-conditioned spatial priors for object placement in natural scenes by distilling the implicit placement knowledge encoded in text-conditioned diffusion models. Prior work relies either on manually…