Related papers: A Dynamic Feature Interaction Framework for Multi-…
The perception system for autonomous driving generally requires to handle multiple diverse sub-tasks. However, current algorithms typically tackle individual sub-tasks separately, which leads to low efficiency when aiming at obtaining…
As the demand for enabling high-level autonomous driving has increased in recent years and visual perception is one of the critical features to enable fully autonomous driving, in this paper, we introduce an efficient approach for…
Scene understanding plays a critical role in enabling intelligence and autonomy in robotic systems. Traditional approaches often face challenges, including occlusions, ambiguous boundaries, and the inability to adapt attention based on…
Traffic scene recognition, which requires various visual classification tasks, is a critical ingredient in autonomous vehicles. However, most existing approaches treat each relevant task independently from one another, never considering the…
We propose a novel method for instance label segmentation of dense 3D voxel grids. We target volumetric scene representations, which have been acquired with depth sensors or multi-view stereo methods and which have been processed with…
Multi-task learning has recently emerged as a promising solution for a comprehensive understanding of complex scenes. In addition to being memory-efficient, multi-task models, when appropriately designed, can facilitate the exchange of…
Fusion of 2D images and 3D point clouds is important because information from dense images can enhance sparse point clouds. However, fusion is challenging because 2D and 3D data live in different spaces. In this work, we propose MVPNet…
Learning-based solutions for vision tasks require a large amount of labeled training data to ensure their performance and reliability. In single-task vision-based settings, inconsistency-based active learning has proven to be effective in…
Sensor-based human activity segmentation and recognition are two important and challenging problems in many real-world applications and they have drawn increasing attention from the deep learning community in recent years. Most of the…
We develop new algorithms for simultaneous learning of multiple tasks (e.g., image classification, depth estimation), and for adapting to unseen task/domain distributions within those high-level tasks (e.g., different environments). First,…
Single-task learning in artificial neural networks will be able to learn the model very well, and the benefits brought by transferring knowledge thus become limited. In this regard, when the number of tasks increases (e.g., semantic…
Deploying multiple machine learning models on resource-constrained robotic platforms for different perception tasks often results in redundant computations, large memory footprints, and complex integration challenges. In response, this work…
In driving scenarios, automobile active safety systems are increasingly incorporating deep learning technology. These systems typically need to handle multiple tasks simultaneously, such as detecting fatigue driving and recognizing the…
Scene understanding is crucial for autonomous systems which intend to operate in the real world. Single task vision networks extract information only based on some aspects of the scene. In multi-task learning (MTL), on the other hand, these…
Three-dimensional feature extraction is a critical component of autonomous driving systems, where perception tasks such as 3D object detection, bird's-eye-view (BEV) semantic segmentation, and occupancy prediction serve as important…
Accurate perception of dynamic traffic scenes is crucial for high-level autonomous driving systems, requiring robust object motion estimation and instance segmentation. However, traditional methods often treat them as separate tasks,…
LiDAR-based 3D object detection, semantic segmentation, and panoptic segmentation are usually implemented in specialized networks with distinctive architectures that are difficult to adapt to each other. This paper presents LidarMultiNet, a…
This report serves as a supplementary document for TaskPrompter, detailing its implementation on a new joint 2D-3D multi-task learning benchmark based on Cityscapes-3D. TaskPrompter presents an innovative multi-task prompting framework that…
Deep learning techniques have become the to-go models for most vision-related tasks on 2D images. However, their power has not been fully realised on several tasks in 3D space, e.g., 3D scene understanding. In this work, we jointly address…
Multi-task dense prediction aims at handling multiple pixel-wise prediction tasks within a unified network simultaneously for visual scene understanding. However, cross-task feature interactions of current methods are still suffering from…