Related papers: SparseAlign: A Fully Sparse Framework for Cooperat…

SparseCoop: Cooperative Perception with Kinematic-Grounded Queries

Cooperative perception is critical for autonomous driving, overcoming the inherent limitations of a single vehicle, such as occlusions and constrained fields-of-view. However, current approaches sharing dense Bird's-Eye-View (BEV) features…

Computer Vision and Pattern Recognition · Computer Science 2026-04-13 Jiahao Wang, Zhongwei Jiang, Wenchao Sun, Jiaru Zhong, Haibao Yu, Yuner Zhang, Chenyang Lu, Chuang Zhang, Lei He, Shaobing Xu, Jianqiang Wang

Long-SCOPE: Fully Sparse Long-Range Cooperative 3D Perception

Cooperative 3D perception via Vehicle-to-Everything communication is a promising paradigm for enhancing autonomous driving, offering extended sensing horizons and occlusion resolution. However, the practical deployment of existing methods…

Computer Vision and Pattern Recognition · Computer Science 2026-04-13 Jiahao Wang, Zikun Xu, Yuner Zhang, Zhongwei Jiang, Chenyang Lu, Shuocheng Yang, Yuxuan Wang, Jiaru Zhong, Chuang Zhang, Shaobing Xu, Jianqiang Wang

SparseFusion: Efficient Sparse Multi-Modal Fusion Framework for Long-Range 3D Perception

Multi-modal 3D object detection has exhibited significant progress in recent years. However, most existing methods can hardly scale to long-range scenarios due to their reliance on dense 3D features, which substantially escalate…

Computer Vision and Pattern Recognition · Computer Science 2024-03-18 Yiheng Li, Hongyang Li, Zehao Huang, Hong Chang, Naiyan Wang

SparseBEV: High-Performance Sparse 3D Object Detection from Multi-Camera Videos

Camera-based 3D object detection in BEV (Bird's Eye View) space has drawn great attention over the past few years. Dense detectors typically follow a two-stage pipeline by first constructing a dense BEV feature and then performing object…

Computer Vision and Pattern Recognition · Computer Science 2023-09-06 Haisong Liu, Yao Teng, Tao Lu, Haiguang Wang, Limin Wang

Sparse4D: Multi-view 3D Object Detection with Sparse Spatial-Temporal Fusion

Bird-eye-view (BEV) based methods have made great progress recently in multi-view 3D detection task. Comparing with BEV based methods, sparse based methods lag behind in performance, but still have lots of non-negligible merits. To push…

Computer Vision and Pattern Recognition · Computer Science 2023-02-13 Xuewu Lin, Tianwei Lin, Zixiang Pei, Lichao Huang, Zhizhong Su

SparseLIF: High-Performance Sparse LiDAR-Camera Fusion for 3D Object Detection

Sparse 3D detectors have received significant attention since the query-based paradigm embraces low latency without explicit dense BEV feature construction. However, these detectors achieve worse performance than their dense counterparts.…

Computer Vision and Pattern Recognition · Computer Science 2024-07-11 Hongcheng Zhang, Liu Liang, Pengxin Zeng, Xiao Song, Zhe Wang

Fully Sparse Fusion for 3D Object Detection

Currently prevalent multimodal 3D detection methods are built upon LiDAR-based detectors that usually use dense Bird's-Eye-View (BEV) feature maps. However, the cost of such BEV feature maps is quadratic to the detection range, making it…

Computer Vision and Pattern Recognition · Computer Science 2024-04-30 Yingyan Li, Lue Fan, Yang Liu, Zehao Huang, Yuntao Chen, Naiyan Wang, Zhaoxiang Zhang

CoopDETR: A Unified Cooperative Perception Framework for 3D Detection via Object Query

Cooperative perception enhances the individual perception capabilities of autonomous vehicles (AVs) by providing a comprehensive view of the environment. However, balancing perception performance and transmission costs remains a significant…

Computer Vision and Pattern Recognition · Computer Science 2025-02-27 Zhe Wang, Shaocong Xu, Xucai Zhuang, Tongda Xu, Yan Wang, Jingjing Liu, Yilun Chen, Ya-Qin Zhang

Safety-Aligned 3D Object Detection: Single-Vehicle, Cooperative, and End-to-End Perspectives

Perception plays a central role in connected and autonomous vehicles (CAVs), underpinning not only conventional modular driving stacks, but also cooperative perception systems and recent end-to-end driving models. While deep learning has…

Computer Vision and Pattern Recognition · Computer Science 2026-04-07 Brian Hsuan-Cheng Liao, Chih-Hong Cheng, Hasan Esen, Alois Knoll

Collaborative Perceiver: Elevating Vision-based 3D Object Detection via Local Density-Aware Spatial Occupancy

Vision-based bird's-eye-view (BEV) 3D object detection has advanced significantly in autonomous driving by offering cost-effectiveness and rich contextual information. However, existing methods often construct BEV representations by…

Computer Vision and Pattern Recognition · Computer Science 2025-08-05 Jicheng Yuan, Manh Nguyen Duc, Qian Liu, Manfred Hauswirth, Danh Le Phuoc

SparseDet: A Simple and Effective Framework for Fully Sparse LiDAR-based 3D Object Detection

LiDAR-based sparse 3D object detection plays a crucial role in autonomous driving applications due to its computational efficiency advantages. Existing methods either use the features of a single central voxel as an object proxy, or treat…

Computer Vision and Pattern Recognition · Computer Science 2024-06-18 Lin Liu, Ziying Song, Qiming Xia, Feiyang Jia, Caiyan Jia, Lei Yang, Hongyu Pan

Collaborative 3D Object Detection for Automatic Vehicle Systems via Learnable Communications

Accurate detection of objects in 3D point clouds is a key problem in autonomous driving systems. Collaborative perception can incorporate information from spatially diverse sensors and provide significant benefits for improving the…

Computer Vision and Pattern Recognition · Computer Science 2022-05-25 Junyong Wang, Yuan Zeng, Yi Gong

SparseOcc: Rethinking Sparse Latent Representation for Vision-Based Semantic Occupancy Prediction

Vision-based perception for autonomous driving requires an explicit modeling of a 3D space, where 2D latent representations are mapped and subsequent 3D operators are applied. However, operating on dense latent spaces introduces a cubic…

Computer Vision and Pattern Recognition · Computer Science 2024-04-16 Pin Tang, Zhongdao Wang, Guoqing Wang, Jilai Zheng, Xiangxuan Ren, Bailan Feng, Chao Ma

TraF-Align: Trajectory-aware Feature Alignment for Asynchronous Multi-agent Perception

Cooperative perception presents significant potential for enhancing the sensing capabilities of individual vehicles, however, inter-agent latency remains a critical challenge. Latencies cause misalignments in both spatial and semantic…

Computer Vision and Pattern Recognition · Computer Science 2025-03-26 Zhiying Song, Lei Yang, Fuxi Wen, Jun Li

A Late Collaborative Perception Framework for 3D Multi-Object and Multi-Source Association and Fusion

In autonomous driving, recent research has increasingly focused on collaborative perception based on deep learning to overcome the limitations of individual perception systems. Although these methods achieve high accuracy, they rely on high…

Robotics · Computer Science 2025-07-04 Maryem Fadili, Mohamed Anis Ghaoui, Louis Lecrosnier, Steve Pechberti, Redouane Khemmar

SpaRC: Sparse Radar-Camera Fusion for 3D Object Detection

In this work, we present SpaRC, a novel Sparse fusion transformer for 3D perception that integrates multi-view image semantics with Radar and Camera point features. The fusion of radar and camera modalities has emerged as an efficient…

Computer Vision and Pattern Recognition · Computer Science 2025-09-25 Philipp Wolters, Johannes Gilg, Torben Teepe, Fabian Herzog, Felix Fent, Gerhard Rigoll

Robust Collaborative 3D Object Detection in Presence of Pose Errors

Collaborative 3D object detection exploits information exchange among multiple agents to enhance accuracy of object detection in presence of sensor impairments such as occlusion. However, in practice, pose estimation errors due to imperfect…

Computer Vision and Pattern Recognition · Computer Science 2023-03-06 Yifan Lu, Quanhao Li, Baoan Liu, Mehrdad Dianati, Chen Feng, Siheng Chen, Yanfeng Wang

SparseLaneSTP: Leveraging Spatio-Temporal Priors with Sparse Transformers for 3D Lane Detection

3D lane detection has emerged as a critical challenge in autonomous driving, encompassing identification and localization of lane markings and the 3D road surface. Conventional 3D methods detect lanes from dense birds-eye-viewed (BEV)…

Computer Vision and Pattern Recognition · Computer Science 2026-01-09 Maximilian Pittner, Joel Janai, Mario Faigle, Alexandru Paul Condurache

Sparse3DTrack: Monocular 3D Object Tracking Using Sparse Supervision

Monocular 3D object tracking aims to estimate temporally consistent 3D object poses across video frames, enabling autonomous agents to reason about scene dynamics. However, existing state-of-the-art approaches are fully supervised and rely…

Robotics · Computer Science 2026-03-20 Nikhil Gosala, B. Ravi Kiran, Senthil Yogamani, Abhinav Valada

CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse Transformers

Bird's eye view (BEV) semantic segmentation plays a crucial role in spatial sensing for autonomous driving. Although recent literature has made significant progress on BEV map understanding, they are all based on single-agent camera-based…

Computer Vision and Pattern Recognition · Computer Science 2022-09-27 Runsheng Xu, Zhengzhong Tu, Hao Xiang, Wei Shao, Bolei Zhou, Jiaqi Ma