Related papers: Exploring Object-Centric Temporal Modeling for Eff…

Focal-PETR: Embracing Foreground for Efficient Multi-Camera 3D Object Detection

The dominant multi-camera 3D detection paradigm is based on explicit 3D feature construction, which requires complicated indexing of local image-view features via 3D-to-2D projection. Other methods implicitly introduce geometric positional…

Computer Vision and Pattern Recognition · Computer Science 2022-12-14 Shihao Wang, Xiaohui Jiang, Ying Li

MambaDETR: Query-based Temporal Modeling using State Space Model for Multi-View 3D Object Detection

Utilizing temporal information to improve the performance of 3D detection has made great progress recently in the field of autonomous driving. Traditional transformer-based temporal fusion methods suffer from quadratic computational cost…

Computer Vision and Pattern Recognition · Computer Science 2024-11-22 Tong Ning, Ke Lu, Xirui Jiang, Jian Xue

A Versatile Multi-View Framework for LiDAR-based 3D Object Detection with Guidance from Panoptic Segmentation

3D object detection using LiDAR data is an indispensable component for autonomous driving systems. Yet, only a few LiDAR-based 3D object detection methods leverage segmentation information to further guide the detection process. In this…

Computer Vision and Pattern Recognition · Computer Science 2022-03-07 Hamidreza Fazlali, Yixuan Xu, Yuan Ren, Bingbing Liu

Real-time Stereo-based 3D Object Detection for Streaming Perception

The ability to promptly respond to environmental changes is crucial for the perception system of autonomous driving. Recently, a new task called streaming perception was proposed. It jointly evaluates latency and accuracy in a single…

Computer Vision and Pattern Recognition · Computer Science 2024-10-17 Changcai Li, Zonghua Gu, Gang Chen, Libo Huang, Wei Zhang, Huihui Zhou

DETR4D: Direct Multi-View 3D Object Detection with Sparse Attention

3D object detection with surround-view images is an essential task for autonomous driving. In this work, we propose DETR4D, a Transformer-based framework that explores sparse attention and direct feature query for 3D object detection in…

Computer Vision and Pattern Recognition · Computer Science 2022-12-16 Zhipeng Luo, Changqing Zhou, Gongjie Zhang, Shijian Lu

ForeSight: Multi-View Streaming Joint Object Detection and Trajectory Forecasting

We introduce ForeSight, a novel joint detection and forecasting framework for vision-based 3D perception in autonomous vehicles. Traditional approaches treat detection and forecasting as separate sequential tasks, limiting their ability to…

Computer Vision and Pattern Recognition · Computer Science 2025-08-12 Sandro Papais, Letian Wang, Brian Cheong, Steven L. Waslander

StreamMOTP: Streaming and Unified Framework for Joint 3D Multi-Object Tracking and Trajectory Prediction

3D multi-object tracking and trajectory prediction are two crucial modules in autonomous driving systems. Generally, the two tasks are handled separately in traditional paradigms, and a few methods have started to explore modeling these two…

Computer Vision and Pattern Recognition · Computer Science 2024-07-01 Jiaheng Zhuang, Guoan Wang, Siyu Zhang, Xiyang Wang, Hangning Zhou, Ziyao Xu, Chi Zhang, Zhiheng Li

3D Object Detection and Tracking Based on Streaming Data

Recent approaches for 3D object detection have made tremendous progress due to the development of deep learning. However, previous research is mostly based on individual frames, leading to limited exploitation of information between…

Computer Vision and Pattern Recognition · Computer Science 2020-09-15 Xusen Guo, Jiangfeng Gu, Silu Guo, Zixiao Xu, Chengzhang Yang, Shanghua Liu, Long Cheng, Kai Huang

Segmenting Moving Objects via an Object-Centric Layered Representation

The objective of this paper is a model that is able to discover, track and segment multiple moving objects in a video. We make four contributions: First, we introduce an object-centric segmentation model with a depth-ordered layer…

Computer Vision and Pattern Recognition · Computer Science 2022-11-15 Junyu Xie, Weidi Xie, Andrew Zisserman

Coreset-Based Adaptive Tracking

We propose a method for learning from streaming visual data using a compact, constant-size representation of all the data seen up to a given moment. Specifically, we construct a 'coreset' representation of streaming data using a…

Computer Vision and Pattern Recognition · Computer Science 2015-11-20 Abhimanyu Dubey, Nikhil Naik, Dan Raviv, Rahul Sukthankar, Ramesh Raskar

Spatial-Temporal Graph Enhanced DETR Towards Multi-Frame 3D Object Detection

The Detection Transformer (DETR) has revolutionized the design of CNN-based object detection systems, showcasing impressive performance. However, its potential in the domain of multi-frame 3D object detection remains largely unexplored. In…

Computer Vision and Pattern Recognition · Computer Science 2025-08-21 Yifan Zhang, Zhiyu Zhu, Junhui Hou, Dapeng Wu

PETR: Position Embedding Transformation for Multi-View 3D Object Detection

In this paper, we develop position embedding transformation (PETR) for multi-view 3D object detection. PETR encodes the position information of 3D coordinates into image features, producing the 3D position-aware features. Object query can…

Computer Vision and Pattern Recognition · Computer Science 2022-07-20 Yingfei Liu, Tiancai Wang, Xiangyu Zhang, Jian Sun

RoPETR: Improving Temporal Camera-Only 3D Detection by Integrating Enhanced Rotary Position Embedding

This technical report introduces a targeted improvement to the StreamPETR framework, specifically aimed at enhancing velocity estimation, a critical factor influencing the overall NuScenes Detection Score. While StreamPETR exhibits strong…

Computer Vision and Pattern Recognition · Computer Science 2025-06-09 Hang Ji, Tao Ni, Xufeng Huang, Zhan Shi, Tao Luo, Xin Zhan, Junbo Chen

MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection

Accurate and reliable 3D detection is vital for many applications, including autonomous vehicles and service robots. In this paper, we present a flexible and high-performance 3D detection framework, named MPPNet, for 3D temporal…

Computer Vision and Pattern Recognition · Computer Science 2022-09-05 Xuesong Chen, Shaoshuai Shi, Benjin Zhu, Ka Chun Cheung, Hang Xu, Hongsheng Li

FlowTrack: Point-level Flow Network for 3D Single Object Tracking

3D single object tracking (SOT) is a crucial task in the fields of mobile robotics and autonomous driving. Traditional motion-based approaches achieve target tracking by estimating the relative movement of the target between two consecutive frames.…

Computer Vision and Pattern Recognition · Computer Science 2024-07-03 Shuo Li, Yubo Cui, Zhiheng Li, Zheng Fang

StreamMOS: Streaming Moving Object Segmentation with Multi-View Perception and Dual-Span Memory

Moving object segmentation based on LiDAR is a crucial and challenging task for autonomous driving and mobile robotics. Most approaches explore spatio-temporal information from LiDAR sequences to predict moving objects in the current frame.…

Computer Vision and Pattern Recognition · Computer Science 2024-12-12 Zhiheng Li, Yubo Cui, Jiexi Zhong, Zheng Fang

MoDAR: Using Motion Forecasting for 3D Object Detection in Point Cloud Sequences

Occluded and long-range objects are ubiquitous and challenging for 3D object detection. Point cloud sequence data provide unique opportunities to improve such cases, as an occluded or distant object can be observed from different viewpoints…

Computer Vision and Pattern Recognition · Computer Science 2023-06-07 Yingwei Li, Charles R. Qi, Yin Zhou, Chenxi Liu, Dragomir Anguelov

CFTrack: Center-based Radar and Camera Fusion for 3D Multi-Object Tracking

3D multi-object tracking is a crucial component in the perception system of autonomous driving vehicles. Tracking all dynamic objects around the vehicle is essential for tasks such as obstacle avoidance and path planning. Autonomous…

Computer Vision and Pattern Recognition · Computer Science 2021-07-13 Ramin Nabati, Landon Harris, Hairong Qi

LMNet: Real-time Multiclass Object Detection on CPU using 3D LiDAR

This paper describes an optimized single-stage deep convolutional neural network to detect objects in urban environments, using nothing more than point cloud data. This feature enables our method to work regardless of the time of day and…

Computer Vision and Pattern Recognition · Computer Science 2018-05-21 Kazuki Minemura, Hengfui Liau, Abraham Monrroy, Shinpei Kato

FlowMOT: 3D Multi-Object Tracking by Scene Flow Association

Most end-to-end Multi-Object Tracking (MOT) methods suffer from low accuracy and poor generalization ability. Although traditional filter-based methods can achieve better results, it is difficult to endow them with optimal…

Computer Vision and Pattern Recognition · Computer Science 2021-03-08 Guangyao Zhai, Xin Kong, Jinhao Cui, Yong Liu, Zhen Yang