
Related papers: Multiple Thinking Achieving Meta-Ability Decouplin…

200 papers

Many query-based approaches for 3D Multi-Object Tracking (MOT) adopt the tracking-by-attention paradigm, utilizing track queries for identity-consistent detection and object queries for identity-agnostic track spawning.…

Computer Vision and Pattern Recognition · Computer Science 2024-12-16 Shuxiao Ding, Lukas Schneider, Marius Cordts, Juergen Gall

Multi-Object Tracking (MOT) has been a long-standing challenge in video understanding. A natural and intuitive approach is to split this task into two parts: object detection and association. Most mainstream methods employ meticulously…

Computer Vision and Pattern Recognition · Computer Science 2025-03-25 Ruopeng Gao, Ji Qi, Limin Wang

Modern multi-object tracking (MOT) systems usually model the trajectories by associating per-frame detections. However, when camera motion, fast motion, and occlusion challenges occur, it is difficult to ensure long-range tracking or even…

Computer Vision and Pattern Recognition · Computer Science 2020-09-21 Shoudong Han, Piao Huang, Hongwei Wang, En Yu, Donghaisheng Liu, Xiaofeng Pan, Jun Zhao

Moving objects have special importance for Autonomous Driving tasks. Detecting moving objects can be posed as Moving Object Segmentation, by segmenting the object pixels, or Moving Object Detection, by generating a bounding box for the…

Computer Vision and Pattern Recognition · Computer Science 2021-06-23 Eslam Mohamed, Ahmed El-Sallab

Connected autonomous vehicles (CAVs) must simultaneously perform multiple tasks, such as object detection, semantic segmentation, depth estimation, trajectory prediction, motion prediction, and behaviour prediction, to ensure safe and…

Robotics · Computer Science 2025-08-07 Jiayuan Wang, Farhad Pourpanah, Q. M. Jonathan Wu, Ning Zhang

Most existing multimodal trackers adopt uniform fusion strategies, overlooking the inherent differences between modalities. Moreover, they propagate temporal information through mixed tokens, leading to entangled and less discriminative…

Computer Vision and Pattern Recognition · Computer Science 2026-03-11 Shilei Wang, Pujian Lai, Dong Gao, Jifeng Ning, Gong Cheng

Existing online multiple object tracking (MOT) algorithms often consist of two subtasks, detection and re-identification (ReID). In order to enhance the inference speed and reduce the complexity, current methods commonly integrate these…

Computer Vision and Pattern Recognition · Computer Science 2021-05-11 En Yu, Zhuoling Li, Shoudong Han, Hongwei Wang

This paper introduces a novel Multi-Agent Cooperative Learning (MACL) framework to address cross-modal alignment collapse in vision-language models when handling out-of-distribution (OOD) concepts. Four core agents, including image, text,…

Multiagent Systems · Computer Science 2026-04-08 Philip Xu

Multiple-object tracking (MOT) is a challenging task that requires simultaneous reasoning about location, appearance, and identity of the objects in the scene over time. Our aim in this paper is to move beyond tracking-by-detection…

Computer Vision and Pattern Recognition · Computer Science 2022-10-27 Bruno Korbar, Andrew Zisserman

Traditional multiple object tracking methods divide the task into two parts: affinity learning and data association. The separation of the task requires to define a hand-crafted training goal in affinity learning stage and a hand-crafted…

Computer Vision and Pattern Recognition · Computer Science 2018-08-07 Han Shen, Lichao Huang, Chang Huang, Wei Xu

Modern online multiple object tracking (MOT) methods usually focus on two directions to improve tracking performance. One is to predict new positions in an incoming frame based on tracking information from previous frames, and the other is…

Computer Vision and Pattern Recognition · Computer Science 2021-04-02 Song Guo, Jingya Wang, Xinchao Wang, Dacheng Tao

Vision is well-known for its use in manipulation, especially using visual servoing. Due to the 3D nature of the world, using multiple camera views and merging them creates better representations for Q-learning and in turn, trains more…

Machine Learning · Computer Science 2025-09-01 Abdulaziz Almuzairee, Rohan Patil, Dwait Bhatt, Henrik I. Christensen

Robust multi-object tracking (MOT) is a prerequisite for a safe deployment of self-driving cars. Tracking objects, however, remains a highly challenging problem, especially in cluttered autonomous driving scenes in which objects tend to…

Computer Vision and Pattern Recognition · Computer Science 2020-08-20 Wei-Chih Hung, Henrik Kretzschmar, Tsung-Yi Lin, Yuning Chai, Ruichi Yu, Ming-Hsuan Yang, Dragomir Anguelov

Online multi-object tracking (MOT) is extremely important for high-level spatial reasoning and path planning for autonomous and highly-automated vehicles. In this paper, we present a modular framework for tracking multiple objects…

Computer Vision and Pattern Recognition · Computer Science 2019-02-20 Akshay Rangesh, Mohan M. Trivedi

Recent efforts on training visual navigation agents conditioned on language using deep reinforcement learning have been successful in learning policies for different multimodal tasks, such as semantic goal navigation and embodied question…

Machine Learning · Computer Science 2019-02-05 Devendra Singh Chaplot, Lisa Lee, Ruslan Salakhutdinov, Devi Parikh, Dhruv Batra

Multi-object tracking (MOT) has profound applications in a variety of fields, including surveillance, sports analytics, self-driving, and cooperative robotics. Despite considerable advancements, existing MOT methodologies tend to falter…

Computer Vision and Pattern Recognition · Computer Science 2023-12-20 Hamza Mukhtar, Muhammad Usman Ghani Khan

Chain-of-Thought (CoT) prompting has proven highly effective for enhancing complex reasoning in Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs). Yet, it struggles in complex spatial reasoning tasks. Nonetheless,…

Computation and Language · Computer Science 2025-01-14 Chengzu Li, Wenshan Wu, Huanyu Zhang, Yan Xia, Shaoguang Mao, Li Dong, Ivan Vulić, Furu Wei

We present an efficient task and motion replanning approach for sequential multi-object manipulation in dynamic environments. Conventional Task And Motion Planning (TAMP) solvers experience an exponential increase in planning time as the…

Robotics · Computer Science 2026-05-20 Yan Zhang, Teng Xue, Amirreza Razmjoo, Sylvain Calinon

Cross-modal transfer learning is used to improve multi-modal classification models (e.g., for human activity recognition in human-robot collaboration). However, existing methods require paired sensor data at both training and inference,…

Machine Learning · Computer Science 2025-09-15 Leen Daher, Zhaobo Wang, Malcolm Mielle

Many vision-related tasks benefit from reasoning over multiple modalities to leverage complementary views of data in an attempt to learn robust embedding spaces. Most deep learning-based methods rely on a late fusion technique whereby…

Computer Vision and Pattern Recognition · Computer Science 2020-03-04 Austin Reiter, Menglin Jia, Pu Yang, Ser-Nam Lim