Related papers: MultiScope: Efficient Video Pre-processing for Exp…

NoScope: Optimizing Neural Network Queries over Video at Scale

Recent advances in computer vision, in the form of deep neural networks, have made it possible to query increasing volumes of video data with high accuracy. However, neural network inference is computationally expensive at scale: applying a…

Databases · Computer Science 2017-08-10 Daniel Kang, John Emmons, Firas Abuzaid, Peter Bailis, Matei Zaharia

AdaScale: Towards Real-time Video Object Detection Using Adaptive Scaling

In vision-enabled autonomous systems such as robots and autonomous cars, video object detection plays a crucial role, and both its speed and accuracy are important factors to provide reliable operation. The key insight we show in this paper…

Computer Vision and Pattern Recognition · Computer Science 2019-02-11 Ting-Wu Chin, Ruizhou Ding, Diana Marculescu

Scanner: Efficient Video Analysis at Scale

A growing number of visual computing applications depend on the analysis of large video collections. The challenge is that scaling applications to operate on these datasets requires efficient systems for pixel data access and parallel…

Computer Vision and Pattern Recognition · Computer Science 2018-05-21 Alex Poms, Will Crichton, Pat Hanrahan, Kayvon Fatahalian

A Survey of Performance Optimization in Neural Network-Based Video Analytics Systems

Video analytics systems automatically recognize events, movements, and actions in a video and make it possible to execute queries on the video. Because a large amount of video data needs to be processed, optimizing the…

Computer Vision and Pattern Recognition · Computer Science 2021-06-01 Nada Ibrahim, Preeti Maurya, Omid Jafari, Parth Nagarkar

DriveScape: Towards High-Resolution Controllable Multi-View Driving Video Generation

Recent advancements in generative models have provided promising solutions for synthesizing realistic driving videos, which are crucial for training autonomous driving perception models. However, existing approaches often struggle with…

Computer Vision and Pattern Recognition · Computer Science 2024-09-13 Wei Wu, Xi Guo, Weixuan Tang, Tingxuan Huang, Chiyu Wang, Dongyue Chen, Chenjing Ding

Towards High Performance Video Object Detection

There has been significant progress in image object detection in recent years. Nevertheless, video object detection has received little attention, although it is more challenging and more important in practical scenarios. Built upon the…

Computer Vision and Pattern Recognition · Computer Science 2017-12-01 Xizhou Zhu, Jifeng Dai, Lu Yuan, Yichen Wei

Large-Scale Video Analytics through Object-Level Consolidation

As the number of installed cameras grows, so do the compute resources required to process and analyze all the images captured by these cameras. Video analytics enables new use cases, such as smart cities or autonomous driving. At the same…

Computer Vision and Pattern Recognition · Computer Science 2021-12-01 Daniel Rivas, Francesc Guim, Jordà Polo, David Carrera

MedScope: Incentivizing "Think with Videos" for Clinical Reasoning via Coarse-to-Fine Tool Calling

Long-form clinical videos are central to visual evidence-based decision-making, with growing importance for applications such as surgical robotics and related settings. However, current multimodal large language models typically process…

Computer Vision and Pattern Recognition · Computer Science 2026-02-17 Wenjie Li, Yujie Zhang, Haoran Sun, Xingqi He, Hongcheng Gao, Chenglong Ma, Ming Hu, Guankun Wang, Shiyi Yao, Renhao Yang, Hongliang Ren, Lei Wang, Junjun He, Yankai Jiang

Attend and Interact: Higher-Order Object Interactions for Video Understanding

Human actions often involve complex interactions across several inter-related objects in the scene. However, existing approaches to fine-grained video understanding or visual relationship detection often rely on single object representation…

Computer Vision and Pattern Recognition · Computer Science 2018-03-22 Chih-Yao Ma, Asim Kadav, Iain Melvin, Zsolt Kira, Ghassan AlRegib, Hans Peter Graf

MultiCounter: Multiple Action Agnostic Repetition Counting in Untrimmed Videos

Multi-instance Repetitive Action Counting (MRAC) aims to estimate the number of repetitive actions performed by multiple instances in untrimmed videos, commonly found in human-centric domains like sports and exercise. In this paper, we…

Computer Vision and Pattern Recognition · Computer Science 2024-09-09 Yin Tang, Wei Luo, Jinrui Zhang, Wei Huang, Ruihai Jing, Deyu Zhang

Video Monitoring Queries

Recent advances in video processing utilizing deep learning primitives achieved breakthroughs in fundamental problems in video analysis such as frame classification and object detection enabling an array of new applications. In this paper…

Databases · Computer Science 2020-02-26 Nick Koudas, Raymond Li, Ioannis Xarchakos

MotionScape: A Large-Scale Real-World Highly Dynamic UAV Video Dataset for World Models

Recent advances in world models have demonstrated strong capabilities in simulating physical reality, making them an increasingly important foundation for embodied intelligence. For UAV agents in particular, accurate prediction of complex…

Computer Vision and Pattern Recognition · Computer Science 2026-04-10 Zile Guo, Zhan Chen, Enze Zhu, Kan Wei, Yongkang Zou, Xiaoxuan Liu, Lei Wang

Efficient Unsupervised Video Object Segmentation Network Based on Motion Guidance

Performance constraints limit the large-scale application of unsupervised video object detection. In response, we propose another method to address this limitation. By…

Computer Vision and Pattern Recognition · Computer Science 2022-11-22 Chao Hu, Liqiang Zhu

Multiscale Video Pretraining for Long-Term Activity Forecasting

Long-term activity forecasting is an especially challenging research problem because it requires understanding the temporal relationships between observed actions, as well as the variability and complexity of human activities. Despite…

Computer Vision and Pattern Recognition · Computer Science 2023-07-25 Reuben Tan, Matthias De Lange, Michael Iuzzolino, Bryan A. Plummer, Kate Saenko, Karl Ridgeway, Lorenzo Torresani

Optimizing Video Object Detection via a Scale-Time Lattice

High-performance object detection relies on expensive convolutional networks to compute features, often leading to significant challenges in applications, e.g. those that require detecting objects from video streams in real time. The key to…

Computer Vision and Pattern Recognition · Computer Science 2018-04-17 Kai Chen, Jiaqi Wang, Shuo Yang, Xingcheng Zhang, Yuanjun Xiong, Chen Change Loy, Dahua Lin

A Large-scale Distributed Video Parsing and Evaluation Platform

Visual surveillance systems have become one of the largest sources of Big Visual Data in the real world. However, existing systems for video analysis still lack the ability to handle the problems of scalability, expansibility and…

Computer Vision and Pattern Recognition · Computer Science 2016-11-30 Kai Yu, Yang Zhou, Da Li, Zhang Zhang, Kaiqi Huang

Multiple Object Tracking with Motion and Appearance Cues

Due to better video quality and higher frame rates, the performance of multiple object tracking has greatly improved in recent years. However, in real application scenarios, camera motion and noisy per-frame detection results…

Computer Vision and Pattern Recognition · Computer Science 2019-09-04 Weiqiang Li, Jiatong Mu, Guizhong Liu

Multiple Object Tracking as ID Prediction

Multi-Object Tracking (MOT) has been a long-standing challenge in video understanding. A natural and intuitive approach is to split this task into two parts: object detection and association. Most mainstream methods employ meticulously…

Computer Vision and Pattern Recognition · Computer Science 2025-03-25 Ruopeng Gao, Ji Qi, Limin Wang

Maximum Likelihood Speed Estimation of Moving Objects in Video Signals

Video processing solutions for motion analysis are key tasks in many computer vision applications, ranging from human activity recognition to object detection. In particular, speed estimation algorithms may be relevant in contexts such as…

Image and Video Processing · Electrical Eng. & Systems 2022-11-29 Veronica Mattioli, Davide Alinovi, Riccardo Raheli

Detect to Track and Track to Detect

Recent approaches for high accuracy detection and tracking of object categories in video consist of complex multistage solutions that become more cumbersome each year. In this paper we propose a ConvNet architecture that jointly performs…

Computer Vision and Pattern Recognition · Computer Science 2018-03-08 Christoph Feichtenhofer, Axel Pinz, Andrew Zisserman