Related papers: DFA: Dynamic Feature Aggregation for Efficient Vid…

Practical Video Object Detection via Feature Selection and Aggregation

Compared with still image object detection, video object detection (VOD) needs to particularly concern the high across-frame variation in object appearance, and the diverse deterioration in some frames. In principle, the detection in a…

Computer Vision and Pattern Recognition · Computer Science 2024-07-30 Yuheng Shi, Tong Zhang, Xiaojie Guo

FAQ: Feature Aggregated Queries for Transformer-based Video Object Detectors

Video object detection needs to solve feature degradation situations that rarely happen in the image domain. One solution is to use the temporal information and fuse the features from the neighboring frames. With Transformer-based object…

Computer Vision and Pattern Recognition · Computer Science 2023-03-21 Yiming Cui, Linjie Yang

Object-aware Feature Aggregation for Video Object Detection

We present an Object-aware Feature Aggregation (OFA) module for video object detection (VID). Our approach is motivated by the intriguing property that video-level object-aware knowledge can be employed as a powerful semantic prior to help…

Computer Vision and Pattern Recognition · Computer Science 2020-10-26 Qichuan Geng, Hong Zhang, Na Jiang, Xiaojuan Qi, Liangjun Zhang, Zhong Zhou

Sequence Level Semantics Aggregation for Video Object Detection

Video object detection (VID) has been a rising research direction in recent years. A central issue of VID is the appearance degradation of video frames caused by fast motion. This problem is essentially ill-posed for a single frame.…

Computer Vision and Pattern Recognition · Computer Science 2019-08-21 Haiping Wu, Yuntao Chen, Naiyan Wang, Zhaoxiang Zhang

Flow-Guided Feature Aggregation for Video Object Detection

Extending state-of-the-art object detectors from image to video is challenging. The accuracy of detection suffers from degenerated object appearances in videos, e.g., motion blur, video defocus, rare poses, etc. Existing work attempts to…

Computer Vision and Pattern Recognition · Computer Science 2017-08-21 Xizhou Zhu, Yujie Wang, Jifeng Dai, Lu Yuan, Yichen Wei

Object Detection in Video with Spatial-temporal Context Aggregation

Recent cutting-edge feature aggregation paradigms for video object detection rely on inferring feature correspondence. The feature correspondence estimation problem is fundamentally difficult due to poor image quality, motion blur, etc., and…

Computer Vision and Pattern Recognition · Computer Science 2019-07-12 Hao Luo, Lichao Huang, Han Shen, Yuan Li, Chang Huang, Xinggang Wang

Real-Time and Accurate Object Detection in Compressed Video by Long Short-term Feature Aggregation

Video object detection is a fundamental problem in computer vision and has a wide spectrum of applications. Based on deep networks, video object detection is actively studied for pushing the limits of detection speed and accuracy. To reduce…

Computer Vision and Pattern Recognition · Computer Science 2021-03-29 Xinggang Wang, Zhaojin Huang, Bencheng Liao, Lichao Huang, Yongchao Gong, Chang Huang

SSGA-Net: Stepwise Spatial Global-local Aggregation Networks for Autonomous Driving

Visual-based perception is the key module for autonomous driving. Among those visual perception tasks, video object detection is a primary yet challenging one because of feature degradation caused by fast motion or multiple poses. Current…

Computer Vision and Pattern Recognition · Computer Science 2024-05-30 Yiming Cui, Cheng Han, Dongfang Liu

Multi-frame Feature Aggregation for Real-time Instrument Segmentation in Endoscopic Video

Deep learning-based methods have achieved promising results on surgical instrument segmentation. However, the high computation cost may limit the application of deep models to time-sensitive tasks such as online surgical video analysis for…

Computer Vision and Pattern Recognition · Computer Science 2021-07-27 Shan Lin, Fangbo Qin, Haonan Peng, Randall A. Bly, Kris S. Moe, Blake Hannaford

Few-Shot Object Detection via Variational Feature Aggregation

As few-shot object detectors are often trained with abundant base samples and fine-tuned on few-shot novel examples, the learned models are usually biased to base classes and sensitive to the variance of novel examples. To address this…

Computer Vision and Pattern Recognition · Computer Science 2023-02-01 Jiaming Han, Yuqiang Ren, Jian Ding, Ke Yan, Gui-Song Xia

Deformable Feature Alignment and Refinement for Moving Infrared Dim-small Target Detection

The detection of moving infrared dim-small targets has been a challenging and prevalent research topic. The current state-of-the-art methods are mainly based on ConvLSTM to aggregate information from adjacent frames to facilitate the…

Computer Vision and Pattern Recognition · Computer Science 2024-07-11 Dengyan Luo, Yanping Xiang, Hu Wang, Luping Ji, Shuai Li, Mao Ye

Impression Network for Video Object Detection

Video object detection is more challenging compared to image object detection. Previous works proved that applying object detector frame by frame is not only slow but also inaccurate. Visual clues get weakened by defocus and motion blur,…

Computer Vision and Pattern Recognition · Computer Science 2017-12-19 Congrui Hetang, Hongwei Qin, Shaohui Liu, Junjie Yan

CompFeat: Comprehensive Feature Aggregation for Video Instance Segmentation

Video instance segmentation is a complex task in which we need to detect, segment, and track each object for any given video. Previous approaches only utilize single-frame features for the detection, segmentation, and tracking of objects…

Computer Vision and Pattern Recognition · Computer Science 2020-12-08 Yang Fu, Linjie Yang, Ding Liu, Thomas S. Huang, Humphrey Shi

DiffVQA: Video Quality Assessment Using Diffusion Feature Extractor

Video Quality Assessment (VQA) aims to evaluate video quality based on perceptual distortions and human preferences. Despite the promising performance of existing methods using Convolutional Neural Networks (CNNs) and Vision Transformers…

Computer Vision and Pattern Recognition · Computer Science 2025-05-07 Wei-Ting Chen, Yu-Jiet Vong, Yi-Tsung Lee, Sy-Yen Kuo, Qiang Gao, Sizhuo Ma, Jian Wang

Learning Where to Focus for Efficient Video Object Detection

Transferring existing image-based detectors to the video is non-trivial since the quality of frames is always deteriorated by part occlusion, rare pose, and motion blur. Previous approaches exploit to propagate and aggregate features across…

Computer Vision and Pattern Recognition · Computer Science 2020-07-17 Zhengkai Jiang, Yu Liu, Ceyuan Yang, Jihao Liu, Peng Gao, Qian Zhang, Shiming Xiang, Chunhong Pan

DyFADet: Dynamic Feature Aggregation for Temporal Action Detection

Recent proposed neural network-based Temporal Action Detection (TAD) models are inherently limited to extracting the discriminative representations and modeling action instances with various lengths from complex scenes by shared-weights…

Computer Vision and Pattern Recognition · Computer Science 2024-07-04 Le Yang, Ziwei Zheng, Yizeng Han, Hao Cheng, Shiji Song, Gao Huang, Fan Li

Voxelized 3D Feature Aggregation for Multiview Detection

Multi-view detection incorporates multiple camera views to alleviate occlusion in crowded scenes, where the state-of-the-art approaches adopt homography transformations to project multi-view features to the ground plane. However, we find…

Computer Vision and Pattern Recognition · Computer Science 2023-01-05 Jiahao Ma, Jinguang Tong, Shan Wang, Wei Zhao, Zicheng Duan, Chuong Nguyen

Self-supervised Video Object Segmentation with Distillation Learning of Deformable Attention

Video object segmentation is a fundamental research problem in computer vision. Recent techniques have often applied attention mechanism to object representation learning from video sequences. However, due to temporal changes in the video…

Computer Vision and Pattern Recognition · Computer Science 2024-03-19 Quang-Trung Truong, Duc Thanh Nguyen, Binh-Son Hua, Sai-Kit Yeung

IAFA: Instance-aware Feature Aggregation for 3D Object Detection from a Single Image

3D object detection from a single image is an important task in Autonomous Driving (AD), where various approaches have been proposed. However, the task is intrinsically ambiguous and challenging as single image depth estimation is already…

Computer Vision and Pattern Recognition · Computer Science 2021-03-08 Dingfu Zhou, Xibin Song, Yuchao Dai, Junbo Yin, Feixiang Lu, Jin Fang, Miao Liao, Liangjun Zhang

Self-distilled Feature Aggregation for Self-supervised Monocular Depth Estimation

Self-supervised monocular depth estimation has received much attention recently in computer vision. Most of the existing works in literature aggregate multi-scale features for depth prediction via either straightforward concatenation or…

Computer Vision and Pattern Recognition · Computer Science 2022-09-16 Zhengming Zhou, Qiulei Dong