Related papers: CompFeat: Comprehensive Feature Aggregation for Vi…
Compared with still image object detection, video object detection (VOD) needs to particularly concern the high across-frame variation in object appearance, and the diverse deterioration in some frames. In principle, the detection in a…
Extending state-of-the-art object detectors from image to video is challenging. The accuracy of detection suffers from degenerated object appearances in videos, e.g., motion blur, video defocus, rare poses, etc. Existing work attempts to…
Segmenting foreground object from a video is a challenging task because of the large deformations of the objects, occlusions, and background clutter. In this paper, we propose a frame-by-frame but computationally efficient approach for…
Visual objects often have acoustic signatures that are naturally synchronized with them in audio-bearing video recordings. For this project, we explore the multimodal feature aggregation for video instance segmentation task, in which we…
The goal of video segmentation is to turn video data into a set of concrete motion clusters that can be easily interpreted as building blocks of the video. There are some works on similar topics like detecting scene cuts in a video, but…
Recent cutting-edge feature aggregation paradigms for video object detection rely on inferring feature correspondence. The feature correspondence estimation problem is fundamentally difficult due to poor image quality, motion blur, etc, and…
In this paper, we introduce the Context-Aware Video Instance Segmentation (CAVIS), a novel framework designed to enhance instance association by integrating contextual information adjacent to each object. To efficiently extract and leverage…
Existing methods for instance segmentation in videos typically involve multi-stage pipelines that follow the tracking-by-detection paradigm and model a video clip as a sequence of images. Multiple networks are used to detect objects in…
Inspired by recent advances of deep learning in instance segmentation and object tracking, we introduce video object segmentation problem as a concept of guided instance segmentation. Our model proceeds on a per-frame basis, guided by the…
The goal of this paper is to discover, segment, and track independently moving objects in complex visual scenes. Previous approaches have explored the use of optical flow for motion segmentation, leading to imperfect predictions due to…
Video object detection is a fundamental problem in computer vision and has a wide spectrum of applications. Based on deep networks, video object detection is actively studied for pushing the limits of detection speed and accuracy. To reduce…
Video instance segmentation (VIS) aims to segment and associate all instances of predefined classes for each frame in videos. Prior methods usually obtain segmentation for a frame or clip first, and merge the incomplete results by tracking…
Video instance segmentation aims at predicting object segmentation masks for each frame, as well as associating the instances across multiple frames. Recent end-to-end video instance segmentation methods are capable of performing object…
This paper aims to learn a compact representation of a video for video face recognition task. We make the following contributions: first, we propose a meta attention-based aggregation scheme which adaptively and fine-grained weighs the…
Video instance segmentation, also known as multi-object tracking and segmentation, is an emerging computer vision research area introduced in 2019, aiming at detecting, segmenting, and tracking instances in videos simultaneously. By…
Video object detection is a fundamental yet challenging task in computer vision. One practical solution is to take advantage of temporal information from the video and apply feature aggregation to enhance the object features in each frame.…
Video objection detection (VID) has been a rising research direction in recent years. A central issue of VID is the appearance degradation of video frames caused by fast motion. This problem is essentially ill-posed for a single frame.…
Video co-segmentation refers to the task of jointly segmenting common objects appearing in a given group of videos. In practice, high-dimensional data such as videos can be conceptually thought as being drawn from a union of subspaces…
The advancement of computer vision has pushed visual analysis tasks from still images to the video domain. In recent years, video instance segmentation, which aims to track and segment multiple objects in video frames, has drawn much…
Video Instance Segmentation is a fundamental computer vision task that deals with segmenting and tracking object instances across a video sequence. Most existing methods typically accomplish this task by employing a multi-stage top-down…