Related papers: Learning Spatial-Semantic Features for Robust Vide…

Discriminative Spatial-Semantic VOS Solution: 1st Place Solution for 6th LSVOS

Video object segmentation (VOS) is a crucial task in computer vision, but current VOS methods struggle with complex scenes and prolonged object motions. To address these challenges, the MOSE dataset aims to enhance object recognition and…

Computer Vision and Pattern Recognition · Computer Science 2024-08-30 Deshui Miao, Yameng Gu, Xin Li, Zhenyu He, Yaowei Wang, Ming-Hsuan Yang

Spatial-Temporal Multi-level Association for Video Object Segmentation

Existing semi-supervised video object segmentation methods either focus on temporal feature matching or spatial-temporal feature modeling. However, they do not address the issues of sufficient target interaction and efficient parallel…

Computer Vision and Pattern Recognition · Computer Science 2024-04-10 Deshui Miao, Xin Li, Zhenyu He, Huchuan Lu, Ming-Hsuan Yang

Submodular video object proposal selection for semantic object segmentation

Learning a data-driven spatio-temporal semantic representation of the objects is the key to coherent and consistent labelling in video. This paper proposes to achieve semantic video object segmentation by learning a data-driven…

Computer Vision and Pattern Recognition · Computer Science 2024-07-09 Tinghuai Wang

YouTube-VOS: Sequence-to-Sequence Video Object Segmentation

Learning long-term spatial-temporal features is critical for many video analysis tasks. However, existing video segmentation methods predominantly rely on static image segmentation techniques, and methods capturing temporal dependency for…

Computer Vision and Pattern Recognition · Computer Science 2018-09-05 Ning Xu, Linjie Yang, Yuchen Fan, Jianchao Yang, Dingcheng Yue, Yuchen Liang, Brian Price, Scott Cohen, Thomas Huang

Training-Free Spatio-temporal Decoupled Reasoning Video Segmentation with Adaptive Object Memory

Reasoning Video Object Segmentation (ReasonVOS) is a challenging task that requires stable object segmentation across video sequences using implicit and complex textual inputs. Previous methods fine-tune Multimodal Large Language Models…

Computer Vision and Pattern Recognition · Computer Science 2026-03-03 Zhengtong Zhu, Jiaqing Fan, Zhixuan Liu, Fanzhang Li

Video Object Segmentation using Tracked Object Proposals

We present an approach to semi-supervised video object segmentation, in the context of the DAVIS 2017 challenge. Our approach combines category-based object detection, category-independent object appearance segmentation and temporal object…

Computer Vision and Pattern Recognition · Computer Science 2017-07-21 Gilad Sharir, Eddie Smolyansky, Itamar Friedman

Self-supervised Amodal Video Object Segmentation

Amodal perception requires inferring the full shape of an object that is partially occluded. This task is particularly challenging on two levels: (1) it requires more information than what is contained in the instant retina or imaging…

Computer Vision and Pattern Recognition · Computer Science 2022-10-25 Jian Yao, Yuxin Hong, Chiyu Wang, Tianjun Xiao, Tong He, Francesco Locatello, David Wipf, Yanwei Fu, Zheng Zhang

Generating Masks from Boxes by Mining Spatio-Temporal Consistencies in Videos

Segmenting objects in videos is a fundamental computer vision task. The current deep learning based paradigm offers a powerful, but data-hungry solution. However, current datasets are limited by the cost and human effort of annotating…

Computer Vision and Pattern Recognition · Computer Science 2021-01-07 Bin Zhao, Goutam Bhat, Martin Danelljan, Luc Van Gool, Radu Timofte

Self-supervised Video Object Segmentation

The objective of this paper is self-supervised representation learning, with the goal of solving semi-supervised video object segmentation (a.k.a. dense tracking). We make the following contributions: (i) we propose to improve the existing…

Computer Vision and Pattern Recognition · Computer Science 2020-06-23 Fangrui Zhu, Li Zhang, Yanwei Fu, Guodong Guo, Weidi Xie

YouTube-VOS: A Large-Scale Video Object Segmentation Benchmark

Learning long-term spatial-temporal features is critical for many video analysis tasks. However, existing video segmentation methods predominantly rely on static image segmentation techniques, and methods capturing temporal dependency for…

Computer Vision and Pattern Recognition · Computer Science 2018-09-11 Ning Xu, Linjie Yang, Yuchen Fan, Dingcheng Yue, Yuchen Liang, Jianchao Yang, Thomas Huang

Video Object Segmentation using Space-Time Memory Networks

We propose a novel solution for semi-supervised video object segmentation. By the nature of the problem, available cues (e.g. video frame(s) with object masks) become richer with the intermediate predictions. However, the existing methods…

Computer Vision and Pattern Recognition · Computer Science 2019-08-13 Seoung Wug Oh, Joon-Young Lee, Ning Xu, Seon Joo Kim

Learning Fast and Robust Target Models for Video Object Segmentation

Video object segmentation (VOS) is a highly challenging problem since the initial mask, defining the target object, is only given at test-time. The main difficulty is to effectively handle appearance changes and similar background objects,…

Computer Vision and Pattern Recognition · Computer Science 2020-04-01 Andreas Robinson, Felix Järemo Lawin, Martin Danelljan, Fahad Shahbaz Khan, Michael Felsberg

Self-supervised Object-Centric Learning for Videos

Unsupervised multi-object segmentation has shown impressive results on images by utilizing powerful semantics learned from self-supervised pretraining. An additional modality such as depth or motion is often used to facilitate the…

Computer Vision and Pattern Recognition · Computer Science 2023-10-12 Görkay Aydemir, Weidi Xie, Fatma Güney

Video Object Segmentation with Joint Re-identification and Attention-Aware Mask Propagation

The problem of video object segmentation can become extremely challenging when multiple instances co-exist. While each instance may exhibit large scale and pose variations, the problem is compounded when instances occlude each other causing…

Computer Vision and Pattern Recognition · Computer Science 2018-03-15 Xiaoxiao Li, Chen Change Loy

Video Object Segmentation with Dynamic Query Modulation

Storing intermediate frame segmentations as memory for long-range context modeling, spatial-temporal memory-based methods have recently showcased impressive results in semi-supervised video object segmentation (SVOS). However, these methods…

Computer Vision and Pattern Recognition · Computer Science 2024-03-19 Hantao Zhou, Runze Hu, Xiu Li

Learning Spatio-Appearance Memory Network for High-Performance Visual Tracking

Existing visual object tracking usually learns a bounding-box based template to match the targets across frames, which cannot accurately learn a pixel-wise representation, thereby being limited in handling severe appearance variations. To…

Computer Vision and Pattern Recognition · Computer Science 2021-04-07 Fei Xie, Wankou Yang, Bo Liu, Kaihua Zhang, Wanli Xue, Wangmeng Zuo

STEm-Seg: Spatio-temporal Embeddings for Instance Segmentation in Videos

Existing methods for instance segmentation in videos typically involve multi-stage pipelines that follow the tracking-by-detection paradigm and model a video clip as a sequence of images. Multiple networks are used to detect objects in…

Computer Vision and Pattern Recognition · Computer Science 2023-09-04 Ali Athar, Sabarinath Mahadevan, Aljoša Ošep, Laura Leal-Taixé, Bastian Leibe

Fast video object segmentation with Spatio-Temporal GANs

Learning descriptive spatio-temporal object models from data is paramount for the task of semi-supervised video object segmentation. Most existing approaches mainly rely on models that estimate the segmentation mask based on a reference…

Computer Vision and Pattern Recognition · Computer Science 2019-03-29 Sergi Caelles, Albert Pumarola, Francesc Moreno-Noguer, Alberto Sanfeliu, Luc Van Gool

Towards Robust Video Object Segmentation with Adaptive Object Calibration

In the booming video era, video segmentation attracts increasing research attention in the multimedia community. Semi-supervised video object segmentation (VOS) aims at segmenting objects in all target frames of a video, given annotated…

Computer Vision and Pattern Recognition · Computer Science 2022-07-05 Xiaohao Xu, Jinglu Wang, Xiang Ming, Yan Lu

Evaluating SAM2 for Video Semantic Segmentation

The Segment Anything Model 2 (SAM2) has proven to be a powerful foundation model for promptable visual object segmentation in both images and videos, capable of storing object-aware memories and transferring them temporally through…

Computer Vision and Pattern Recognition · Computer Science 2026-01-30 Syed Hesham Syed Ariff, Yun Liu, Guolei Sun, Jing Yang, Henghui Ding, Xue Geng, Xudong Jiang