Related papers: DeVIS: Making Deformable Transformers Work for Vid…

Deformable VisTR: Spatio temporal deformable attention for video instance segmentation

The video instance segmentation (VIS) task requires classifying, segmenting, and tracking object instances over all frames in a video clip. Recently, VisTR has been proposed as an end-to-end transformer-based VIS framework, while demonstrating…

Computer Vision and Pattern Recognition · Computer Science 2022-03-15 Sudhir Yarram, Jialian Wu, Pan Ji, Yi Xu, Junsong Yuan

DVIS: Decoupled Video Instance Segmentation Framework

Video instance segmentation (VIS) is a critical task with diverse applications, including autonomous driving and video editing. Existing methods often underperform on complex and long videos in real world, primarily due to two factors.…

Computer Vision and Pattern Recognition · Computer Science 2023-07-17 Tao Zhang, Xingye Tian, Yu Wu, Shunping Ji, Xuebo Wang, Yuan Zhang, Pengfei Wan

End-to-End Video Instance Segmentation with Transformers

Video instance segmentation (VIS) is the task that requires simultaneously classifying, segmenting and tracking object instances of interest in video. Recent methods typically develop sophisticated pipelines to tackle this task. Here, we…

Computer Vision and Pattern Recognition · Computer Science 2021-10-11 Yuqing Wang, Zhaoliang Xu, Xinlong Wang, Chunhua Shen, Baoshan Cheng, Hao Shen, Huaxia Xia

DeVOS: Flow-Guided Deformable Transformer for Video Object Segmentation

The recent works on Video Object Segmentation achieved remarkable results by matching dense semantic and instance-level features between the current and previous frames for long-time propagation. Nevertheless, global feature matching…

Computer Vision and Pattern Recognition · Computer Science 2024-05-15 Volodymyr Fedynyak, Yaroslav Romanus, Bohdan Hlovatskyi, Bohdan Sydor, Oles Dobosevych, Igor Babin, Roman Riazantsev

Video Instance Segmentation via Multi-scale Spatio-temporal Split Attention Transformer

State-of-the-art transformer-based video instance segmentation (VIS) approaches typically utilize either single-scale spatio-temporal features or per-frame multi-scale features during the attention computations. We argue that such an…

Computer Vision and Pattern Recognition · Computer Science 2022-03-25 Omkar Thawakar, Sanath Narayan, Jiale Cao, Hisham Cholakkal, Rao Muhammad Anwer, Muhammad Haris Khan, Salman Khan, Michael Felsberg, Fahad Shahbaz Khan

1st Place Solution for YouTubeVOS Challenge 2021: Video Instance Segmentation

Video Instance Segmentation (VIS) is a multi-task problem performing detection, segmentation, and tracking simultaneously. Extended from image set applications, video data additionally induces the temporal information, which, if handled…

Computer Vision and Pattern Recognition · Computer Science 2021-07-12 Thuy C. Nguyen, Tuan N. Tang, Nam LH. Phan, Chuong H. Nguyen, Masayuki Yamazaki, Masao Yamanaka

Self-supervised Video Object Segmentation with Distillation Learning of Deformable Attention

Video object segmentation is a fundamental research problem in computer vision. Recent techniques have often applied attention mechanism to object representation learning from video sequences. However, due to temporal changes in the video…

Computer Vision and Pattern Recognition · Computer Science 2024-03-19 Quang-Trung Truong, Duc Thanh Nguyen, Binh-Son Hua, Sai-Kit Yeung

InstanceFormer: An Online Video Instance Segmentation Framework

Recent transformer-based offline video instance segmentation (VIS) approaches achieve encouraging results and significantly outperform online approaches. However, their reliance on the whole video and the immense computational complexity…

Computer Vision and Pattern Recognition · Computer Science 2022-11-28 Rajat Koner, Tanveer Hannan, Suprosanna Shit, Sahand Sharifzadeh, Matthias Schubert, Thomas Seidl, Volker Tresp

DVIS++: Improved Decoupled Framework for Universal Video Segmentation

We present the Decoupled VIdeo Segmentation (DVIS) framework, a novel approach for the challenging task of universal video segmentation, including video instance segmentation (VIS), video semantic segmentation…

Computer Vision and Pattern Recognition · Computer Science 2023-12-22 Tao Zhang, Xingye Tian, Yikang Zhou, Shunping Ji, Xuebo Wang, Xin Tao, Yuan Zhang, Pengfei Wan, Zhongyuan Wang, Yu Wu

RefineVIS: Video Instance Segmentation with Temporal Attention Refinement

We introduce a novel framework called RefineVIS for Video Instance Segmentation (VIS) that achieves good object association between frames and accurate segmentation masks by iteratively refining the representations using sequence context.…

Computer Vision and Pattern Recognition · Computer Science 2023-06-09 Andre Abrantes, Jiang Wang, Peng Chu, Quanzeng You, Zicheng Liu

STC: Spatio-Temporal Contrastive Learning for Video Instance Segmentation

Video Instance Segmentation (VIS) is a task that simultaneously requires classification, segmentation, and instance association in a video. Recent VIS approaches rely on sophisticated pipelines to achieve this goal, including RoI-related…

Computer Vision and Pattern Recognition · Computer Science 2022-08-23 Zhengkai Jiang, Zhangxuan Gu, Jinlong Peng, Hang Zhou, Liang Liu, Yabiao Wang, Ying Tai, Chengjie Wang, Liqing Zhang

UVIS: Unsupervised Video Instance Segmentation

Video instance segmentation requires classifying, segmenting, and tracking every object across video frames. Unlike existing approaches that rely on masks, boxes, or category labels, we propose UVIS, a novel Unsupervised Video Instance…

Computer Vision and Pattern Recognition · Computer Science 2024-06-12 Shuaiyi Huang, Saksham Suri, Kamal Gupta, Sai Saketh Rambhatla, Ser-nam Lim, Abhinav Shrivastava

Efficient Video Instance Segmentation via Tracklet Query and Proposal

Video Instance Segmentation (VIS) aims to simultaneously classify, segment, and track multiple object instances in videos. Recent clip-level VIS takes a short video clip as input each time showing stronger performance than frame-level VIS…

Computer Vision and Pattern Recognition · Computer Science 2022-03-04 Jialian Wu, Sudhir Yarram, Hui Liang, Tian Lan, Junsong Yuan, Jayan Eledath, Gerard Medioni

Video Object of Interest Segmentation

In this work, we present a new computer vision task named video object of interest segmentation (VOIS). Given a video and a target image of interest, our objective is to simultaneously segment and track all objects in the video that are…

Computer Vision and Pattern Recognition · Computer Science 2022-12-07 Siyuan Zhou, Chunru Zhan, Biao Wang, Tiezheng Ge, Yuning Jiang, Li Niu

Online Video Instance Segmentation via Robust Context Fusion

Video instance segmentation (VIS) aims at classifying, segmenting and tracking object instances in video sequences. Recent transformer-based neural networks have demonstrated their powerful capability of modeling spatio-temporal…

Computer Vision and Pattern Recognition · Computer Science 2022-07-13 Xiang Li, Jinglu Wang, Xiaohao Xu, Bhiksha Raj, Yan Lu

Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation

In this paper, we propose a simple yet effective approach for self-supervised video object segmentation (VOS). Our key insight is that the inherent structural dependencies present in DINO-pretrained Transformers can be leveraged to…

Computer Vision and Pattern Recognition · Computer Science 2024-07-09 Shuangrui Ding, Rui Qian, Haohang Xu, Dahua Lin, Hongkai Xiong

Robust Online Video Instance Segmentation with Track Queries

Recently, transformer-based methods have achieved impressive results on Video Instance Segmentation (VIS). However, most of these top-performing methods run in an offline manner by processing the entire video clip at once to predict…

Computer Vision and Pattern Recognition · Computer Science 2022-11-17 Zitong Zhan, Daniel McKee, Svetlana Lazebnik

1st Place Solution for the 5th LSVOS Challenge: Video Instance Segmentation

Video instance segmentation is a challenging task that serves as the cornerstone of numerous downstream applications, including video editing and autonomous driving. In this report, we present further improvements to the SOTA VIS method,…

Computer Vision and Pattern Recognition · Computer Science 2023-08-29 Tao Zhang, Xingye Tian, Yikang Zhou, Yu Wu, Shunping Ji, Cilin Yan, Xuebo Wang, Xin Tao, Yuan Zhang, Pengfei Wan

Tag-Based Attention Guided Bottom-Up Approach for Video Instance Segmentation

Video Instance Segmentation is a fundamental computer vision task that deals with segmenting and tracking object instances across a video sequence. Most existing methods typically accomplish this task by employing a multi-stage top-down…

Computer Vision and Pattern Recognition · Computer Science 2022-04-25 Jyoti Kini, Mubarak Shah

A2VIS: Amodal-Aware Approach to Video Instance Segmentation

Handling occlusion remains a significant challenge for video instance-level tasks like Multiple Object Tracking (MOT) and Video Instance Segmentation (VIS). In this paper, we propose a novel framework, Amodal-Aware Video Instance…

Computer Vision and Pattern Recognition · Computer Science 2025-04-11 Minh Tran, Thang Pham, Winston Bounsavy, Tri Nguyen, Ngan Le