Related papers: BoxVIS: Video Instance Segmentation with Box Annot…

PM-VIS: High-Performance Box-Supervised Video Instance Segmentation

Labeling pixel-wise object masks in videos is a resource-intensive and laborious process. Box-supervised Video Instance Segmentation (VIS) methods have emerged as a viable solution to mitigate the labor-intensive annotation process. In…

Computer Vision and Pattern Recognition · Computer Science 2024-04-23 Zhangjing Yang , Dun Liu , Wensheng Cheng , Jinqiao Wang , Yi Wu

BoxInst: High-Performance Instance Segmentation with Box Annotations

We present a high-performance method that can achieve mask-level instance segmentation with only bounding-box annotations for training. While this setting has been studied in the literature, here we show significantly stronger performance…

Computer Vision and Pattern Recognition · Computer Science 2020-12-07 Zhi Tian , Chunhua Shen , Xinlong Wang , Hao Chen

PM-VIS+: High-Performance Video Instance Segmentation without Video Annotation

Video instance segmentation requires detecting, segmenting, and tracking objects in videos, typically relying on costly video annotations. This paper introduces a method that eliminates video annotations by utilizing image datasets. The…

Computer Vision and Pattern Recognition · Computer Science 2024-07-01 Zhangjing Yang , Dun Liu , Xin Wang , Zhe Li , Barathwaj Anandan , Yi Wu

MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training

We propose MinVIS, a minimal video instance segmentation (VIS) framework that achieves state-of-the-art VIS performance with neither video-based architectures nor training procedures. By only training a query-based image instance…

Computer Vision and Pattern Recognition · Computer Science 2022-08-04 De-An Huang , Zhiding Yu , Anima Anandkumar

Mask-Free Video Instance Segmentation

The recent advancement in Video Instance Segmentation (VIS) has largely been driven by the use of deeper and increasingly data-hungry transformer-based models. However, video masks are tedious and expensive to annotate, limiting the scale…

Computer Vision and Pattern Recognition · Computer Science 2023-03-29 Lei Ke , Martin Danelljan , Henghui Ding , Yu-Wing Tai , Chi-Keung Tang , Fisher Yu

UVIS: Unsupervised Video Instance Segmentation

Video instance segmentation requires classifying, segmenting, and tracking every object across video frames. Unlike existing approaches that rely on masks, boxes, or category labels, we propose UVIS, a novel Unsupervised Video Instance…

Computer Vision and Pattern Recognition · Computer Science 2024-06-12 Shuaiyi Huang , Saksham Suri , Kamal Gupta , Sai Saketh Rambhatla , Ser-nam Lim , Abhinav Shrivastava

What is Point Supervision Worth in Video Instance Segmentation?

Video instance segmentation (VIS) is a challenging vision task that aims to detect, segment, and track objects in videos. Conventional VIS methods rely on densely-annotated object masks which are expensive. We reduce the human annotations…

Computer Vision and Pattern Recognition · Computer Science 2024-04-03 Shuaiyi Huang , De-An Huang , Zhiding Yu , Shiyi Lan , Subhashree Radhakrishnan , Jose M. Alvarez , Abhinav Shrivastava , Anima Anandkumar

Generating Masks from Boxes by Mining Spatio-Temporal Consistencies in Videos

Segmenting objects in videos is a fundamental computer vision task. The current deep learning based paradigm offers a powerful, but data-hungry solution. However, current datasets are limited by the cost and human effort of annotating…

Computer Vision and Pattern Recognition · Computer Science 2021-01-07 Bin Zhao , Goutam Bhat , Martin Danelljan , Luc Van Gool , Radu Timofte

Weakly Supervised Instance Segmentation for Videos with Temporal Mask Consistency

Weakly supervised instance segmentation reduces the cost of annotations required to train models. However, existing approaches which rely only on image-level class labels predominantly suffer from errors due to (a) partial segmentation of…

Computer Vision and Pattern Recognition · Computer Science 2021-03-25 Qing Liu , Vignesh Ramanathan , Dhruv Mahajan , Alan Yuille , Zhenheng Yang

BoxMask: Revisiting Bounding Box Supervision for Video Object Detection

We present a new, simple yet effective approach to uplift video object detection. We observe that prior works operate on instance-level feature aggregation that imminently neglects the refined pixel-level representation, resulting in…

Computer Vision and Pattern Recognition · Computer Science 2022-10-13 Khurram Azeem Hashmi , Alain Pagani , Didier Stricker , Muhammad Zeshan Afzal

Pointly-Supervised Instance Segmentation

We propose an embarrassingly simple point annotation scheme to collect weak supervision for instance segmentation. In addition to bounding boxes, we collect binary labels for a set of points uniformly sampled inside each bounding box. We…

Computer Vision and Pattern Recognition · Computer Science 2022-06-17 Bowen Cheng , Omkar Parkhi , Alexander Kirillov

Online Video Instance Segmentation via Robust Context Fusion

Video instance segmentation (VIS) aims at classifying, segmenting and tracking object instances in video sequences. Recent transformer-based neural networks have demonstrated their powerful capability of modeling spatio-temporal…

Computer Vision and Pattern Recognition · Computer Science 2022-07-13 Xiang Li , Jinglu Wang , Xiaohao Xu , Bhiksha Raj , Yan Lu

BoxTeacher: Exploring High-Quality Pseudo Labels for Weakly Supervised Instance Segmentation

Labeling objects with pixel-wise segmentation requires a huge amount of human labor compared to bounding boxes. Most existing methods for weakly supervised instance segmentation focus on designing heuristic losses with priors from bounding…

Computer Vision and Pattern Recognition · Computer Science 2023-03-20 Tianheng Cheng , Xinggang Wang , Shaoyu Chen , Qian Zhang , Wenyu Liu

Point-VOS: Pointing Up Video Object Segmentation

Current state-of-the-art Video Object Segmentation (VOS) methods rely on dense per-object mask annotations both during training and testing. This requires time-consuming and costly video annotation mechanisms. We propose a novel Point-VOS…

Computer Vision and Pattern Recognition · Computer Science 2024-06-11 Idil Esen Zulfikar , Sabarinath Mahadevan , Paul Voigtlaender , Bastian Leibe

Crossover Learning for Fast Online Video Instance Segmentation

Modeling temporal visual context across frames is critical for video instance segmentation (VIS) and other video understanding tasks. In this paper, we propose a fast online VIS model named CrossVIS. For temporal information modeling in…

Computer Vision and Pattern Recognition · Computer Science 2021-04-14 Shusheng Yang , Yuxin Fang , Xinggang Wang , Yu Li , Chen Fang , Ying Shan , Bin Feng , Wenyu Liu

TCOVIS: Temporally Consistent Online Video Instance Segmentation

In recent years, significant progress has been made in video instance segmentation (VIS), with many offline and online methods achieving state-of-the-art performance. While offline methods have the advantage of producing temporally…

Computer Vision and Pattern Recognition · Computer Science 2023-09-22 Junlong Li , Bingyao Yu , Yongming Rao , Jie Zhou , Jiwen Lu

In Defense of Online Models for Video Instance Segmentation

In recent years, video instance segmentation (VIS) has been largely advanced by offline models, while online models gradually attracted less attention possibly due to their inferior performance. However, online methods have their inherent…

Computer Vision and Pattern Recognition · Computer Science 2022-07-22 Junfeng Wu , Qihao Liu , Yi Jiang , Song Bai , Alan Yuille , Xiang Bai

DeVIS: Making Deformable Transformers Work for Video Instance Segmentation

Video Instance Segmentation (VIS) jointly tackles multi-object detection, tracking, and segmentation in video sequences. In the past, VIS methods mirrored the fragmentation of these subtasks in their architectural design, hence missing out…

Computer Vision and Pattern Recognition · Computer Science 2022-07-25 Adrià Caelles , Tim Meinhardt , Guillem Brasó , Laura Leal-Taixé

Occluded Video Instance Segmentation: A Benchmark

Can our video understanding systems perceive objects when a heavy occlusion exists in a scene? To answer this question, we collect a large-scale dataset called OVIS for occluded video instance segmentation, that is, to simultaneously…

Computer Vision and Pattern Recognition · Computer Science 2022-05-18 Jiyang Qi , Yan Gao , Yao Hu , Xinggang Wang , Xiaoyu Liu , Xiang Bai , Serge Belongie , Alan Yuille , Philip H. S. Torr , Song Bai

Video Mask Transfiner for High-Quality Video Instance Segmentation

While Video Instance Segmentation (VIS) has seen rapid progress, current approaches struggle to predict high-quality masks with accurate boundary details. Moreover, the predicted segmentations often fluctuate over time, suggesting that…

Computer Vision and Pattern Recognition · Computer Science 2022-07-29 Lei Ke , Henghui Ding , Martin Danelljan , Yu-Wing Tai , Chi-Keung Tang , Fisher Yu