Related papers: Occluded Video Instance Segmentation: A Benchmark

Occluded Video Instance Segmentation: Dataset and ICCV 2021 Challenge

Although deep learning methods have achieved advanced video object recognition performance in recent years, perceiving heavily occluded objects in a video is still a very challenging task. To promote the development of occlusion…

Computer Vision and Pattern Recognition · Computer Science 2021-11-16 Jiyang Qi, Yan Gao, Yao Hu, Xinggang Wang, Xiaoyu Liu, Xiang Bai, Serge Belongie, Alan Yuille, Philip H. S. Torr, Song Bai

A2VIS: Amodal-Aware Approach to Video Instance Segmentation

Handling occlusion remains a significant challenge for video instance-level tasks like Multiple Object Tracking (MOT) and Video Instance Segmentation (VIS). In this paper, we propose a novel framework, Amodal-Aware Video Instance…

Computer Vision and Pattern Recognition · Computer Science 2025-04-11 Minh Tran, Thang Pham, Winston Bounsavy, Tri Nguyen, Ngan Le

Real-time Human-Centric Segmentation for Complex Video Scenes

Most existing video tasks related to "human" focus on the segmentation of salient humans, ignoring the unspecified others in the video. Few studies have focused on segmenting and tracking all humans in a complex video, including pedestrians…

Computer Vision and Pattern Recognition · Computer Science 2021-08-17 Ran Yu, Chenyu Tian, Weihao Xia, Xinyuan Zhao, Haoqian Wang, Yujiu Yang

Towards Open-Vocabulary Video Instance Segmentation

Video Instance Segmentation (VIS) aims at segmenting and categorizing objects in videos from a closed set of training categories, lacking the generalization ability to handle novel categories in real-world videos. To address this…

Computer Vision and Pattern Recognition · Computer Science 2023-08-08 Haochen Wang, Cilin Yan, Shuai Wang, Xiaolong Jiang, XU Tang, Yao Hu, Weidi Xie, Efstratios Gavves

MOSE: A New Dataset for Video Object Segmentation in Complex Scenes

Video object segmentation (VOS) aims at segmenting a particular object throughout the entire video clip sequence. The state-of-the-art VOS methods have achieved excellent performance (e.g., 90+% J&F) on existing datasets. However, since the…

Computer Vision and Pattern Recognition · Computer Science 2023-10-24 Henghui Ding, Chang Liu, Shuting He, Xudong Jiang, Philip H. S. Torr, Song Bai

LVIS: A Dataset for Large Vocabulary Instance Segmentation

Progress on object detection is enabled by datasets that focus the research community's attention on open challenges. This process led us from simple images to complex scenes and from bounding boxes to segmentation masks. In this work, we…

Computer Vision and Pattern Recognition · Computer Science 2019-09-17 Agrim Gupta, Piotr Dollár, Ross Girshick

OVSNet: Towards One-Pass Real-Time Video Object Segmentation

Video object segmentation aims at accurately segmenting the target object regions across consecutive frames. It is technically challenging to cope with complicated factors (e.g., shape deformations, occlusion, and moving out of the lens). Recent…

Computer Vision and Pattern Recognition · Computer Science 2019-07-03 Peng Sun, Peiwen Lin, Guangliang Cheng, Jianping Shi, Jiawan Zhang, Xi Li

Audio-Visual Instance Segmentation

In this paper, we propose a new multi-modal task, termed audio-visual instance segmentation (AVIS), which aims to simultaneously identify, segment and track individual sounding object instances in audible videos. To facilitate this…

Computer Vision and Pattern Recognition · Computer Science 2025-03-04 Ruohao Guo, Xianghua Ying, Yaru Chen, Dantong Niu, Guangyao Li, Liao Qu, Yanyu Qi, Jinxing Zhou, Bowei Xing, Wenzhen Yue, Ji Shi, Qixun Wang, Peiliang Zhang, Buwen Liang

OpenVIS: Open-vocabulary Video Instance Segmentation

Open-vocabulary Video Instance Segmentation (OpenVIS) can simultaneously detect, segment, and track arbitrary object categories in a video, without being constrained to categories seen during training. In this work, we propose InstFormer, a…

Computer Vision and Pattern Recognition · Computer Science 2024-08-20 Pinxue Guo, Tony Huang, Peiyang He, Xuefeng Liu, Tianjun Xiao, Zhaoyu Chen, Wenqiang Zhang

What is Point Supervision Worth in Video Instance Segmentation?

Video instance segmentation (VIS) is a challenging vision task that aims to detect, segment, and track objects in videos. Conventional VIS methods rely on densely-annotated object masks which are expensive. We reduce the human annotations…

Computer Vision and Pattern Recognition · Computer Science 2024-04-03 Shuaiyi Huang, De-An Huang, Zhiding Yu, Shiyi Lan, Subhashree Radhakrishnan, Jose M. Alvarez, Abhinav Shrivastava, Anima Anandkumar

Video Instance Segmentation in an Open-World

Existing video instance segmentation (VIS) approaches generally follow a closed-world assumption, where only seen category instances are identified and spatio-temporally segmented at inference. Open-world formulation relaxes the closed-world…

Computer Vision and Pattern Recognition · Computer Science 2023-04-04 Omkar Thawakar, Sanath Narayan, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan, Jorma Laaksonen, Mubarak Shah, Fahad Shahbaz Khan

Robust Instance Segmentation through Reasoning about Multi-Object Occlusion

Analyzing complex scenes with Deep Neural Networks is a challenging task, particularly when images contain multiple objects that partially occlude each other. Existing approaches to image analysis mostly process objects independently and do…

Computer Vision and Pattern Recognition · Computer Science 2021-04-02 Xiaoding Yuan, Adam Kortylewski, Yihong Sun, Alan Yuille

Pose2Seg: Detection Free Human Instance Segmentation

The standard approach to image instance segmentation is to perform the object detection first, and then segment the object from the detection bounding-box. More recently, deep learning methods like Mask R-CNN perform them jointly. However,…

Computer Vision and Pattern Recognition · Computer Science 2024-06-14 Song-Hai Zhang, Ruilong Li, Xin Dong, Paul L. Rosin, Zixi Cai, Xi Han, Dingcheng Yang, Hao-Zhi Huang, Shi-Min Hu

Look Before You Match: Instance Understanding Matters in Video Object Segmentation

Exploring dense matching between the current frame and past frames for long-range context modeling, memory-based methods have demonstrated impressive results in video object segmentation (VOS) recently. Nevertheless, due to the lack of…

Computer Vision and Pattern Recognition · Computer Science 2022-12-14 Junke Wang, Dongdong Chen, Zuxuan Wu, Chong Luo, Chuanxin Tang, Xiyang Dai, Yucheng Zhao, Yujia Xie, Lu Yuan, Yu-Gang Jiang

Instance Segmentation of Visible and Occluded Regions for Finding and Picking Target from a Pile of Objects

We present a robotic system for picking a target from a pile of objects that is capable of finding and grasping the target object by removing obstacles in the appropriate order. The fundamental idea is to segment instances with both visible…

Robotics · Computer Science 2020-01-22 Kentaro Wada, Shingo Kitagawa, Kei Okada, Masayuki Inaba

Adding New Categories in Object Detection Using Few-Shot Copy-Paste

Developing data-efficient instance detection models that can handle rare object categories remains a key challenge in computer vision. However, existing research often overlooks data collection strategies and evaluation metrics tailored to…

Computer Vision and Pattern Recognition · Computer Science 2025-04-15 Boyang Deng, Meiyan Lin, Shoulun Long

CAVIS: Context-Aware Video Instance Segmentation

In this paper, we introduce the Context-Aware Video Instance Segmentation (CAVIS), a novel framework designed to enhance instance association by integrating contextual information adjacent to each object. To efficiently extract and leverage…

Computer Vision and Pattern Recognition · Computer Science 2025-07-10 Seunghun Lee, Jiwan Seo, Kiljoon Han, Minwoo Choi, Sunghoon Im

Online Reasoning Video Object Segmentation

Reasoning video object segmentation predicts pixel-level masks in videos from natural-language queries that may involve implicit and temporally grounded references. However, existing methods are developed and evaluated in an offline regime,…

Computer Vision and Pattern Recognition · Computer Science 2026-04-14 Jinyuan Liu, Yang Wang, Zeyu Zhao, Weixin Li, Song Wang, Ruize Han

Unidentified Video Objects: A Benchmark for Dense, Open-World Segmentation

Current state-of-the-art object detection and segmentation methods work well under the closed-world assumption. This closed-world setting assumes that the list of object categories is available during training and deployment. However, many…

Computer Vision and Pattern Recognition · Computer Science 2021-04-13 Weiyao Wang, Matt Feiszli, Heng Wang, Du Tran

MOSEv2: A More Challenging Dataset for Video Object Segmentation in Complex Scenes

Video object segmentation (VOS) aims to segment specified target objects throughout a video. Although state-of-the-art methods have achieved impressive performance (e.g., 90+% J&F) on benchmarks such as DAVIS and YouTube-VOS, these datasets…

Computer Vision and Pattern Recognition · Computer Science 2025-09-23 Henghui Ding, Kaining Ying, Chang Liu, Shuting He, Xudong Jiang, Yu-Gang Jiang, Philip H. S. Torr, Song Bai