Related papers: Revisiting Click-based Interactive Video Object Se…

Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion

We present Modular interactive VOS (MiVOS) framework which decouples interaction-to-mask and mask propagation, allowing for higher generalizability and better performance. Trained separately, the interaction module converts user…

Computer Vision and Pattern Recognition · Computer Science 2021-03-23 Ho Kei Cheng, Yu-Wing Tai, Chi-Keung Tang

ClickVOS: Click Video Object Segmentation

Video Object Segmentation (VOS) task aims to segment objects in videos. However, previous settings either require time-consuming manual masks of target objects at the first frame during inference or lack the flexibility to specify arbitrary…

Computer Vision and Pattern Recognition · Computer Science 2024-03-12 Pinxue Guo, Lingyi Hong, Xinyu Zhou, Shuyong Gao, Wanyun Li, Jinglun Li, Zhaoyu Chen, Xiaoqiang Li, Wei Zhang, Wenqiang Zhang

IDPro: Flexible Interactive Video Object Segmentation by ID-queried Concurrent Propagation

Interactive Video Object Segmentation (iVOS) is a challenging task that requires real-time human-computer interaction. To improve the user experience, it is important to consider the user's input habits, segmentation quality, running time…

Computer Vision and Pattern Recognition · Computer Science 2025-02-10 Kexin Li, Tao Jiang, Zongxin Yang, Yi Yang, Yueting Zhuang, Jun Xiao

Click Carving: Segmenting Objects in Video with Point Clicks

We present a novel form of interactive video object segmentation where a few clicks by the user helps the system produce a full spatio-temporal segmentation of the object of interest. Whereas conventional interactive pipelines take the…

Computer Vision and Pattern Recognition · Computer Science 2016-07-06 Suyog Dutt Jain, Kristen Grauman

Fast User-Guided Video Object Segmentation by Interaction-and-Propagation Networks

We present a deep learning method for the interactive video object segmentation. Our method is built upon two core operations, interaction and propagation, and each operation is conducted by Convolutional Neural Networks. The two networks…

Computer Vision and Pattern Recognition · Computer Science 2019-05-03 Seoung Wug Oh, Joon-Young Lee, Ning Xu, Seon Joo Kim

Memory Aggregation Networks for Efficient Interactive Video Object Segmentation

Interactive video object segmentation (iVOS) aims at efficiently harvesting high-quality segmentation masks of the target object in a video with user interactions. Most previous state-of-the-arts tackle the iVOS with two independent…

Computer Vision and Pattern Recognition · Computer Science 2020-03-31 Jiaxu Miao, Yunchao Wei, Yi Yang

PseudoClick: Interactive Image Segmentation with Click Imitation

The goal of click-based interactive image segmentation is to obtain precise object segmentation masks with limited user interaction, i.e., by a minimal number of user clicks. Existing methods require users to provide all the clicks: by…

Computer Vision and Pattern Recognition · Computer Science 2022-07-28 Qin Liu, Meng Zheng, Benjamin Planche, Srikrishna Karanam, Terrence Chen, Marc Niethammer, Ziyan Wu

InterRVOS: Interaction-aware Referring Video Object Segmentation

Referring video object segmentation (RVOS) aims to segment objects in a video described by a natural language expression. However, most existing approaches focus on segmenting only the referred object (typically the actor), even when the…

Computer Vision and Pattern Recognition · Computer Science 2025-08-19 Woojeong Jin, Seongchan Kim, Jaeho Lee, Seungryong Kim

ActionVOS: Actions as Prompts for Video Object Segmentation

Delving into the realm of egocentric vision, the advancement of referring video object segmentation (RVOS) stands as pivotal in understanding human activities. However, existing RVOS task primarily relies on static attributes such as object…

Computer Vision and Pattern Recognition · Computer Science 2024-07-11 Liangyang Ouyang, Ruicong Liu, Yifei Huang, Ryosuke Furuta, Yoichi Sato

MMMS: Multi-Modal Multi-Surface Interactive Segmentation

In this paper, we present a method to interactively create segmentation masks on the basis of user clicks. We pay particular attention to the segmentation of multiple surfaces that are simultaneously present in the same image. Since these…

Computer Vision and Pattern Recognition · Computer Science 2025-09-17 Robin Schön, Julian Lorenz, Katja Ludwig, Daniel Kienzle, Rainer Lienhart

Reviving Iterative Training with Mask Guidance for Interactive Segmentation

Recent works on click-based interactive segmentation have demonstrated state-of-the-art results by using various inference-time optimization schemes. These methods are considerably more computationally expensive compared to feedforward…

Computer Vision and Pattern Recognition · Computer Science 2021-02-15 Konstantin Sofiiuk, Ilia A. Petrov, Anton Konushin

Scalable Video Object Segmentation with Simplified Framework

The current popular methods for video object segmentation (VOS) implement feature matching through several hand-crafted modules that separately perform feature extraction and matching. However, the above hand-crafted designs empirically…

Computer Vision and Pattern Recognition · Computer Science 2023-08-22 Qiangqiang Wu, Tianyu Yang, Wei WU, Antoni Chan

DeVOS: Flow-Guided Deformable Transformer for Video Object Segmentation

The recent works on Video Object Segmentation achieved remarkable results by matching dense semantic and instance-level features between the current and previous frames for long-time propagation. Nevertheless, global feature matching…

Computer Vision and Pattern Recognition · Computer Science 2024-05-15 Volodymyr Fedynyak, Yaroslav Romanus, Bohdan Hlovatskyi, Bohdan Sydor, Oles Dobosevych, Igor Babin, Roman Riazantsev

Interactive Object Segmentation in 3D Point Clouds

We propose an interactive approach for 3D instance segmentation, where users can iteratively collaborate with a deep learning model to segment objects in a 3D point cloud directly. Current methods for 3D instance segmentation are generally…

Computer Vision and Pattern Recognition · Computer Science 2023-01-24 Theodora Kontogianni, Ekin Celikkan, Siyu Tang, Konrad Schindler

Video Object Segmentation using Tracked Object Proposals

We present an approach to semi-supervised video object segmentation, in the context of the DAVIS 2017 challenge. Our approach combines category-based object detection, category-independent object appearance segmentation and temporal object…

Computer Vision and Pattern Recognition · Computer Science 2017-07-21 Gilad Sharir, Eddie Smolyansky, Itamar Friedman

RefVOS: A Closer Look at Referring Expressions for Video Object Segmentation

The task of video object segmentation with referring expressions (language-guided VOS) is to, given a linguistic phrase and a video, generate binary masks for the object to which the phrase refers. Our work argues that existing benchmarks…

Computer Vision and Pattern Recognition · Computer Science 2020-10-02 Miriam Bellver, Carles Ventura, Carina Silberer, Ioannis Kazakos, Jordi Torres, Xavier Giro-i-Nieto

ScribbleSeg: Scribble-based Interactive Image Segmentation

Interactive segmentation enables users to extract masks by providing simple annotations to indicate the target, such as boxes, clicks, or scribbles. Among these interaction formats, scribbles are the most flexible as they can be of…

Computer Vision and Pattern Recognition · Computer Science 2023-03-21 Xi Chen, Yau Shing Jonathan Cheung, Ser-Nam Lim, Hengshuang Zhao

MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions

This paper strives for motion expressions guided video segmentation, which focuses on segmenting objects in video content based on a sentence describing the motion of the objects. Existing referring video object datasets typically focus on…

Computer Vision and Pattern Recognition · Computer Science 2023-08-17 Henghui Ding, Chang Liu, Shuting He, Xudong Jiang, Chen Change Loy

Video Object Segmentation with Dynamic Query Modulation

Storing intermediate frame segmentations as memory for long-range context modeling, spatial-temporal memory-based methods have recently showcased impressive results in semi-supervised video object segmentation (SVOS). However, these methods…

Computer Vision and Pattern Recognition · Computer Science 2024-03-19 Hantao Zhou, Runze Hu, Xiu Li

Online Adaptation of Convolutional Neural Networks for Video Object Segmentation

We tackle the task of semi-supervised video object segmentation, i.e. segmenting the pixels belonging to an object in the video using the ground truth pixel mask for the first frame. We build on the recently introduced one-shot video object…

Computer Vision and Pattern Recognition · Computer Science 2017-08-02 Paul Voigtlaender, Bastian Leibe