Related papers: Evaluating SAM2 for Video Semantic Segmentation

SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree

The Segment Anything Model 2 (SAM 2) has emerged as a powerful foundation model for object segmentation in both images and videos, paving the way for various downstream video applications. The crucial design of SAM 2 for video segmentation…

Computer Vision and Pattern Recognition · Computer Science 2025-07-30 Shuangrui Ding, Rui Qian, Xiaoyi Dong, Pan Zhang, Yuhang Zang, Yuhang Cao, Yuwei Guo, Dahua Lin, Jiaqi Wang

When SAM2 Meets Video Shadow and Mirror Detection

As the successor to the Segment Anything Model (SAM), the Segment Anything Model 2 (SAM2) not only improves performance in image segmentation but also extends its capabilities to video segmentation. However, its effectiveness in segmenting…

Computer Vision and Pattern Recognition · Computer Science 2024-12-30 Leiping Jie

SAM 2: Segment Anything in Images and Videos

We present Segment Anything Model 2 (SAM 2), a foundation model towards solving promptable visual segmentation in images and videos. We build a data engine, which improves model and data via user interaction, to collect the largest video…

Computer Vision and Pattern Recognition · Computer Science 2024-10-29 Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu, Ronghang Hu, Chaitanya Ryali, Tengyu Ma, Haitham Khedr, Roman Rädle, Chloe Rolland, Laura Gustafson, Eric Mintun, Junting Pan, Kalyan Vasudev Alwala, Nicolas Carion, Chao-Yuan Wu, Ross Girshick, Piotr Dollár, Christoph Feichtenhofer

Video Object Segmentation via SAM 2: The 4th Solution for LSVOS Challenge VOS Track

The Video Object Segmentation (VOS) task aims to segment a particular object instance throughout the entire video sequence given only the object mask of the first frame. Recently, Segment Anything Model 2 (SAM 2) was proposed, which is a…

Computer Vision and Pattern Recognition · Computer Science 2024-08-27 Feiyu Pan, Hao Fang, Runmin Cong, Wei Zhang, Xiankai Lu

Segment Anything for Video: A Comprehensive Review of Video Object Segmentation and Tracking from Past to Future

Video Object Segmentation and Tracking (VOST) presents a complex yet critical challenge in computer vision, requiring robust integration of segmentation and tracking across temporally dynamic frames. Traditional methods have struggled with…

Computer Vision and Pattern Recognition · Computer Science 2025-08-05 Guoping Xu, Jayaram K. Udupa, Yajun Yu, Hua-Chieh Shao, Songlin Zhao, Wei Liu, You Zhang

SAM2 for Image and Video Segmentation: A Comprehensive Survey

Despite significant advances in deep learning for image and video segmentation, existing models continue to face challenges in cross-domain adaptability and generalization. Image and video segmentation are fundamental tasks in computer…

Computer Vision and Pattern Recognition · Computer Science 2025-03-18 Zhang Jiaxing, Tang Hao

Towards Fine-grained Interactive Segmentation in Images and Videos

The recent Segment Anything Models (SAMs) have emerged as foundational visual models for general interactive segmentation. Despite demonstrating robust generalization abilities, they still suffer performance degradations in scenarios…

Computer Vision and Pattern Recognition · Computer Science 2025-02-17 Yuan Yao, Qiushi Yang, Miaomiao Cui, Liefeng Bo

Propagating Semantic Labels in Video Data

Semantic Segmentation combines two sub-tasks: the identification of pixel-level image masks and the application of semantic labels to those masks. Recently, so-called Foundation Models have been introduced; general models trained on very…

Computer Vision and Pattern Recognition · Computer Science 2023-10-03 David Balaban, Justin Medich, Pranay Gosar, Justin Hart

MoSAM: Motion-Guided Segment Anything Model with Spatial-Temporal Memory Selection

The recent Segment Anything Model 2 (SAM2) has demonstrated exceptional capabilities in interactive object segmentation for both images and videos. However, as a foundational model for interactive segmentation, SAM2 performs segmentation…

Computer Vision and Pattern Recognition · Computer Science 2025-05-05 Qiushi Yang, Yuan Yao, Miaomiao Cui, Liefeng Bo

Evaluating SAM2's Role in Camouflaged Object Detection: From SAM to SAM2

The Segment Anything Model (SAM), introduced by Meta AI Research as a generic object segmentation model, quickly garnered widespread attention and significantly influenced the academic community. To extend its application to video, Meta…

Computer Vision and Pattern Recognition · Computer Science 2024-08-01 Lv Tang, Bo Li

TSMS-SAM2: Multi-scale Temporal Sampling Augmentation and Memory-Splitting Pruning for Promptable Video Object Segmentation and Tracking in Surgical Scenarios

Promptable video object segmentation and tracking (VOST) has seen significant advances with the emergence of foundation models like Segment Anything Model 2 (SAM2); however, their application in surgical video analysis remains challenging…

Computer Vision and Pattern Recognition · Computer Science 2025-08-11 Guoping Xu, Hua-Chieh Shao, You Zhang

MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation

Referring video object segmentation (RVOS) aims to segment objects in a video according to textual descriptions, which requires the integration of multimodal information and temporal dynamics perception. The Segment Anything Model 2 (SAM 2)…

Computer Vision and Pattern Recognition · Computer Science 2025-08-11 Fu Rong, Meng Lan, Qian Zhang, Lefei Zhang

When SAM2 Meets Video Camouflaged Object Segmentation: A Comprehensive Evaluation and Adaptation

This study investigates the application and performance of the Segment Anything Model 2 (SAM2) in the challenging task of video camouflaged object segmentation (VCOS). VCOS involves detecting objects that blend seamlessly in the…

Computer Vision and Pattern Recognition · Computer Science 2025-05-13 Yuli Zhou, Guolei Sun, Yawei Li, Guo-Sen Xie, Luca Benini, Ender Konukoglu

Segment anything model 2: an application to 2D and 3D medical images

Segment Anything Model (SAM) has gained significant attention because of its ability to segment various objects in images given a prompt. The recently developed SAM 2 has extended this ability to video inputs. This opens an opportunity to…

Computer Vision and Pattern Recognition · Computer Science 2024-08-23 Haoyu Dong, Hanxue Gu, Yaqian Chen, Jichen Yang, Yuwen Chen, Maciej A. Mazurowski

CamSAM2: Segment Anything Accurately in Camouflaged Videos

Video camouflaged object segmentation (VCOS), aiming at segmenting camouflaged objects that seamlessly blend into their environment, is a fundamental vision task with various real-world applications. With the release of SAM2, video…

Computer Vision and Pattern Recognition · Computer Science 2025-11-18 Yuli Zhou, Yawei Li, Yuqian Fu, Luca Benini, Ender Konukoglu, Guolei Sun

Segment Anything in Light Fields for Real-Time Applications via Constrained Prompting

Segmented light field images can serve as a powerful representation in many computer vision tasks that exploit the geometry and appearance of objects, such as object pose tracking. In the light field domain, segmentation presents an additional…

Computer Vision and Pattern Recognition · Computer Science 2024-11-22 Nikolai Goncharov, Donald G. Dansereau

An Analysis of Data Transformation Effects on Segment Anything 2

Video object segmentation (VOS) is a critical task in the development of video perception and understanding. The Segment-Anything Model 2 (SAM 2), released by Meta AI, is the current state-of-the-art architecture for end-to-end VOS. SAM 2…

Image and Video Processing · Electrical Eng. & Systems 2025-05-14 Clayton Bromley, Alexander Moore, Amar Saini, Doug Poland, Carmen Carrano

Fast SAM2 with Text-Driven Token Pruning

Segment Anything Model 2 (SAM2), a vision foundation model, has significantly advanced prompt-driven video object segmentation, yet its practical deployment remains limited by the high computational and memory cost of processing dense…

Computer Vision and Pattern Recognition · Computer Science 2025-12-25 Avilasha Mandal, Chaoning Zhang, Fachrina Dewi Puspitasari, Xudong Wang, Jiaquan Zhang, Caiyan Qin, Guoqing Wang, Yang Yang, Heng Tao Shen

MemorySAM: Memorize Modalities and Semantics with Segment Anything Model 2 for Multi-modal Semantic Segmentation

Research has focused on Multi-Modal Semantic Segmentation (MMSS), where pixel-wise predictions are derived from multiple visual modalities captured by diverse sensors. Recently, the large vision model, Segment Anything Model 2 (SAM2), has…

Computer Vision and Pattern Recognition · Computer Science 2025-03-24 Chenfei Liao, Xu Zheng, Yuanhuiyi Lyu, Haiwei Xue, Yihong Cao, Jiawen Wang, Kailun Yang, Xuming Hu

SAM-I2V: Upgrading SAM to Support Promptable Video Segmentation with Less than 0.2% Training Cost

Foundation models like the Segment Anything Model (SAM) have significantly advanced promptable image segmentation in computer vision. However, extending these capabilities to videos presents substantial challenges, particularly in ensuring…

Computer Vision and Pattern Recognition · Computer Science 2025-06-03 Haiyang Mei, Pengyu Zhang, Mike Zheng Shou