Related papers: Efficient-SAM2: Accelerating SAM2 with Object-Awar…

Fast SAM2 with Text-Driven Token Pruning

Segment Anything Model 2 (SAM2), a vision foundation model, has significantly advanced prompt-driven video object segmentation, yet its practical deployment remains limited by the high computational and memory cost of processing dense…

Computer Vision and Pattern Recognition · Computer Science 2025-12-25 Avilasha Mandal, Chaoning Zhang, Fachrina Dewi Puspitasari, Xudong Wang, Jiaquan Zhang, Caiyan Qin, Guoqing Wang, Yang Yang, Heng Tao Shen

TinySAM 2: Extreme Memory Compression for Efficient Track Anything Model

Segment Anything Model 2 (SAM 2) serves as a core foundation model in the field of video segmentation. Building upon the original SAM model, it introduces a memory bank mechanism and demonstrates outstanding performance in tasks such as…

Computer Vision and Pattern Recognition · Computer Science 2026-05-19 Zhaoyuan Ding, Yijing Yang, Han Shu, Xinghao Chen

Evaluating SAM2 for Video Semantic Segmentation

The Segment Anything Model 2 (SAM2) has proven to be a powerful foundation model for promptable visual object segmentation in both images and videos, capable of storing object-aware memories and transferring them temporally through…

Computer Vision and Pattern Recognition · Computer Science 2026-01-30 Syed Hesham Syed Ariff, Yun Liu, Guolei Sun, Jing Yang, Henghui Ding, Xue Geng, Xudong Jiang

SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree

The Segment Anything Model 2 (SAM 2) has emerged as a powerful foundation model for object segmentation in both images and videos, paving the way for various downstream video applications. The crucial design of SAM 2 for video segmentation…

Computer Vision and Pattern Recognition · Computer Science 2025-07-30 Shuangrui Ding, Rui Qian, Xiaoyi Dong, Pan Zhang, Yuhang Zang, Yuhang Cao, Yuwei Guo, Dahua Lin, Jiaqi Wang

Memory-Augmented SAM2 for Training-Free Surgical Video Segmentation

Surgical video segmentation is a critical task in computer-assisted surgery, essential for enhancing surgical quality and patient outcomes. Recently, the Segment Anything Model 2 (SAM2) framework has demonstrated remarkable advancements in…

Computer Vision and Pattern Recognition · Computer Science 2025-07-23 Ming Yin, Fu Wang, Xujiong Ye, Yanda Meng, Zeyu Fu

Efficient Track Anything

Segment Anything Model 2 (SAM 2) has emerged as a powerful tool for video object segmentation and tracking anything. Key components of SAM 2 that drive the impressive video object segmentation performance include a large multistage image…

Computer Vision and Pattern Recognition · Computer Science 2024-12-02 Yunyang Xiong, Chong Zhou, Xiaoyu Xiang, Lemeng Wu, Chenchen Zhu, Zechun Liu, Saksham Suri, Balakrishnan Varadarajan, Ramya Akula, Forrest Iandola, Raghuraman Krishnamoorthi, Bilge Soran, Vikas Chandra

Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning

Surgical video segmentation is a critical task in computer-assisted surgery and is vital for enhancing surgical quality and patient outcomes. Recently, the Segment Anything Model 2 (SAM2) framework has shown superior advancements in image…

Computer Vision and Pattern Recognition · Computer Science 2025-03-12 Haofeng Liu, Erli Zhang, Junde Wu, Mingxuan Hong, Yueming Jin

SAM 2: Segment Anything in Images and Videos

We present Segment Anything Model 2 (SAM 2), a foundation model towards solving promptable visual segmentation in images and videos. We build a data engine, which improves model and data via user interaction, to collect the largest video…

Computer Vision and Pattern Recognition · Computer Science 2024-10-29 Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu, Ronghang Hu, Chaitanya Ryali, Tengyu Ma, Haitham Khedr, Roman Rädle, Chloe Rolland, Laura Gustafson, Eric Mintun, Junting Pan, Kalyan Vasudev Alwala, Nicolas Carion, Chao-Yuan Wu, Ross Girshick, Piotr Dollár, Christoph Feichtenhofer

Accelerating Volumetric Medical Image Annotation via Short-Long Memory SAM 2

Manual annotation of volumetric medical images, such as magnetic resonance imaging (MRI) and computed tomography (CT), is a labor-intensive and time-consuming process. Recent advancements in foundation models for video object segmentation,…

Image and Video Processing · Electrical Eng. & Systems 2025-11-04 Yuwen Chen, Zafer Yildiz, Qihang Li, Yaqian Chen, Haoyu Dong, Hanxue Gu, Nicholas Konz, Maciej A. Mazurowski

MobileSAMv2: Faster Segment Anything to Everything

Segment anything model (SAM) addresses two practical yet challenging segmentation tasks: segment anything (SegAny), which utilizes a certain point to predict the mask for a single object of interest, and segment everything…

Computer Vision and Pattern Recognition · Computer Science 2023-12-18 Chaoning Zhang, Dongshen Han, Sheng Zheng, Jinwoo Choi, Tae-Ho Kim, Choong Seon Hong

MoSAM: Motion-Guided Segment Anything Model with Spatial-Temporal Memory Selection

The recent Segment Anything Model 2 (SAM2) has demonstrated exceptional capabilities in interactive object segmentation for both images and videos. However, as a foundational model on interactive segmentation, SAM2 performs segmentation…

Computer Vision and Pattern Recognition · Computer Science 2025-05-05 Qiushi Yang, Yuan Yao, Miaomiao Cui, Liefeng Bo

Evaluating SAM2's Role in Camouflaged Object Detection: From SAM to SAM2

The Segment Anything Model (SAM), introduced by Meta AI Research as a generic object segmentation model, quickly garnered widespread attention and significantly influenced the academic community. To extend its application to video, Meta…

Computer Vision and Pattern Recognition · Computer Science 2024-08-01 Lv Tang, Bo Li

SAM-Lightening: A Lightweight Segment Anything Model with Dilated Flash Attention to Achieve 30 times Acceleration

Segment Anything Model (SAM) has garnered significant attention in segmentation tasks due to its zero-shot generalization ability. However, a broader application of SAMs to real-world practice has been restricted by their low inference…

Computer Vision and Pattern Recognition · Computer Science 2024-03-19 Yanfei Song, Bangzheng Pu, Peng Wang, Hongxu Jiang, Dong Dong, Yongxiang Cao, Yiqing Shen

EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything

Segment Anything Model (SAM) has emerged as a powerful tool for numerous vision applications. A key component that drives the impressive performance for zero-shot transfer and high versatility is a super large Transformer model trained on…

Computer Vision and Pattern Recognition · Computer Science 2023-12-05 Yunyang Xiong, Bala Varadarajan, Lemeng Wu, Xiaoyu Xiang, Fanyi Xiao, Chenchen Zhu, Xiaoliang Dai, Dilin Wang, Fei Sun, Forrest Iandola, Raghuraman Krishnamoorthi, Vikas Chandra

SAM2RL: Towards Reinforcement Learning Memory Control in Segment Anything Model 2

Segment Anything Model 2 (SAM 2) has demonstrated strong performance in object segmentation tasks and has become the state-of-the-art for visual object tracking. The model stores information from previous frames in a memory bank, enabling…

Computer Vision and Pattern Recognition · Computer Science 2025-07-14 Alen Adamyan, Tomáš Čížek, Matej Straka, Klara Janouskova, Martin Schmid

Q-SAM2: Accurate Quantization for Segment Anything Model 2

The Segment Anything Model 2 (SAM2) is a powerful foundation model for promptable segmentation. However, its high computational and memory costs are a major barrier to deployment on resource-constrained devices. In this paper, we present…

Computer Vision and Pattern Recognition · Computer Science 2025-11-25 Nicola Farronato, Florian Scheidegger, Mattia Rigotti, Cristiano Malossi, Michele Magno, Haotong Qin

When SAM2 Meets Video Shadow and Mirror Detection

As the successor to the Segment Anything Model (SAM), the Segment Anything Model 2 (SAM2) not only improves performance in image segmentation but also extends its capabilities to video segmentation. However, its effectiveness in segmenting…

Computer Vision and Pattern Recognition · Computer Science 2024-12-30 Leiping Jie

SparseSAM: Structured Sparsification of Activations in Segment Anything Models

The Segment Anything Model (SAM) achieves strong open-vocabulary segmentation, but its ViT-based image encoders dominate inference latency and memory. Existing activation compression methods, such as token merging, reduce the token length…

Computer Vision and Pattern Recognition · Computer Science 2026-05-19 Hoai-Chau Tran, Chi H. Nguyen, Duy M. H. Nguyen, Mathias Niepert, Fan Lai, Khoa D. Doan

OFL-SAM2: Prompt SAM2 with Online Few-shot Learner for Efficient Medical Image Segmentation

The Segment Anything Model 2 (SAM2) has demonstrated remarkable promptable visual segmentation capabilities in video data, showing potential for extension to medical image segmentation (MIS) tasks involving 3D volumes and temporally…

Computer Vision and Pattern Recognition · Computer Science 2026-01-01 Meng Lan, Lefei Zhang, Xiaomeng Li

Distractor-Aware Memory-Based Visual Object Tracking

The recent emergence of memory-based video segmentation methods such as SAM2 has led to models with excellent performance in segmentation tasks, achieving leading results on numerous benchmarks. However, these models are not fully adjusted for…

Computer Vision and Pattern Recognition · Computer Science 2025-09-18 Jovana Videnovic, Matej Kristan, Alan Lukezic