Related papers: Diffuse, Attend, and Segment: Unsupervised Zero-Sh…

200 papers

Zero-shot referring image segmentation is a challenging task because it aims to find an instance segmentation mask based on the given referring descriptions, without training on this type of paired data. Current zero-shot methods mainly…

Computer Vision and Pattern Recognition · Computer Science 2023-09-04 Minheng Ni, Yabo Zhang, Kailai Feng, Xiaoming Li, Yiwen Guo, Wangmeng Zuo

Recent progress in interactive, point-prompt-based image segmentation significantly reduces the manual effort needed to obtain high-quality semantic labels. State-of-the-art unsupervised methods use self-supervised pre-trained models to…

Computer Vision and Pattern Recognition · Computer Science 2025-03-21 Markus Karmann, Onay Urfalioglu

Entrusted with the goal of pixel-level object classification, the semantic segmentation networks entail the laborious preparation of pixel-level annotation masks. To obtain pixel-level annotation masks for a given class without human…

Computer Vision and Pattern Recognition · Computer Science 2025-09-16 Joon Hyun Park, Kumju Jo, Sungyong Baik

We propose an unsupervised image segmentation method using features from pre-trained text-to-image diffusion models. Inspired by classic spectral clustering approaches, we construct adjacency matrices from self-attention layers between…

Computer Vision and Pattern Recognition · Computer Science 2025-11-27 Daniela Ivanova, Marco Aversa, Paul Henderson, John Williamson

Producing high-quality segmentation masks for medical images is a fundamental challenge in biomedical image analysis. Recent research has explored large-scale supervised training to enable segmentation across various medical imaging…

Computer Vision and Pattern Recognition · Computer Science 2025-03-25 Abderrachid Hamrani, Anuradha Godavarty

Significant strides have been made using large vision-language models, like Stable Diffusion (SD), for a variety of downstream tasks, including image editing, image correspondence, and 3D shape generation. Inspired by these advancements, we…

Computer Vision and Pattern Recognition · Computer Science 2024-03-15 Aliasghar Khani, Saeid Asgari Taghanaki, Aditya Sanghi, Ali Mahdavi Amiri, Ghassan Hamarneh

Panoptic and instance segmentation networks are often trained with specialized object detection modules, complex loss functions, and ad-hoc post-processing steps to manage the permutation-invariance of the instance masks. This work builds…

Computer Vision and Pattern Recognition · Computer Science 2024-07-17 Wouter Van Gansbeke, Bert De Brabandere

Semantic segmentation is essential in computer vision for various applications, yet traditional approaches face significant challenges, including the high cost of annotation and extensive training for supervised learning. Additionally, due…

Computer Vision and Pattern Recognition · Computer Science 2024-03-19 Yasufumi Kawano, Yoshimitsu Aoki

As powerful generative models, text-to-image diffusion models have recently been explored for discriminative tasks. A line of research focuses on adapting a pre-trained diffusion model to semantic segmentation without any further training,…

Computer Vision and Pattern Recognition · Computer Science 2026-03-30 Benyuan Meng, Qianqian Xu, Zitai Wang, Xiaochun Cao, Longtao Huang, Qingming Huang

We introduce segmentation-free guidance, a novel method designed for text-to-image diffusion models like Stable Diffusion. Our method does not require retraining of the diffusion model. At no additional compute cost, it uses the diffusion…

Computer Vision and Pattern Recognition · Computer Science 2024-07-09 Kambiz Azarian, Debasmit Das, Qiqi Hou, Fatih Porikli

Semantic segmentation is a computer vision task where classification is performed at a pixel level. Due to this, the process of labeling images for semantic segmentation is time-consuming and expensive. To mitigate this cost there has been…

Computer Vision and Pattern Recognition · Computer Science 2025-01-07 Javier Montalvo, Álvaro García-Martín, Pablo Carballeira, Juan C. SanMiguel

Collecting and annotating images with pixel-wise labels is time-consuming and laborious. In contrast, synthetic data can be freely available using a generative model (e.g., DALL-E, Stable Diffusion). In this paper, we show that it is…

Computer Vision and Pattern Recognition · Computer Science 2024-01-23 Weijia Wu, Yuzhong Zhao, Mike Zheng Shou, Hong Zhou, Chunhua Shen

Open-vocabulary semantic segmentation (OVSS) aims to segment objects from arbitrary text categories without requiring densely annotated datasets. Although contrastive learning based models enable zero-shot segmentation, they often lose fine…

Computer Vision and Pattern Recognition · Computer Science 2026-04-30 Huy Che, Vinh-Tiep Nguyen

Foundation models have emerged as powerful tools across various domains including language, vision, and multimodal tasks. While prior works have addressed unsupervised image segmentation, they significantly lag behind supervised models. In…

Computer Vision and Pattern Recognition · Computer Science 2025-10-03 Paul Couairon, Mustafa Shukor, Jean-Emmanuel Haugeard, Matthieu Cord, Nicolas Thome

Stable Diffusion has demonstrated a strong ability to synthesize images from text descriptions, suggesting that it contains strong semantic cues for grouping objects. Researchers have explored employing Stable Diffusion for training-free…

Computer Vision and Pattern Recognition · Computer Science 2024-10-10 Lin Sun, Jiale Cao, Jin Xie, Fahad Shahbaz Khan, Yanwei Pang

This paper investigates the use of large-scale diffusion models for Zero-Shot Video Object Segmentation (ZS-VOS) without fine-tuning on video data or training on any image segmentation data. While diffusion models have demonstrated strong…

Computer Vision and Pattern Recognition · Computer Science 2025-04-09 Thanos Delatolas, Vicky Kalogeiton, Dim P. Papadopoulos

The recent wave of large-scale text-to-image diffusion models has dramatically increased our text-based image generation abilities. These models can generate realistic images for a staggering variety of prompts and exhibit impressive…

Machine Learning · Computer Science 2023-09-14 Alexander C. Li, Mihir Prabhudesai, Shivam Duggal, Ellis Brown, Deepak Pathak

Current semantic segmentation models typically require a substantial amount of manually annotated data, a process that is both time-consuming and resource-intensive. Alternatively, leveraging advanced text-to-image models such as Midjourney…

Computer Vision and Pattern Recognition · Computer Science 2025-07-08 Bo Gao, Jianhui Wang, Xinyuan Song, Yangfan He, Fangxu Xing, Tianyu Shi

Diffusion models, such as Stable Diffusion, have shown incredible performance on text-to-image generation. Since text-to-image generation often requires models to generate visual concepts with fine-grained details and attributes specified…

Computer Vision and Pattern Recognition · Computer Science 2024-04-26 Xuehai He, Weixi Feng, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, William Yang Wang, Xin Eric Wang

Few-shot segmentation focuses on generalizing models to segment unseen objects with limited annotated samples. However, existing approaches still face two main challenges. First, huge feature distinction between support and query…

Computer Vision and Pattern Recognition · Computer Science 2023-03-21 Qi Zhao, Binghao Liu, Shuchang Lyu, Huojin Chen