Related papers: Language-driven Semantic Segmentation

LMSeg: Language-guided Multi-dataset Segmentation

It's a meaningful and attractive topic to build a general and inclusive segmentation model that can recognize more categories in various scenarios. A straightforward way is to combine the existing fragmented segmentation datasets and train…

Computer Vision and Pattern Recognition · Computer Science 2023-02-28 Qiang Zhou, Yuang Liu, Chaohui Yu, Jingliang Li, Zhibin Wang, Fan Wang

Scaling up Multi-domain Semantic Segmentation with Sentence Embeddings

We propose an approach to semantic segmentation that achieves state-of-the-art supervised performance when applied in a zero-shot setting. It thus achieves results equivalent to those of the supervised methods, on each of the major semantic…

Computer Vision and Pattern Recognition · Computer Science 2024-05-01 Wei Yin, Yifan Liu, Chunhua Shen, Baichuan Sun, Anton van den Hengel

Exploring Simple Open-Vocabulary Semantic Segmentation

Open-vocabulary semantic segmentation models aim to accurately assign a semantic label to each pixel in an image from a set of arbitrary open-vocabulary texts. In order to learn such pixel-level alignment, current approaches typically rely…

Computer Vision and Pattern Recognition · Computer Science 2024-01-23 Zihang Lai

Open-world Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding

To bridge the gap between supervised semantic segmentation and real-world applications that require one model to recognize arbitrary new concepts, recent zero-shot segmentation attracts a lot of attention by exploring the relationships…

Computer Vision and Pattern Recognition · Computer Science 2022-11-01 Quande Liu, Youpeng Wen, Jianhua Han, Chunjing Xu, Hang Xu, Xiaodan Liang

Learning unbiased zero-shot semantic segmentation networks via transductive transfer

Semantic segmentation, which aims to acquire a detailed understanding of images, is an essential issue in computer vision. However, in practical scenarios, new categories that are different from the categories in training usually appear.…

Computer Vision and Pattern Recognition · Computer Science 2020-07-02 Haiyang Liu, Yichen Wang, Jiayi Zhao, Guowu Yang, Fengmao Lv

Zero-Shot Recognition through Image-Guided Semantic Classification

We present a new embedding-based framework for zero-shot learning (ZSL). Most embedding-based methods aim to learn the correspondence between an image classifier (visual representation) and its class prototype (semantic representation) for…

Computer Vision and Pattern Recognition · Computer Science 2020-07-24 Mei-Chen Yeh, Fang Li

Exploring Open-Vocabulary Semantic Segmentation without Human Labels

Semantic segmentation is a crucial task in computer vision that involves segmenting images into semantically meaningful regions at the pixel level. However, existing approaches often rely on expensive human annotations as supervision for…

Computer Vision and Pattern Recognition · Computer Science 2023-06-02 Jun Chen, Deyao Zhu, Guocheng Qian, Bernard Ghanem, Zhicheng Yan, Chenchen Zhu, Fanyi Xiao, Mohamed Elhoseiny, Sean Chang Culatana

Cross-Domain Semantic Segmentation with Large Language Model-Assisted Descriptor Generation

Semantic segmentation plays a crucial role in enabling machines to understand and interpret visual scenes at a pixel level. While traditional segmentation methods have achieved remarkable success, their generalization to diverse scenes and…

Computer Vision and Pattern Recognition · Computer Science 2025-01-29 Philip Hughes, Larry Burns, Luke Adams

Zero-Shot Learning by Convex Combination of Semantic Embeddings

Several recent publications have proposed methods for mapping images into continuous semantic embedding spaces. In some cases the embedding space is trained jointly with the image transformation. In other cases the semantic embedding space…

Machine Learning · Computer Science 2017-02-28 Mohammad Norouzi, Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea Frome, Greg S. Corrado, Jeffrey Dean

Zero-Shot Audio Classification using Image Embeddings

Supervised learning methods can solve the given problem in the presence of a large set of labeled data. However, the acquisition of a dataset covering all the target classes typically requires manual labeling which is expensive and…

Sound · Computer Science 2022-06-13 Duygu Dogan, Huang Xie, Toni Heittola, Tuomas Virtanen

IFSeg: Image-free Semantic Segmentation via Vision-Language Model

Vision-language (VL) pre-training has recently gained much attention for its transferability and flexibility in novel concepts (e.g., cross-modality transfer) across various visual tasks. However, VL-driven segmentation has been…

Computer Vision and Pattern Recognition · Computer Science 2023-03-28 Sukmin Yun, Seong Hyeon Park, Paul Hongsuck Seo, Jinwoo Shin

Language Models as Zero-shot Visual Semantic Learners

Visual Semantic Embedding (VSE) models, which map images into a rich semantic embedding space, have been a milestone in object recognition and zero-shot learning. Current approaches to VSE heavily rely on static word embedding techniques.…

Computer Vision and Pattern Recognition · Computer Science 2021-07-27 Yue Jiao, Jonathon Hare, Adam Prügel-Bennett

Open-Vocabulary Semantic Segmentation with Image Embedding Balancing

Open-vocabulary semantic segmentation is a challenging task, which requires the model to output semantic masks of an image beyond a close-set vocabulary. Although many efforts have been made to utilize powerful CLIP models to accomplish…

Computer Vision and Pattern Recognition · Computer Science 2024-06-17 Xiangheng Shan, Dongyue Wu, Guilin Zhu, Yuanjie Shao, Nong Sang, Changxin Gao

Language Semantic Graph Guided Data-Efficient Learning

Developing generalizable models that can effectively learn from limited data and with minimal reliance on human supervision is a significant objective within the machine learning community, particularly in the era of deep neural networks.…

Computer Vision and Pattern Recognition · Computer Science 2023-11-16 Wenxuan Ma, Shuang Li, Lincan Cai, Jingxuan Kang

SIGN: Spatial-information Incorporated Generative Network for Generalized Zero-shot Semantic Segmentation

Unlike conventional zero-shot classification, zero-shot semantic segmentation predicts a class label at the pixel level instead of the image level. When solving zero-shot semantic segmentation problems, the need for pixel-level prediction…

Computer Vision and Pattern Recognition · Computer Science 2021-08-31 Jiaxin Cheng, Soumyaroop Nandi, Prem Natarajan, Wael Abd-Almageed

Zero-Shot Audio Classification via Semantic Embeddings

In this paper, we study zero-shot learning in audio classification via semantic embeddings extracted from textual labels and sentence descriptions of sound classes. Our goal is to obtain a classifier that is capable of recognizing audio…

Audio and Speech Processing · Electrical Eng. & Systems 2021-02-12 Huang Xie, Tuomas Virtanen

LMSeg: Unleashing the Power of Large-Scale Models for Open-Vocabulary Semantic Segmentation

It is widely agreed that open-vocabulary-based approaches outperform classical closed-set training solutions for recognizing unseen objects in images for semantic segmentation. Existing open-vocabulary approaches leverage vision-language…

Computer Vision and Pattern Recognition · Computer Science 2026-02-19 Huadong Tang, Youpeng Zhao, Yan Huang, Min Xu, Jun Wang, Qiang Wu

Text4Seg: Reimagining Image Segmentation as Text Generation

Multimodal Large Language Models (MLLMs) have shown exceptional capabilities in vision-language tasks; however, effectively integrating image segmentation into these models remains a significant challenge. In this paper, we introduce…

Computer Vision and Pattern Recognition · Computer Science 2025-02-18 Mengcheng Lan, Chaofeng Chen, Yue Zhou, Jiaxing Xu, Yiping Ke, Xinjiang Wang, Litong Feng, Wayne Zhang

ESS: Learning Event-based Semantic Segmentation from Still Images

Retrieving accurate semantic information in challenging high dynamic range (HDR) and high-speed conditions remains an open challenge for image-based algorithms due to severe image degradations. Event cameras promise to address these…

Computer Vision and Pattern Recognition · Computer Science 2022-08-03 Zhaoning Sun, Nico Messikommer, Daniel Gehrig, Davide Scaramuzza

Training-Free Semantic Segmentation via LLM-Supervision

Recent advancements in open vocabulary models, like CLIP, have notably advanced zero-shot classification and segmentation by utilizing natural language for class-specific embeddings. However, most research has focused on improving model…

Computer Vision and Pattern Recognition · Computer Science 2024-04-02 Wenfang Sun, Yingjun Du, Gaowen Liu, Ramana Kompella, Cees G. M. Snoek