Related papers: Neural Implicit Vision-Language Feature Fields

VL-Fields: Towards Language-Grounded Neural Implicit Spatial Representations

We present Visual-Language Fields (VL-Fields), a neural implicit spatial representation that enables open-vocabulary semantic queries. Our model encodes and fuses the geometry of a scene with vision-language trained latent features by…

Computer Vision and Pattern Recognition · Computer Science 2023-05-26 Nikolaos Tsagkas, Oisin Mac Aodha, Chris Xiaoxuan Lu

Panoptic Vision-Language Feature Fields

Recently, methods have been proposed for 3D open-vocabulary semantic segmentation. Such methods are able to segment scenes into arbitrary classes based on text descriptions provided during runtime. In this paper, we propose, to the best of…

Computer Vision and Pattern Recognition · Computer Science 2024-01-19 Haoran Chen, Kenneth Blomqvist, Francesco Milano, Roland Siegwart

OpenScene: 3D Scene Understanding with Open Vocabularies

Traditional 3D scene understanding approaches rely on labeled 3D datasets to train a model for a single task with supervision. We propose OpenScene, an alternative approach where a model predicts dense features for 3D scene points that are…

Computer Vision and Pattern Recognition · Computer Science 2023-04-07 Songyou Peng, Kyle Genova, Chiyu "Max" Jiang, Andrea Tagliasacchi, Marc Pollefeys, Thomas Funkhouser

Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting

Open-vocabulary 3D scene understanding presents a significant challenge in computer vision, with wide-ranging applications in embodied agents and augmented reality systems. Existing methods adopt neural rendering methods as 3D…

Computer Vision and Pattern Recognition · Computer Science 2024-08-26 Jun Guo, Xiaojian Ma, Yue Fan, Huaping Liu, Qing Li

Exploring Open-Vocabulary Semantic Segmentation without Human Labels

Semantic segmentation is a crucial task in computer vision that involves segmenting images into semantically meaningful regions at the pixel level. However, existing approaches often rely on expensive human annotations as supervision for…

Computer Vision and Pattern Recognition · Computer Science 2023-06-02 Jun Chen, Deyao Zhu, Guocheng Qian, Bernard Ghanem, Zhicheng Yan, Chenchen Zhu, Fanyi Xiao, Mohamed Elhoseiny, Sean Chang Culatana

Towards Open-Vocabulary Video Semantic Segmentation

Semantic segmentation in videos has been a focal point of recent research. However, existing models encounter challenges when faced with unfamiliar categories. To address this, we introduce the Open Vocabulary Video Semantic Segmentation…

Multimedia · Computer Science 2024-12-13 Xinhao Li, Yun Liu, Guolei Sun, Min Wu, Le Zhang, Ce Zhu

Open-Vocabulary SAM3D: Towards Training-free Open-Vocabulary 3D Scene Understanding

Open-vocabulary 3D scene understanding presents a significant challenge in the field. Recent works have sought to transfer knowledge embedded in vision-language models from 2D to 3D domains. However, these approaches often require prior…

Computer Vision and Pattern Recognition · Computer Science 2024-09-06 Hanchen Tai, Qingdong He, Jiangning Zhang, Yijie Qian, Zhenyu Zhang, Xiaobin Hu, Xiangtai Li, Yabiao Wang, Yong Liu

OpenMask3D: Open-Vocabulary 3D Instance Segmentation

We introduce the task of open-vocabulary 3D instance segmentation. Current approaches for 3D instance segmentation can typically only recognize object categories from a pre-defined closed set of classes that are annotated in the training…

Computer Vision and Pattern Recognition · Computer Science 2023-10-31 Ayça Takmaz, Elisabetta Fedele, Robert W. Sumner, Marc Pollefeys, Federico Tombari, Francis Engelmann

Open-world Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding

To bridge the gap between supervised semantic segmentation and real-world applications that require one model to recognize arbitrary new concepts, recent zero-shot segmentation attracts a lot of attention by exploring the relationships…

Computer Vision and Pattern Recognition · Computer Science 2022-11-01 Quande Liu, Youpeng Wen, Jianhua Han, Chunjing Xu, Hang Xu, Xiaodan Liang

Delving into Shape-aware Zero-shot Semantic Segmentation

Thanks to the impressive progress of large-scale vision-language pretraining, recent recognition models can classify arbitrary objects in a zero-shot and open-set manner, with a surprisingly high accuracy. However, translating this success…

Computer Vision and Pattern Recognition · Computer Science 2023-04-18 Xinyu Liu, Beiwen Tian, Zhen Wang, Rui Wang, Kehua Sheng, Bo Zhang, Hao Zhao, Guyue Zhou

GNeSF: Generalizable Neural Semantic Fields

3D scene segmentation based on neural implicit representation has emerged recently with the advantage of training only on 2D supervision. However, existing approaches still require expensive per-scene optimization that prohibits…

Computer Vision and Pattern Recognition · Computer Science 2023-10-27 Hanlin Chen, Chen Li, Mengqi Guo, Zhiwen Yan, Gim Hee Lee

A Training-Free Framework for Open-Vocabulary Image Segmentation and Recognition with EfficientNet and CLIP

This paper presents a novel training-free framework for open-vocabulary image segmentation and object recognition (OVSR), which leverages EfficientNetB0, a convolutional neural network, for unsupervised segmentation and CLIP, a…

Computer Vision and Pattern Recognition · Computer Science 2025-10-28 Ying Dai, Wei Yu Chen

O2V-Mapping: Online Open-Vocabulary Mapping with Neural Implicit Representation

Online construction of open-ended language scenes is crucial for robotic applications, where open-vocabulary interactive scene understanding is required. Recently, neural implicit representation has provided a promising direction for online…

Computer Vision and Pattern Recognition · Computer Science 2025-04-22 Muer Tie, Julong Wei, Zhengjun Wang, Ke Wu, Shansuai Yuan, Kaizhao Zhang, Jie Jia, Jieru Zhao, Zhongxue Gan, Wenchao Ding

Hierarchical Open-vocabulary Universal Image Segmentation

Open-vocabulary image segmentation aims to partition an image into semantic regions according to arbitrary text descriptions. However, complex visual scenes can be naturally decomposed into simpler parts and abstracted at multiple levels of…

Computer Vision and Pattern Recognition · Computer Science 2023-12-22 Xudong Wang, Shufan Li, Konstantinos Kallidromitis, Yusuke Kato, Kazuki Kozuka, Trevor Darrell

Leveraging 2D-VLM for Label-Free 3D Segmentation in Large-Scale Outdoor Scene Understanding

This paper presents a novel 3D semantic segmentation method for large-scale point cloud data that does not require annotated 3D training data or paired RGB images. The proposed approach projects 3D point clouds onto 2D images using virtual…

Computer Vision and Pattern Recognition · Computer Science 2026-01-06 Toshihiko Nishimura, Hirofumi Abe, Kazuhiko Murasaki, Taiga Yoshida, Ryuichi Tanida

Zero-Shot Semantic Segmentation

Semantic segmentation models are limited in their ability to scale to large numbers of object classes. In this paper, we introduce the new task of zero-shot semantic segmentation: learning pixel-wise classifiers for never-seen object…

Computer Vision and Pattern Recognition · Computer Science 2019-11-19 Maxime Bucher, Tuan-Hung Vu, Matthieu Cord, Patrick Pérez

Open-Vocabulary Semantic Segmentation with Uncertainty Alignment for Robotic Scene Understanding in Indoor Building Environments

The global rise in the number of people with physical disabilities, in part due to improvements in post-trauma survivorship and longevity, has amplified the demand for advanced assistive technologies to improve mobility and independence.…

Computer Vision and Pattern Recognition · Computer Science 2025-04-01 Yifan Xu, Vineet Kamat, Carol Menassa

Self-supervised Learning of Neural Implicit Feature Fields for Camera Pose Refinement

Visual localization techniques rely upon some underlying scene representation to localize against. These representations can be explicit, such as a 3D SFM map, or implicit, such as a neural network that learns to encode the scene. The former…

Computer Vision and Pattern Recognition · Computer Science 2024-06-13 Maxime Pietrantoni, Gabriela Csurka, Martin Humenberger, Torsten Sattler

IGLOSS: Image Generation for Lidar Open-vocabulary Semantic Segmentation

This paper presents a new method for the zero-shot open-vocabulary semantic segmentation (OVSS) of 3D automotive lidar data. To circumvent the recognized image-text modality gap that is intrinsic to approaches based on Vision Language…

Computer Vision and Pattern Recognition · Computer Science 2026-04-03 Nermin Samet, Gilles Puy, Renaud Marlet

Open-RGBT: Open-vocabulary RGB-T Zero-shot Semantic Segmentation in Open-world Environments

Semantic segmentation is a critical technique for effective scene understanding. Traditional RGB-T semantic segmentation models often struggle to generalize across diverse scenarios due to their reliance on pretrained models and predefined…

Computer Vision and Pattern Recognition · Computer Science 2024-10-10 Meng Yu, Luojie Yang, Xunjie He, Yi Yang, Yufeng Yue