Related papers: GRES: Generalized Referring Expression Segmentatio…

GREx: Generalized Referring Expression Segmentation, Comprehension, and Generation

Referring Expression Segmentation (RES) and Comprehension (REC) respectively segment and detect the object described by an expression, while Referring Expression Generation (REG) generates an expression for the selected object. Existing…

Computer Vision and Pattern Recognition · Computer Science 2026-01-09 Henghui Ding, Chang Liu, Shuting He, Xudong Jiang, Yu-Gang Jiang

Advancing Referring Expression Segmentation Beyond Single Image

Referring Expression Segmentation (RES) is a widely explored multi-modal task, which endeavors to segment the pre-existing object within a single image with a given linguistic expression. However, in broader real-world scenarios, it is not…

Computer Vision and Pattern Recognition · Computer Science 2023-05-23 Yixuan Wu, Zhao Zhang, Xie Chi, Feng Zhu, Rui Zhao

Towards Unified Referring Expression Segmentation Across Omni-Level Visual Target Granularities

Referring expression segmentation (RES) aims at segmenting the entities' masks that match the descriptive language expression. While traditional RES methods primarily address object-level grounding, real-world scenarios demand a more…

Computer Vision and Pattern Recognition · Computer Science 2025-04-03 Jing Liu, Wenxuan Wang, Yisi Zhang, Yepeng Tang, Xingjian He, Longteng Guo, Tongtian Yue, Xinlong Wang

Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation

Referring expression segmentation (RES) aims at segmenting the foreground masks of the entities that match the descriptive natural language expression. Previous datasets and methods for the classic RES task heavily rely on the prior assumption…

Computer Vision and Pattern Recognition · Computer Science 2024-03-22 Wenxuan Wang, Tongtian Yue, Yisi Zhang, Longteng Guo, Xingjian He, Xinlong Wang, Jing Liu

GREC: Generalized Referring Expression Comprehension

The objective of Classic Referring Expression Comprehension (REC) is to produce a bounding box corresponding to the object mentioned in a given textual description. Commonly, existing datasets and techniques in classic REC are tailored for…

Computer Vision and Pattern Recognition · Computer Science 2023-12-27 Shuting He, Henghui Ding, Chang Liu, Xudong Jiang

Towards Omni-supervised Referring Expression Segmentation

Referring Expression Segmentation (RES) is an emerging task in computer vision, which segments the target instances in images based on text descriptions. However, its development is plagued by the expensive segmentation labels. To address…

Computer Vision and Pattern Recognition · Computer Science 2023-11-28 Minglang Huang, Yiyi Zhou, Gen Luo, Guannan Jiang, Weilin Zhuang, Xiaoshuai Sun

GSVA: Generalized Segmentation via Multimodal Large Language Models

Generalized Referring Expression Segmentation (GRES) extends the scope of classic RES to refer to multiple objects in one expression or identify the empty targets absent in the image. GRES poses challenges in modeling the complex spatial…

Computer Vision and Pattern Recognition · Computer Science 2024-03-22 Zhuofan Xia, Dongchen Han, Yizeng Han, Xuran Pan, Shiji Song, Gao Huang

Phrase-Instance Alignment for Generalized Referring Segmentation

Generalized referring expressions can describe one object, several related objects, or none at all. Existing generalized referring segmentation (GRES) models treat all cases alike, predicting a single binary mask and ignoring how linguistic…

Computer Vision and Pattern Recognition · Computer Science 2026-03-26 E-Ro Nguyen, Hieu Le, Dimitris Samaras, Michael S. Ryoo

Fully and Weakly Supervised Referring Expression Segmentation with End-to-End Learning

Referring Expression Segmentation (RES), which is aimed at localizing and segmenting the target according to the given language expression, has drawn increasing attention. Existing methods jointly consider the localization and segmentation…

Computer Vision and Pattern Recognition · Computer Science 2022-12-21 Hui Li, Mingjie Sun, Jimin Xiao, Eng Gee Lim, Yao Zhao

Meta Compositional Referring Expression Segmentation

Referring expression segmentation aims to segment an object described by a language expression from an image. Despite the recent progress on this task, existing models tackling this task may not be able to fully capture semantics and visual…

Computer Vision and Pattern Recognition · Computer Science 2023-04-13 Li Xu, Mark He Huang, Xindi Shang, Zehuan Yuan, Ying Sun, Jun Liu

Bring Adaptive Binding Prototypes to Generalized Referring Expression Segmentation

Referring Expression Segmentation (RES) has attracted rising attention, aiming to identify and segment objects based on natural language expressions. While substantial progress has been made in RES, the emergence of Generalized Referring…

Computer Vision and Pattern Recognition · Computer Science 2025-01-07 Weize Li, Zhicheng Zhao, Haochen Bai, Fei Su

Towards Generalizable Referring Image Segmentation via Target Prompt and Visual Coherence

Referring image segmentation (RIS) aims to segment objects in an image conditioning on free-form text descriptions. Despite the overwhelming progress, it still remains challenging for current approaches to perform well on cases with various…

Computer Vision and Pattern Recognition · Computer Science 2023-12-04 Yajie Liu, Pu Ge, Haoxiang Ma, Shichao Fan, Qingjie Liu, Di Huang, Yunhong Wang

Refer to Any Segmentation Mask Group With Vision-Language Prompts

Recent image segmentation models have advanced to segment images into high-quality masks for visual entities, and yet they cannot provide comprehensive semantic understanding for complex queries based on both language and vision. This…

Computer Vision and Pattern Recognition · Computer Science 2025-10-20 Shengcao Cao, Zijun Wei, Jason Kuen, Kangning Liu, Lingzhi Zhang, Jiuxiang Gu, HyunJoon Jung, Liang-Yan Gui, Yu-Xiong Wang

3D-GRES: Generalized 3D Referring Expression Segmentation

3D Referring Expression Segmentation (3D-RES) is dedicated to segmenting a specific instance within a 3D space based on a natural language description. However, current approaches are limited to segmenting a single target, restricting the…

Computer Vision and Pattern Recognition · Computer Science 2024-08-01 Changli Wu, Yihang Liu, Jiayi Ji, Yiwei Ma, Haowei Wang, Gen Luo, Henghui Ding, Xiaoshuai Sun, Rongrong Ji

RefVOS: A Closer Look at Referring Expressions for Video Object Segmentation

The task of video object segmentation with referring expressions (language-guided VOS) is to, given a linguistic phrase and a video, generate binary masks for the object to which the phrase refers. Our work argues that existing benchmarks…

Computer Vision and Pattern Recognition · Computer Science 2020-10-02 Miriam Bellver, Carles Ventura, Carina Silberer, Ioannis Kazakos, Jordi Torres, Xavier Giro-i-Nieto

RESMatch: Referring Expression Segmentation in a Semi-Supervised Manner

Referring expression segmentation (RES), a task that involves localizing specific instance-level objects based on free-form linguistic descriptions, has emerged as a crucial frontier in human-AI interaction. It demands an intricate…

Computer Vision and Pattern Recognition · Computer Science 2024-02-13 Ying Zang, Chenglong Fu, Runlong Cao, Didi Zhu, Min Zhang, Wenjun Hu, Lanyun Zhu, Tianrun Chen

ResAgent: Entropy-based Prior Point Discovery and Visual Reasoning for Referring Expression Segmentation

Referring Expression Segmentation (RES) is a core vision-language segmentation task that enables pixel-level understanding of targets via free-form linguistic expressions, supporting critical applications such as human-robot interaction and…

Computer Vision and Pattern Recognition · Computer Science 2026-01-26 Yihao Wang, Jusheng Zhang, Ziyi Tang, Keze Wang, Meng Yang

Latent Expression Generation for Referring Image Segmentation and Grounding

Visual grounding tasks, such as referring image segmentation (RIS) and referring expression comprehension (REC), aim to localize a target object based on a given textual description. The target object in an image can be described in…

Computer Vision and Pattern Recognition · Computer Science 2025-08-19 Seonghoon Yu, Junbeom Hong, Joonseok Lee, Jeany Son

CoHD: A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation

The newly proposed Generalized Referring Expression Segmentation (GRES) amplifies the formulation of classic RES by involving complex multiple/non-target scenarios. Recent approaches address GRES by directly extending the well-adopted RES…

Computer Vision and Pattern Recognition · Computer Science 2024-11-26 Zhuoyan Luo, Yinghao Wu, Tianheng Cheng, Yong Liu, Yicheng Xiao, Hongfa Wang, Xiao-Ping Zhang, Yujiu Yang

Referring Expression Comprehension: A Survey of Methods and Datasets

Referring expression comprehension (REC) aims to localize a target object in an image described by a referring expression phrased in natural language. Different from the object detection task that queried object labels have been…

Computer Vision and Pattern Recognition · Computer Science 2020-12-08 Yanyuan Qiao, Chaorui Deng, Qi Wu