Related papers: Learning Implicit Entity-object Relations by Bidir…

Multi-Granularity Cross-Modality Representation Learning for Named Entity Recognition on Social Media

Named Entity Recognition (NER) on social media refers to discovering and classifying entities from unstructured free-form content, and it plays an important role for various applications such as intention understanding and user…

Computer Vision and Pattern Recognition · Computer Science 2022-11-22 Peipei Liu, Gaosheng Wang, Hong Li, Jie Liu, Yimo Ren, Hongsong Zhu, Limin Sun

Flat Multi-modal Interaction Transformer for Named Entity Recognition

Multi-modal named entity recognition (MNER) aims at identifying entity spans and recognizing their categories in social media posts with the aid of images. However, in dominant MNER approaches, the interaction of different modalities is…

Computer Vision and Pattern Recognition · Computer Science 2023-03-10 Junyu Lu, Dixiang Zhang, Pingjian Zhang

ITA: Image-Text Alignments for Multi-Modal Named Entity Recognition

Recently, Multi-modal Named Entity Recognition (MNER) has attracted a lot of attention. Most of the work utilizes image information through region-level visual representations obtained from a pretrained object detector and relies on an…

Computation and Language · Computer Science 2022-09-21 Xinyu Wang, Min Gui, Yong Jiang, Zixia Jia, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu

MNER-QG: An End-to-End MRC framework for Multimodal Named Entity Recognition with Query Grounding

Multimodal named entity recognition (MNER) is a critical step in information extraction, which aims to detect entity spans and classify them to corresponding entity types given a sentence-image pair. Existing methods either (1) obtain named…

Computer Vision and Pattern Recognition · Computer Science 2022-11-29 Meihuizi Jia, Lei Shen, Xin Shen, Lejian Liao, Meng Chen, Xiaodong He, Zhendong Chen, Jiaqi Li

Joint Multimodal Entity-Relation Extraction Based on Edge-enhanced Graph Alignment Network and Word-pair Relation Tagging

Multimodal named entity recognition (MNER) and multimodal relation extraction (MRE) are two fundamental subtasks in the multimodal knowledge graph construction task. However, the existing methods usually handle two tasks independently,…

Computation and Language · Computer Science 2023-02-21 Li Yuan, Yi Cai, Jin Wang, Qing Li

Hierarchical Aligned Multimodal Learning for NER on Tweet Posts

Mining structured knowledge from tweets using named entity recognition (NER) can be beneficial for many downstream applications such as recommendation and intention understanding. With tweet posts tending to be multimodal, multimodal named…

Computation and Language · Computer Science 2024-01-05 Peipei Liu, Hong Li, Yimo Ren, Jie Liu, Shuaizong Si, Hongsong Zhu, Limin Sun

2M-NER: Contrastive Learning for Multilingual and Multimodal NER with Language and Modal Fusion

Named entity recognition (NER) is a fundamental task in natural language processing that involves identifying and classifying entities in sentences into pre-defined types. It plays a crucial role in various research fields, including entity…

Computation and Language · Computer Science 2024-04-29 Dongsheng Wang, Xiaoqin Feng, Zeming Liu, Chuan Wang

Can images help recognize entities? A study of the role of images for Multimodal NER

Multimodal named entity recognition (MNER) requires bridging the gap between language understanding and visual context. While many multimodal neural techniques have been proposed to incorporate images into the MNER task, the model's…

Computation and Language · Computer Science 2021-09-21 Shuguang Chen, Gustavo Aguilar, Leonardo Neves, Thamar Solorio

A multimodal deep learning approach for named entity recognition from social media

Named Entity Recognition (NER) from social media posts is a challenging task. User-generated content, which forms the nature of social media, is noisy and contains grammatical and linguistic errors. This noisy content makes it much harder for…

Computation and Language · Computer Science 2021-09-16 Meysam Asgari-Chenaghlu, M. Reza Feizi-Derakhshi, Leili Farzinvash, M. A. Balafar, Cina Motamed

Advancing Grounded Multimodal Named Entity Recognition via LLM-Based Reformulation and Box-Based Segmentation

Grounded Multimodal Named Entity Recognition (GMNER) task aims to identify named entities, entity types and their corresponding visual regions. GMNER task exhibits two challenging attributes: 1) The tenuous correlation between images and…

Multimedia · Computer Science 2025-09-03 Jinyuan Li, Ziyan Li, Han Li, Jianfei Yu, Rui Xia, Di Sun, Gang Pan

LLMs as Bridges: Reformulating Grounded Multimodal Named Entity Recognition

Grounded Multimodal Named Entity Recognition (GMNER) is a nascent multimodal task that aims to identify named entities, entity types and their corresponding visual regions. GMNER task exhibits two challenging properties: 1) The weak…

Computer Vision and Pattern Recognition · Computer Science 2024-05-30 Jinyuan Li, Han Li, Di Sun, Jiahao Wang, Wenkun Zhang, Zan Wang, Gang Pan

RpBERT: A Text-image Relation Propagation-based BERT Model for Multimodal NER

Recently multimodal named entity recognition (MNER) has utilized images to improve the accuracy of NER in tweets. However, most of the multimodal methods use attention mechanisms to extract visual clues regardless of whether the text and…

Computation and Language · Computer Science 2021-02-08 Lin Sun, Jiquan Wang, Kai Zhang, Yindu Su, Fangsheng Weng

Multi-Grained Query-Guided Set Prediction Network for Grounded Multimodal Named Entity Recognition

Grounded Multimodal Named Entity Recognition (GMNER) is an emerging information extraction (IE) task, aiming to simultaneously extract entity spans, types, and corresponding visual regions of entities from given sentence-image pairs data.…

Information Retrieval · Computer Science 2025-01-28 Jielong Tang, Zhenxing Wang, Ziyang Gong, Jianxing Yu, Xiangwei Zhu, Jian Yin

Integrating Large Pre-trained Models into Multimodal Named Entity Recognition with Evidential Fusion

Multimodal Named Entity Recognition (MNER) is a crucial task for information extraction from social media platforms such as Twitter. Most current methods rely on attention weights to extract information from both text and images but are…

Computer Vision and Pattern Recognition · Computer Science 2023-06-30 Weide Liu, Xiaoyang Zhong, Jingwen Hou, Shaohua Li, Haozhe Huang, Yuming Fang

Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text

Answering questions that require reading texts in an image is challenging for current models. One key difficulty of this task is that rare, polysemous, and ambiguous words frequently appear in images, e.g., names of places, products, and…

Computer Vision and Pattern Recognition · Computer Science 2020-04-01 Difei Gao, Ke Li, Ruiping Wang, Shiguang Shan, Xilin Chen

SCANNER: Knowledge-Enhanced Approach for Robust Multi-modal Named Entity Recognition of Unseen Entities

Recent advances in named entity recognition (NER) have pushed the boundary of the task to incorporate visual signals, leading to many variants, including multi-modal NER (MNER) or grounded MNER (GMNER). A key challenge to these tasks is…

Computation and Language · Computer Science 2024-04-03 Hyunjong Ok, Taeho Kil, Sukmin Seo, Jaeho Lee

Multi-task Transformer with Relation-attention and Type-attention for Named Entity Recognition

Named entity recognition (NER) is an important research problem in natural language processing. There are three types of NER tasks, including flat, nested and discontinuous entity recognition. Most previous sequential labeling models are…

Computation and Language · Computer Science 2023-03-21 Ying Mo, Hongyin Tang, Jiahao Liu, Qifan Wang, Zenglin Xu, Jingang Wang, Wei Wu, Zhoujun Li

Prompting ChatGPT in MNER: Enhanced Multimodal Named Entity Recognition with Auxiliary Refined Knowledge

Multimodal Named Entity Recognition (MNER) on social media aims to enhance textual entity prediction by incorporating image-based clues. Existing studies mainly focus on maximizing the utilization of pertinent image information or…

Computation and Language · Computer Science 2023-10-19 Jinyuan Li, Han Li, Zhuo Pan, Di Sun, Jiahao Wang, Wenkun Zhang, Gang Pan

GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer

Named Entity Recognition (NER) is essential in various Natural Language Processing (NLP) applications. Traditional NER models are effective but limited to a set of predefined entity types. In contrast, Large Language Models (LLMs) can…

Computation and Language · Computer Science 2023-11-16 Urchade Zaratiana, Nadi Tomeh, Pierre Holat, Thierry Charnois

E2E-GMNER: End-to-End Generative Grounded Multimodal Named Entity Recognition

Grounded Multimodal Named Entity Recognition (GMNER) aims to jointly identify named entity mentions in text, predict their semantic types, and ground each entity to a corresponding visual region in an associated image. Existing approaches…

Computer Vision and Pattern Recognition · Computer Science 2026-04-21 Meng Zhang, Jinzhong Ning, Xiaolong Wu, Hongfei Lin, Yijia Zhang