Related papers: GLiNER2: An Efficient Multi-Task Information Extra…

GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks

Information extraction tasks require accurate, efficient, and generalisable models. Classical supervised deep learning approaches can achieve the required performance, but they need large datasets and are limited in their ability to…

Machine Learning · Computer Science 2024-08-02 Ihor Stepanov, Mykhailo Shtopko

GLiNER2-PII: A Multilingual Model for Personally Identifiable Information Extraction

Reliable detection of personally identifiable information (PII) is increasingly important across modern data-processing systems, yet the task remains difficult: PII spans are heterogeneous, locale-dependent, context-sensitive, and often…

Computation and Language · Computer Science 2026-05-12 Urchade Zaratiana, Ash Lewis, George Hurn-Maloney

GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer

Named Entity Recognition (NER) is essential in various Natural Language Processing (NLP) applications. Traditional NER models are effective but limited to a set of predefined entity types. In contrast, Large Language Models (LLMs) can…

Computation and Language · Computer Science 2023-11-16 Urchade Zaratiana, Nadi Tomeh, Pierre Holat, Thierry Charnois

LLM-IE: A Python Package for Generative Information Extraction with Large Language Models

Objectives: Despite the recent adoption of large language models (LLMs) for biomedical information extraction, challenges in prompt engineering and algorithms persist, with no dedicated software available. To address this, we developed…

Machine Learning · Computer Science 2025-04-02 Enshuo Hsu, Kirk Roberts

GraphER: A Structure-aware Text-to-Graph Model for Entity and Relation Extraction

Information extraction (IE) is an important task in Natural Language Processing (NLP), involving the extraction of named entities and their relationships from unstructured text. In this paper, we propose a novel approach to this task by…

Computation and Language · Computer Science 2024-04-22 Urchade Zaratiana, Nadi Tomeh, Niama El Khbir, Pierre Holat, Thierry Charnois

Kleister: A novel task for Information Extraction involving Long Documents with Complex Layout

State-of-the-art solutions for Natural Language Processing (NLP) are able to capture a broad range of contexts, like the sentence-level context or document-level context for short documents. But these solutions are still struggling when it…

Computation and Language · Computer Science 2020-03-09 Filip Graliński, Tomasz Stanisławek, Anna Wróblewska, Dawid Lipiński, Agnieszka Kaliska, Paulina Rosalska, Bartosz Topolski, Przemysław Biecek

GLiNER-Relex: A Unified Framework for Joint Named Entity Recognition and Relation Extraction

Joint named entity recognition (NER) and relation extraction (RE) is a fundamental task in natural language processing for constructing knowledge graphs from unstructured text. While recent approaches treat NER and RE as separate tasks…

Computation and Language · Computer Science 2026-05-12 Ihor Stepanov, Oleksandr Lukashov, Mykhailo Shtopko, Vivek Kalyanarangan

Effective and Efficient Schema-aware Information Extraction Using On-Device Large Language Models

Information extraction (IE) plays a crucial role in natural language processing (NLP) by converting unstructured text into structured knowledge. Deploying computationally intensive large language models (LLMs) on resource-constrained…

Computation and Language · Computer Science 2025-05-22 Zhihao Wen, Sheng Liang, Yaxiong Wu, Yongyue Zhang, Yong Liu

An Empirical Study on Information Extraction using Large Language Models

Human-like large language models (LLMs), especially the most powerful and popular ones in OpenAI's GPT family, have proven to be very helpful for many natural language processing (NLP) related tasks. Therefore, various attempts have been…

Computation and Language · Computer Science 2024-09-11 Ridong Han, Chaohao Yang, Tao Peng, Prayag Tiwari, Xiang Wan, Lu Liu, Benyou Wang

Neural Open Information Extraction

Conventional Open Information Extraction (Open IE) systems are usually built on hand-crafted patterns from other NLP tools such as syntactic parsing, yet they face problems of error propagation. In this paper, we propose a neural Open IE…

Computation and Language · Computer Science 2018-05-14 Lei Cui, Furu Wei, Ming Zhou

SpannerLib: Embedding Declarative Information Extraction in an Imperative Workflow

Document spanners have been proposed as a formal framework for declarative Information Extraction (IE) from text, following IE products from the industry and academia. Over the past decade, the framework has been studied thoroughly in terms…

Databases · Computer Science 2024-09-05 Dean Light, Ahmad Aiashy, Mahmoud Diab, Daniel Nachmias, Stijn Vansummeren, Benny Kimelfeld

Local Obfuscation by GLINER for Impartial Context Aware Lineage: Development and evaluation of PII Removal system

Removing Personally Identifiable Information (PII) from clinical notes in Electronic Health Records (EHRs) is essential for research and AI development. While Large Language Models (LLMs) are powerful, their high computational costs and the…

Computation and Language · Computer Science 2025-10-23 Prakrithi Shivaprakash, Lekhansh Shukla, Animesh Mukherjee, Prabhat Chand, Pratima Murthy

SCIR: A Self-Correcting Iterative Refinement Framework for Enhanced Information Extraction Based on Schema

Although Large Language Model (LLM)-powered information extraction (IE) systems have shown impressive capabilities, current fine-tuning paradigms face two major limitations: high training costs and difficulties in aligning with LLM…

Computation and Language · Computer Science 2025-12-16 Yushen Fang, Jianjun Li, Mingqian Ding, Chang Liu, Xinchi Zou, Wenqi Yang

IEPile: Unearthing Large-Scale Schema-Based Information Extraction Corpus

Large Language Models (LLMs) demonstrate remarkable potential across various domains; however, they exhibit a significant performance gap in Information Extraction (IE). Note that high-quality instruction data is the vital key for enhancing…

Computation and Language · Computer Science 2024-05-28 Honghao Gui, Lin Yuan, Hongbin Ye, Ningyu Zhang, Mengshu Sun, Lei Liang, Huajun Chen

TEXT2DB: Integration-Aware Information Extraction with Large Language Model Agents

The task of information extraction (IE) is to extract structured knowledge from text. However, it is often not straightforward to utilize IE output due to the mismatch between the IE ontology and the downstream application needs. We propose…

Computation and Language · Computer Science 2025-10-31 Yizhu Jiao, Sha Li, Sizhe Zhou, Heng Ji, Jiawei Han

Benchmarking Large Language Models with Augmented Instructions for Fine-grained Information Extraction

Information Extraction (IE) is an essential task in Natural Language Processing. Traditional methods have relied on coarse-grained extraction with simple instructions. However, with the emergence of Large Language Models (LLMs), there is a…

Computation and Language · Computer Science 2023-10-10 Jun Gao, Huan Zhao, Yice Zhang, Wei Wang, Changlong Yu, Ruifeng Xu

A Multilingual Information Extraction Pipeline for Investigative Journalism

We introduce an advanced information extraction pipeline to automatically process very large collections of unstructured textual data for the purpose of investigative journalism. The pipeline serves as a new input processor for the upcoming…

Computation and Language · Computer Science 2018-09-17 Gregor Wiedemann, Seid Muhie Yimam, Chris Biemann

llmNER: (Zero|Few)-Shot Named Entity Recognition, Exploiting the Power of Large Language Models

Large language models (LLMs) allow us to generate high-quality human-like text. One interesting task in natural language processing (NLP) is named entity recognition (NER), which seeks to detect mentions of relevant information in…

Computation and Language · Computer Science 2024-06-10 Fabián Villena, Luis Miranda, Claudio Aracena

PLiNIO: A User-Friendly Library of Gradient-based Methods for Complexity-aware DNN Optimization

Accurate yet efficient Deep Neural Networks (DNNs) are in high demand, especially for applications that require their execution on constrained edge devices. Finding such DNNs in a reasonable time for new applications requires automated…

Machine Learning · Computer Science 2023-07-20 Daniele Jahier Pagliari, Matteo Risso, Beatrice Alessandra Motetti, Alessio Burrello