Related papers: SpannerLib: Embedding Declarative Information Extr…

200 papers

A document spanner models a program for Information Extraction (IE) as a function that takes as input a text document (string over a finite alphabet) and produces a relation of spans (intervals in the document) over a predefined schema. A…

Databases · Computer Science 2018-05-24 Liat Peterfreund , Balder ten Cate , Ronald Fagin , Benny Kimelfeld

The probing classifiers framework has been employed for interpreting deep neural network models for a variety of natural language processing (NLP) applications. Studies, however, have largely focused on sentence-level NLP tasks. This work is…

Computation and Language · Computer Science 2023-10-25 Barry Wang , Xinya Du , Claire Cardie

Information extraction (IE) is fundamental to numerous NLP applications, yet existing solutions often require specialized models for different tasks or rely on computationally expensive large language models. We present GLiNER2, a unified…

Computation and Language · Computer Science 2025-07-25 Urchade Zaratiana , Gil Pasternak , Oliver Boyd , George Hurn-Maloney , Ash Lewis

Information Extraction (IE) tasks are commonly studied topics in various domains of research. Hence, the community continuously produces multiple techniques, solutions, and tools to perform such tasks. However, running those tools and…

Computation and Language · Computer Science 2022-06-06 Mohamad Yaser Jaradeh , Kuldeep Singh , Markus Stocker , Sören Auer

Information extraction (IE) for visually-rich documents (VRDs) has achieved SOTA performance recently thanks to the adaptation of Transformer-based language models, which shows the great potential of pre-training methods. In this paper, we…

Artificial Intelligence · Computer Science 2021-07-07 Tuan-Anh D. Nguyen , Hieu M. Vu , Nguyen Hong Son , Minh-Tien Nguyen

Typically, information extraction (IE) requires a pipeline approach: first, a sequence labeling model is trained on manually annotated documents to extract relevant spans; then, when a new document arrives, a model predicts spans which are…

Computation and Language · Computer Science 2021-10-12 Benjamin Townsend , Eamon Ito-Fisher , Lily Zhang , Madison May

The objective of Information Extraction (IE) is to derive structured representations from unstructured or semi-structured documents. However, developing IE models is complex due to the need of integrating several subtasks. Additionally,…

Information Retrieval · Computer Science 2024-06-04 Arne Binder , Leonhard Hennig , Christoph Alt

The vast amounts of on-line text now available have led to renewed interest in information extraction (IE) systems that analyze unrestricted text, producing a structured representation of selected information from the text. This paper…

Artificial Intelligence · Computer Science 2014-11-17 S. Soderland , W. Lehnert

Information Extraction refers to a collection of tasks within Natural Language Processing (NLP) that identifies sub-sequences within text and their labels. These tasks have been used for many years to extract relevant information and…

Computation and Language · Computer Science 2024-03-26 Yifan Ding , Michael Yankoski , Tim Weninger

Conventional Open Information Extraction (Open IE) systems are usually built on hand-crafted patterns from other NLP tools such as syntactic parsing, yet they face problems of error propagation. In this paper, we propose a neural Open IE…

Computation and Language · Computer Science 2018-05-14 Lei Cui , Furu Wei , Ming Zhou

State-of-the-art solutions for Natural Language Processing (NLP) are able to capture a broad range of contexts, like the sentence-level context or document-level context for short documents. But these solutions are still struggling when it…

We propose a new grammar-based language for defining information-extractors from documents (text) that is built upon the well-studied framework of document spanners for extracting structured data from text. While previously studied…

Databases · Computer Science 2023-01-25 Liat Peterfreund

We present a theoretical framework for the extraction and transformation of text documents. We propose to use a two-phase process where the first phase extracts span-tuples from a document, and the second phase maps the content of the…

Databases · Computer Science 2024-05-22 Cristian Riveros , Markus L. Schmid , Nicole Schweikardt

The task of information extraction (IE) is to extract structured knowledge from text. However, it is often not straightforward to utilize IE output due to the mismatch between the IE ontology and the downstream application needs. We propose…

Computation and Language · Computer Science 2025-10-31 Yizhu Jiao , Sha Li , Sizhe Zhou , Heng Ji , Jiawei Han

Information extraction (IE) aims to produce structured information from an input text, e.g., Named Entity Recognition and Relation Extraction. Various attempts have been proposed for IE via feature engineering or deep learning. However,…

Computation and Language · Computer Science 2019-12-09 Wenya Wang , Sinno Jialin Pan

This research work deals with Natural Language Processing (NLP) and extraction of essential information in an explicit form. The most common among the information management strategies is Document Retrieval (DR) and Information Filtering.…

Computation and Language · Computer Science 2020-04-07 K. R. Chowdhary

Document-level information extraction (IE) is a crucial task in natural language processing (NLP). This paper conducts a systematic review of recent document-level IE literature. In addition, we conduct a thorough error analysis with…

Computation and Language · Computer Science 2023-09-26 Hanwen Zheng , Sijia Wang , Lifu Huang

Programs for extracting structured information from text, namely information extractors, often operate separately on document segments obtained from a generic splitting operation such as sentences, paragraphs, k-grams, HTTP requests, and so…

Databases · Computer Science 2021-05-21 Johannes Doleschal , Benny Kimelfeld , Wim Martens , Frank Neven , Matthias Niewerth

Recent research in information extraction (IE) focuses on utilizing code-style inputs to enhance structured output generation. The intuition behind this is that the programming languages (PLs) inherently exhibit greater structural…

Computation and Language · Computer Science 2025-05-23 Bo Li , Gexiang Fang , Wei Ye , Zhenghua Xu , Jinglei Zhang , Hao Cheng , Shikun Zhang

We introduce an advanced information extraction pipeline to automatically process very large collections of unstructured textual data for the purpose of investigative journalism. The pipeline serves as a new input processor for the upcoming…

Computation and Language · Computer Science 2018-09-17 Gregor Wiedemann , Seid Muhie Yimam , Chris Biemann