Related papers: Automatic Knowledge Extraction with Human Interfac…

PaperWeaver: Enriching Topical Paper Alerts by Contextualizing Recommended Papers with User-collected Papers

With the rapid growth of scholarly archives, researchers subscribe to "paper alert" systems that periodically provide them with recommendations of recently published papers that are similar to previously collected papers. However,…

Digital Libraries · Computer Science 2024-05-10 Yoonjoo Lee, Hyeonsu B. Kang, Matt Latzke, Juho Kim, Jonathan Bragg, Joseph Chee Chang, Pao Siangliulue

WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research

This paper tackles open-ended deep research (OEDR), a complex challenge where AI agents must synthesize vast web-scale information into insightful reports. Current approaches are plagued by dual-fold limitations: static research…

Computation and Language · Computer Science 2025-10-08 Zijian Li, Xin Guan, Bo Zhang, Shen Huang, Houquan Zhou, Shaopeng Lai, Ming Yan, Yong Jiang, Pengjun Xie, Fei Huang, Jun Zhang, Jingren Zhou

Weaver: Deep Co-Encoding of Questions and Documents for Machine Reading

This paper aims to improve how machines answer questions directly from text, focusing on models that can correctly answer multiple types of questions from various types of texts, documents or even from large…

Computation and Language · Computer Science 2018-04-30 Martin Raison, Pierre-Emmanuel Mazaré, Rajarshi Das, Antoine Bordes

Design of Automatically Adaptable Web Wrappers

Nowadays, the huge amount of information distributed through the Web motivates studying techniques to be adopted in order to extract relevant data in an efficient and reliable way. Both academia and enterprises developed several approaches…

Artificial Intelligence · Computer Science 2013-06-06 Emilio Ferrara, Robert Baumgartner

Information Extraction from Unstructured data using Augmented-AI and Computer Vision

Information extraction (IE) from unstructured documents remains a critical challenge in data processing pipelines. Traditional optical character recognition (OCR) methods and conventional parsing engines demonstrate limited effectiveness…

Computer Vision and Pattern Recognition · Computer Science 2025-07-28 Aditya Parikh

TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing

In this paper, we introduce TextBrewer, an open-source knowledge distillation toolkit designed for natural language processing. It works with different neural network models and supports various kinds of supervised learning tasks, such as…

Computation and Language · Computer Science 2020-12-14 Ziqing Yang, Yiming Cui, Zhipeng Chen, Wanxiang Che, Ting Liu, Shijin Wang, Guoping Hu

Text to Insight: Accelerating Organic Materials Knowledge Extraction via Deep Learning

Scientific literature is one of the most significant resources for sharing knowledge. Researchers turn to scientific literature as a first step in designing an experiment. Given the extensive and growing volume of literature, the common…

Computation and Language · Computer Science 2021-09-28 Xintong Zhao, Steven Lopez, Semion Saikin, Xiaohua Hu, Jane Greenberg

Deep Reader: Information extraction from Document images via relation extraction and Natural Language

Recent advancements in the area of Computer Vision with state-of-the-art Neural Networks have given a boost to Optical Character Recognition (OCR) accuracies. However, extracting characters/text alone is often insufficient for relevant…

Computer Vision and Pattern Recognition · Computer Science 2018-12-17 Vishwanath D, Rohit Rahul, Gunjan Sehgal, Swati, Arindam Chowdhury, Monika Sharma, Lovekesh Vig, Gautam Shroff, Ashwin Srinivasan

Wrap-Up: a Trainable Discourse Module for Information Extraction

The vast amounts of on-line text now available have led to renewed interest in information extraction (IE) systems that analyze unrestricted text, producing a structured representation of selected information from the text. This paper…

Artificial Intelligence · Computer Science 2014-11-17 S. Soderland, W. Lehnert

A Novel Biologically Mechanism-Based Visual Cognition Model--Automatic Extraction of Semantics, Formation of Integrated Concepts and Re-selection Features for Ambiguity

Integration between biology and information science benefits both fields. Many related models have been proposed, such as computational visual cognition models, computational motor control models, integrations of both and so on. In general,…

Computer Vision and Pattern Recognition · Computer Science 2016-03-28 Peijie Yin, Hong Qiao, Wei Wu, Lu Qi, YinLin Li, Shanlin Zhong, Bo Zhang

OCR++: A Robust Framework For Information Extraction from Scholarly Articles

This paper proposes OCR++, an open-source framework designed for a variety of information extraction tasks from scholarly articles including metadata (title, author names, affiliation and e-mail), structure (section headings and body text,…

Digital Libraries · Computer Science 2016-09-26 Mayank Singh, Barnopriyo Barua, Priyank Palod, Manvi Garg, Sidhartha Satapathy, Samuel Bushi, Kumar Ayush, Krishna Sai Rohith, Tulasi Gamidi, Pawan Goyal, Animesh Mukherjee

Open Visual Knowledge Extraction via Relation-Oriented Multimodality Model Prompting

Images contain rich relational knowledge that can help machines understand the world. Existing methods on visual knowledge extraction often rely on the pre-defined format (e.g., sub-verb-obj tuples) or vocabulary (e.g., relation types),…

Computation and Language · Computer Science 2023-10-31 Hejie Cui, Xinyu Fang, Zihan Zhang, Ran Xu, Xuan Kan, Xin Liu, Yue Yu, Manling Li, Yangqiu Song, Carl Yang

Learning from Web: Review of Approaches

Knowledge discovery is defined as non-trivial extraction of implicit, previously unknown and potentially useful information from given data. Knowledge extraction from web documents deals with unstructured, free-format documents whose number…

Neural and Evolutionary Computing · Computer Science 2007-05-23 Vitaly Schetinin

A Multilingual Information Extraction Pipeline for Investigative Journalism

We introduce an advanced information extraction pipeline to automatically process very large collections of unstructured textual data for the purpose of investigative journalism. The pipeline serves as a new input processor for the upcoming…

Computation and Language · Computer Science 2018-09-17 Gregor Wiedemann, Seid Muhie Yimam, Chris Biemann

Open Information Extraction on Scientific Text: An Evaluation

Open Information Extraction (OIE) is the task of the unsupervised creation of structured information from text. OIE is often used as a starting point for a number of downstream tasks including knowledge base construction, relation…

Computation and Language · Computer Science 2018-08-23 Paul Groth, Michael Lauruhn, Antony Scerri, Ron Daniel

Warp: a method for neural network interpretability applied to gene expression profiles

We show a proof of principle for warping, a method to interpret the inner workings of neural networks in the context of gene expression analysis. Warping is an efficient way to gain insight into the inner workings of neural nets and make them…

Genomics · Quantitative Biology 2017-08-17 Trofimov Assya, Lemieux Sebastien, Perreault Claude

ExperienceWeaver: Optimizing Small-sample Experience Learning for LLM-based Clinical Text Improvement

Clinical text improvement is vital for healthcare efficiency but remains difficult due to limited high-quality data and the complex constraints of medical documentation. While Large Language Models (LLMs) show promise, current approaches…

Computation and Language · Computer Science 2026-02-03 Ziyan Xiao, Yinghao Zhu, Liang Peng, Lequan Yu

Rethinking Experience Utilization in Self-Evolving Language Model Agents

Self-evolving agents improve by accumulating and reusing experience from past interactions. Existing work has largely focused on how experience is constructed, represented, and updated, while paying less attention to how experience should…

Computation and Language · Computer Science 2026-05-11 Weixiang Zhao, Yingshuo Wang, Yichen Zhang, Yanyan Zhao, Yu Zhang, Yang Wu, Dandan Tu, Bing Qin, Ting Liu

Open Information Extraction

Open Information Extraction (Open IE) systems aim to obtain relation tuples through highly scalable extraction that is portable across domains, by identifying a variety of relation phrases and their arguments in arbitrary sentences. The first…

Computation and Language · Computer Science 2016-07-12 Duc-Thuan Vo, Ebrahim Bagheri

Getting To Know You: User Attribute Extraction from Dialogues

User attributes provide rich and useful information for user understanding, yet structured and easy-to-use attributes are often sparsely populated. In this paper, we leverage dialogues with conversational agents, which contain strong…

Computation and Language · Computer Science 2019-08-14 Chien-Sheng Wu, Andrea Madotto, Zhaojiang Lin, Peng Xu, Pascale Fung