Related papers: Structure and Semantics Preserving Document Repres…

ATLANTIC: Structure-Aware Retrieval-Augmented Language Model for Interdisciplinary Science

Large language models record impressive performance on many natural language processing tasks. However, their knowledge capacity is limited to the pretraining corpus. Retrieval augmentation offers an effective solution by retrieving context…

Computation and Language · Computer Science 2023-11-22 Sai Munikoti, Anurag Acharya, Sridevi Wagle, Sameera Horawalavithana

Learning to Extract Semantic Structure from Documents Using Multimodal Fully Convolutional Neural Network

We present an end-to-end, multimodal, fully convolutional network for extracting semantic structures from document images. We consider document semantic structure extraction as a pixel-wise segmentation task, and propose a unified model…

Computer Vision and Pattern Recognition · Computer Science 2017-06-09 Xiao Yang, Ersin Yumer, Paul Asente, Mike Kraley, Daniel Kifer, C. Lee Giles

Semantic Modeling of Textual Relationships in Cross-Modal Retrieval

Feature modeling of different modalities is a basic problem in current research of cross-modal information retrieval. Existing models typically project texts and images into one embedding space, in which semantically similar information…

Multimedia · Computer Science 2019-06-13 Jing Yu, Chenghao Yang, Zengchang Qin, Zhuoqian Yang, Yue Hu, Weifeng Zhang

Reasoning with Latent Structure Refinement for Document-Level Relation Extraction

Document-level relation extraction requires integrating information within and across multiple sentences of a document and capturing complex interactions between inter-sentence entities. However, effective aggregation of relevant…

Computation and Language · Computer Science 2020-07-29 Guoshun Nan, Zhijiang Guo, Ivan Sekulić, Wei Lu

Learning to Search in Long Documents Using Document Structure

Reading comprehension models are based on recurrent neural networks that sequentially process the document tokens. As interest turns to answering more complex questions over longer documents, sequential reading of large portions of text…

Computation and Language · Computer Science 2018-09-11 Mor Geva, Jonathan Berant

Document Structure aware Relational Graph Convolutional Networks for Ontology Population

Ontologies comprising concepts, their attributes, and relationships are used in many knowledge-based AI systems. While there have been efforts towards populating domain-specific ontologies, we examine the role of document structure in…

Artificial Intelligence · Computer Science 2022-04-14 Abhay M Shalghar, Ayush Kumar, Balaji Ganesan, Aswin Kannan, Akshay Parekh, Shobha G

A Multi-Resolution Word Embedding for Document Retrieval from Large Unstructured Knowledge Bases

Deep language models learning a hierarchical representation proved to be a powerful tool for natural language processing, text mining and information retrieval. However, representations that perform well for retrieval must capture semantic…

Information Retrieval · Computer Science 2019-05-24 Tolgahan Cakaloglu, Xiaowei Xu

Learning to Match Using Local and Distributed Representations of Text for Web Search

Models such as latent semantic analysis and those based on neural embeddings learn distributed representations of text, and match the query against the document in the latent semantic space. In traditional information retrieval models, on…

Information Retrieval · Computer Science 2016-10-27 Bhaskar Mitra, Fernando Diaz, Nick Craswell

Semantic Regularities in Document Representations

Recent work has shown that distributed word representations are good at capturing linguistic regularities in language. This allows vector-oriented reasoning based on simple linear algebra between words. Since many different methods have…

Computation and Language · Computer Science 2016-03-25 Fei Sun, Jiafeng Guo, Yanyan Lan, Jun Xu, Xueqi Cheng

Unfolding the Structure of a Document using Deep Learning

Understanding and extracting information from large documents, such as business opportunities, academic articles, medical documents and technical reports, poses challenges not present in short documents. Such large documents may be…

Computation and Language · Computer Science 2019-10-10 Muhammad Mahbubur Rahman, Tim Finin

Preserving Semantic Neighborhoods for Robust Cross-modal Retrieval

The abundance of multimodal data (e.g. social media posts) has inspired interest in cross-modal retrieval methods. Popular approaches rely on a variety of metric learning losses, which prescribe what the proximity of image and text should…

Computer Vision and Pattern Recognition · Computer Science 2020-09-24 Christopher Thomas, Adriana Kovashka

Document Structure Measure for Hypernym discovery

Hypernym discovery is the problem of finding terms that have an is-a relationship with a given term. We introduce a new context type and a relatedness measure to differentiate hypernyms from other types of semantic relationships. Our Document…

Computation and Language · Computer Science 2018-12-03 Aswin Kannan, Shanmukha C Guttula, Balaji Ganesan, Hima P Karanam, Arun Kumar

A Concept-Centered Hypertext Approach to Case-Based Retrieval

The goal of case-based retrieval is to assist physicians in the clinical decision-making process by finding relevant medical literature in large archives. We propose research that aims at improving the effectiveness of case-based…

Information Retrieval · Computer Science 2018-11-28 Stefano Marchesin

Understanding and representing the semantics of large structured documents

Understanding large, structured documents like scholarly articles, requests for proposals or business reports is a complex and difficult task. It involves discovering a document's overall purpose and subject(s), understanding the function…

Computation and Language · Computer Science 2018-07-27 Muhammad Mahbubur Rahman, Tim Finin

Modeling Structural Similarities between Documents for Coherence Assessment with Graph Convolutional Networks

Coherence is an important aspect of text quality, and various approaches have been applied to coherence modeling. However, existing methods solely focus on a single document's coherence patterns, ignoring the underlying correlation between…

Computation and Language · Computer Science 2023-06-13 Wei Liu, Xiyan Fu, Michael Strube

Learning from similarity and information extraction from structured documents

The automation of document processing is gaining recent attention due to the great potential to reduce manual work through improved methods and hardware. Neural networks have been successfully applied before - even though they have been…

Computation and Language · Computer Science 2021-06-15 Martin Holeček

Embedding Semantic Relations into Word Representations

Learning representations for semantic relations is important for various tasks such as analogy detection, relational search, and relation classification. Although there have been several proposals for learning representations for individual…

Computation and Language · Computer Science 2015-05-04 Danushka Bollegala, Takanori Maehara, Ken-ichi Kawarabayashi

Structural Text Segmentation of Legal Documents

The growing complexity of legal cases has led to an increasing interest in legal information retrieval systems that can effectively satisfy user-specific information needs. However, such downstream systems typically require documents to be…

Computation and Language · Computer Science 2021-05-18 Dennis Aumiller, Satya Almasian, Sebastian Lackner, Michael Gertz

Structured Knowledge Representation for Image Retrieval

We propose a structured approach to the problem of retrieval of images by content and present a description logic that has been devised for the semantic indexing and retrieval of images containing complex objects. As other approaches do, we…

Artificial Intelligence · Computer Science 2011-09-08 E. Di Sciascio, F. M. Donini, M. Mongiello

Modelling, Visualising and Summarising Documents with a Single Convolutional Neural Network

Capturing the compositional process which maps the meaning of words to that of documents is a central challenge for researchers in Natural Language Processing and Information Retrieval. We introduce a model that is able to represent the…

Computation and Language · Computer Science 2014-06-17 Misha Denil, Alban Demiraj, Nal Kalchbrenner, Phil Blunsom, Nando de Freitas