Related papers: Explaining Relationships Between Scientific Docume…

OpenMSD: Towards Multilingual Scientific Documents Similarity Measurement

We develop and evaluate multilingual scientific documents similarity measurement models in this work. Such models can be used to find related works in different languages, which can help multilingual researchers find and explore papers more…

Computation and Language · Computer Science 2023-09-20 Yang Gao, Ji Ma, Ivan Korotkov, Keith Hall, Dana Alon, Don Metzler

Explaining Relationships Among Research Papers

Due to the rapid pace of research publications, keeping up to date with all the latest related papers is very time-consuming, even with daily feed tools. There is a need for automatically generated, short, customized literature reviews of…

Computation and Language · Computer Science 2025-05-19 Xiangci Li, Jessica Ouyang

A Scientific Information Extraction Dataset for Nature Inspired Engineering

Nature has inspired various ground-breaking technological developments in applications ranging from robotics to aerospace engineering and the manufacturing of medical devices. However, accessing the information captured in scientific…

Computation and Language · Computer Science 2020-05-27 Ruben Kruiper, Julian F. V. Vincent, Jessica Chen-Burger, Marc P. Y. Desmulliez, Ioannis Konstas

Matching Article Pairs with Graphical Decomposition and Convolutions

Identifying the relationship between two articles, e.g., whether two articles published from different sources describe the same breaking news, is critical to many document understanding tasks. Existing approaches for modeling and matching…

Computation and Language · Computer Science 2019-05-29 Bang Liu, Di Niu, Haojie Wei, Jinghong Lin, Yancheng He, Kunfeng Lai, Yu Xu

Machine Understanding of Scientific Language

Scientific information expresses human understanding of nature. This knowledge is largely disseminated in different forms of text, including scientific papers, news articles, and discourse among people on social media. While important for…

Computation and Language · Computer Science 2025-07-01 Dustin Wright

Unfolding the Structure of a Document using Deep Learning

Understanding and extracting information from large documents, such as business opportunities, academic articles, medical documents and technical reports, poses challenges not present in short documents. Such large documents may be…

Computation and Language · Computer Science 2019-10-10 Muhammad Mahbubur Rahman, Tim Finin

Learning Semantic Correspondences in Technical Documentation

We consider the problem of translating high-level textual descriptions to formal representations in technical documentation as part of an effort to model the meaning of such documentation. We focus specifically on the problem of learning…

Computation and Language · Computer Science 2017-09-18 Kyle Richardson, Jonas Kuhn

Large Language Models for Full-Text Methods Assessment: A Case Study on Mediation Analysis

Systematic reviews are crucial for synthesizing scientific evidence but remain labor-intensive, especially when extracting detailed methodological information. Large language models (LLMs) offer potential for automating methodological…

Computation and Language · Computer Science 2025-10-14 Wenqing Zhang, Trang Nguyen, Elizabeth A. Stuart, Yiqun T. Chen

Entity Recognition and Relation Extraction from Scientific and Technical Texts in Russian

This paper is devoted to the study of methods for information extraction (entity recognition and relation classification) from scientific texts on information technology. Scientific publications provide valuable information into…

Computation and Language · Computer Science 2020-12-29 Elena Bruches, Alexey Pauls, Tatiana Batura, Vladimir Isachenko

Matching with Text Data: An Experimental Evaluation of Methods for Matching Documents and of Measuring Match Quality

Matching for causal inference is a well-studied problem, but standard methods fail when the units to match are text documents: the high-dimensional and rich nature of the data renders exact matching infeasible, causes propensity scores to…

Methodology · Statistics 2019-03-15 Reagan Mozer, Luke Miratrix, Aaron Russell Kaufman, L. Jason Anastasopoulos

Inferring Scientific Cross-Document Coreference and Hierarchy with Definition-Augmented Relational Reasoning

We address the fundamental task of inferring cross-document coreference and hierarchy in scientific texts, which has important applications in knowledge graph construction, search, recommendation and discovery. Large Language Models (LLMs)…

Computation and Language · Computer Science 2026-02-04 Lior Forer, Tom Hope

Mining and searching association relation of scientific papers based on deep learning

There is a complex correlation among the data of scientific papers. The phenomenon reveals the data characteristics, laws, and correlations contained in the data of scientific and technological papers in specific fields, which can realize…

Digital Libraries · Computer Science 2022-04-26 Jie Song, Meiyu Liang, Zhe Xue, Feifei Kou, Ang Li

Coherence-Based Distributed Document Representation Learning for Scientific Documents

Distributed document representation is one of the basic problems in natural language processing. Currently distributed document representation methods mainly consider the context information of words or sentences. These methods do not take…

Computation and Language · Computer Science 2022-01-11 Shicheng Tan, Shu Zhao, Yanping Zhang

An Interdisciplinary Outlook on Large Language Models for Scientific Research

In this paper, we describe the capabilities and constraints of Large Language Models (LLMs) within disparate academic disciplines, aiming to delineate their strengths and limitations with precision. We examine how LLMs augment scientific…

Computation and Language · Computer Science 2023-11-10 James Boyko, Joseph Cohen, Nathan Fox, Maria Han Veiga, Jennifer I-Hsiu Li, Jing Liu, Bernardo Modenesi, Andreas H. Rauch, Kenneth N. Reid, Soumi Tribedi, Anastasia Visheratina, Xin Xie

ATLANTIC: Structure-Aware Retrieval-Augmented Language Model for Interdisciplinary Science

Large language models record impressive performance on many natural language processing tasks. However, their knowledge capacity is limited to the pretraining corpus. Retrieval augmentation offers an effective solution by retrieving context…

Computation and Language · Computer Science 2023-11-22 Sai Munikoti, Anurag Acharya, Sridevi Wagle, Sameera Horawalavithana

DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models

Scientific documents record research findings and valuable human knowledge, comprising a vast corpus of high-quality data. Leveraging multi-modality data extracted from these documents and assessing large models' abilities to handle…

Computer Vision and Pattern Recognition · Computer Science 2024-09-12 Renqiu Xia, Song Mao, Xiangchao Yan, Hongbin Zhou, Bo Zhang, Haoyang Peng, Jiahao Pi, Daocheng Fu, Wenjie Wu, Hancheng Ye, Shiyang Feng, Bin Wang, Chao Xu, Conghui He, Pinlong Cai, Min Dou, Botian Shi, Sheng Zhou, Yongwei Wang, Bin Wang, Junchi Yan, Fei Wu, Yu Qiao

Understanding the Logical and Semantic Structure of Large Documents

Current language understanding approaches focus on small documents, such as newswire articles, blog posts, product reviews and discussion forum entries. Understanding and extracting information from large documents like legal briefs,…

Computation and Language · Computer Science 2017-09-05 Muhammad Mahbubur Rahman, Tim Finin

Towards a Semantic Search Engine for Scientific Articles

Because of the data deluge in scientific publication, finding relevant information is getting harder and harder for researchers and readers. Building an enhanced scientific search engine by taking semantic relations into account poses a…

Information Retrieval · Computer Science 2017-09-29 Bastien Latard, Jonathan Weber, Germain Forestier, Michel Hassenforder

An Informational Space Based Semantic Analysis for Scientific Texts

One major problem in Natural Language Processing is the automatic analysis and representation of human language. Human language is ambiguous and deeper understanding of semantics and creating human-to-machine interaction have required an…

Computation and Language · Computer Science 2022-06-01 Neslihan Suzen, Alexander N. Gorban, Jeremy Levesley, Evgeny M. Mirkes

Explainable Semantic Text Relations: A Question-Answering Framework for Comparing Document Content

Understanding semantic relations between two texts is crucial for many information and document management tasks, in which one must determine whether the content fully overlaps, is completely superseded by another document, or overlaps only…

Computation and Language · Computer Science 2025-12-02 Yehudit Aperstein, Alon Gottlib, Gal Benita, Alexander Apartsin