Related papers: Reference String Extraction Using Line-Based Condi…
Citation parsing is fundamental for search engines within academia and the protection of intellectual property. Meticulous extraction is further needed when evaluating the similarity of documents and calculating their citation impact.…
Although information extraction and coreference resolution appear together in many applications, most current systems perform them as ndependent steps. This paper describes an approach to integrated inference for extraction and coreference…
Accurately segmenting a citation string into fields for authors, titles, etc. is a challenging task because the output typically obeys various global constraints. Previous work has shown that modeling soft constraints, where the model is…
In this paper, we show the possibility of using a linear Conditional Random Fields (CRF) for terminology extraction from a specialized text corpus.
A procedure for bibliographic author metadata extraction from scholarly texts is presented. The author segments are identified based on capitalization and line break patterns. Two main author layout templates, which can retrieve from a…
Field normalized citation rates are well-established indicators for research performance from the broadest aggregation levels such as countries, down to institutes and research teams. When applied to still more specialized publication sets…
Citation matching is a challenging task due to different problems such as the variety of citation styles, mistakes in reference strings and the quality of identified reference segments. The classic citation matching configuration used in…
In recent years extracting relevant information from biomedical and clinical texts such as research articles, discharge summaries, or electronic health records have been a subject of many research efforts and shared challenges. Relation…
This paper presents the results of research on supervised extractive text summarisation for scientific articles. We show that a simple sequential tagging model based only on the text within a document achieves high results against a simple…
A key data preparation step in Text Mining, Term Extraction selects the terms, or collocation of words, attached to specific concepts. In this paper, the task of extracting relevant collocations is achieved through a supervised learning…
In this paper we present a general method for information extraction that exploits the features of data compression techniques. We first define and focus our attention on the so-called "dictionary" of a sequence. Dictionaries are…
Citation recommendation describes the task of recommending citations for a given text. Due to the overload of published scientific works in recent years on the one hand, and the need to cite the most appropriate publications when writing…
We present data augmentation techniques for process extraction tasks in scientific publications. We cast the process extraction task as a sequence labeling task where we identify all the entities in a sentence and label them according to…
With the development of Internet technology, the phenomenon of information overload is becoming more and more obvious. It takes a lot of time for users to obtain the information they need. However, keyphrases that summarize document…
Researchers and scientists increasingly find themselves in the position of having to quickly understand large amounts of technical material. Our goal is to effectively serve this need by using bibliometric text mining and summarization…
Relation Extraction (RE) aims to label relations between groups of marked entities in raw text. Most current RE models learn context-aware representations of the target entities that are then used to establish relation between them. This…
Every field of research consists of multiple application areas with various techniques routinely used to solve problems in these wide range of application areas. With the exponential growth in research volumes, it has become difficult to…
Automatically extracting key information from scientific documents has the potential to help scientists work more efficiently and accelerate the pace of scientific progress. Prior work has considered extracting document-level entity…
Most existing work on event extraction has focused on sentence-level texts and presumes the identification of a trigger-span -- a word or phrase in the input that evokes the occurrence of an event of interest. Event arguments are then…
Citation recommendation is intended to assist researchers in the process of searching for relevant papers to cite by recommending appropriate citations for a given input text. Existing test collections for this task are noisy and unreliable…