Related papers: Improving Term Extraction Using Particle Swarm Opt…
Automatic text summarization tools have a great impact on many fields, such as medicine, law, and scientific research in general. As information overload increases, automatic summaries allow handling the growing volume of documents, usually…
Terminology extraction, also known as term extraction, is a subtask of information extraction. The goal of terminology extraction is to extract relevant words or phrases from a given corpus automatically. This paper focuses on the…
Purpose: Terminology is the set of technical words or expressions used in specific contexts, which denotes the core concept in a formal discipline and is usually applied in the fields of machine translation, information retrieval,…
A key data preparation step in Text Mining, Term Extraction selects the terms, or collocation of words, attached to specific concepts. In this paper, the task of extracting relevant collocations is achieved through a supervised learning…
The project, under industrial funding, presented in this publication aims at the semantic analysis of a normative document describing requirements applicable to electrical appliances. The objective of the project is to build a semantic…
Studies of different term extractors on a corpus of the biomedical domain revealed decreasing performances when applied to highly technical texts. The difficulty or impossibility of customising them to new domains is an additional…
Automated terminology extraction refers to the task of extracting meaningful terms from domain-specific texts. This paper proposes a novel machine learning approach to terminology extraction, which combines features from traditional term…
Particle swarm optimization is a popular method for solving difficult optimization problems. There have been attempts to formulate the method in formal probabilistic or stochastic terms (e.g. bare bones particle swarm) with the aim to…
This report presents an empirical evaluation of four algorithms for automatically extracting keywords and keyphrases from documents. The four algorithms are compared using five different collections of documents. For each document, we have…
Automatic term extraction (ATE) is a Natural Language Processing (NLP) task that eases the effort of manually identifying terms from domain-specific corpora by providing a list of candidate terms. As units of knowledge in a specific field…
Keyword extraction is the process of identifying the words or phrases that express the main concepts of text to the best of one's ability. Electronic infrastructure creates a considerable amount of text every day and at all times. This…
Agriculture is a key component in any country's development. Domain-specific knowledge resources serve to gain insight into the domain. Existing knowledge resources such as AGROVOC and NAL Thesaurus are developed and maintained by the…
Keyphrase extraction is a textual information processing task concerned with the automatic extraction of representative and characteristic phrases from a document that express all the key aspects of its content. Keyphrases constitute a…
Tracking developments in the highly dynamic data-technology landscape are vital to keeping up with novel technologies and tools, in the various areas of Artificial Intelligence (AI). However, It is difficult to keep track of all the…
We present data augmentation techniques for process extraction tasks in scientific publications. We cast the process extraction task as a sequence labeling task where we identify all the entities in a sentence and label them according to…
Keyword extraction is a fundamental task in natural language processing that facilitates mapping of documents to a concise set of representative single and multi-word phrases. Keywords from text documents are primarily extracted using…
In the past few decades, there has been an explosion in the amount of available data produced from various sources with different topics. The availability of this enormous data necessitates us to adopt effective computational tools to…
A typical IR system that delivers and stores information is affected by problem of matching between user query and available content on web. Use of Ontology represents the extracted terms in form of network graph consisting of nodes, edges,…
In recent years, many recommender systems have utilized textual data for topic extraction to enhance interpretability. However, our findings reveal a noticeable deficiency in the coherence of keywords within topics, resulting in low…
Keyphrase extraction from a given document is the task of automatically extracting salient phrases that best describe the document. This paper proposes a novel unsupervised graph-based ranking method to extract high-quality phrases from a…