Related papers: Tools for Terminology Processing
We present a set of multilingual text analysis tools that can help analysts in any field explore large document collections quickly in order to determine whether the documents contain information of interest, and to find the…
This paper describes a new system for semi-automatically building, extending and managing a terminological thesaurus, that is, a multilingual terminology dictionary enriched with relationships between the terms themselves. The…
This paper is devoted to the extraction of entities and semantic relations between them from scientific texts, where we consider scientific terms as entities. In this paper, we present a dataset that includes annotations for two tasks and…
This document concerns data readiness in the context of machine learning and Natural Language Processing. It describes how an organization may proceed to identify, make available, validate, and prepare data to facilitate automated analysis…
Automatic term extraction (ATE) is a Natural Language Processing (NLP) task that eases the effort of manually identifying terms from domain-specific corpora by providing a list of candidate terms. As units of knowledge in a specific field…
Historical Document Processing is the process of digitizing written material from the past for future use by historians and other scholars. It incorporates algorithms and software tools from various subfields of computer science, including…
Grading examination papers is a hectic, time- and labor-intensive task that is often subject to inefficiency and bias. This research project is a primitive experiment in the automation of grading of theoretical answers written in…
Consolidated access to current and reliable terms from different subject fields and languages is necessary for content creators and translators. Terminology is also needed in AI applications such as machine translation, speech recognition,…
The project, under industrial funding, presented in this publication aims at the semantic analysis of a normative document describing requirements applicable to electrical appliances. The objective of the project is to build a semantic…
Over the last ten years, we have seen a significant increase in industrial data, tremendous improvement in computational power, and major theoretical advances in machine learning. This opens up an opportunity to use modern machine learning…
Citation analysis is one of the most frequently used methods in research evaluation. We are seeing significant growth in citation analysis through bibliometric metadata, primarily due to the availability of citation databases such as the…
Modern information systems are changing the idea of "data processing" to the idea of "concept processing", meaning that instead of processing words, such systems process semantic concepts which carry meaning and share contexts with other…
The term legal research generally refers to the process of identifying and retrieving appropriate information necessary to support legal decision making from past case records. At present, the process is mostly manual, but some traditional…
Purpose: Terminology is the set of technical words or expressions used in specific contexts, which denotes the core concept in a formal discipline and is usually applied in the fields of machine translation, information retrieval,…
Natural language processing tools have become frequently used in social sciences such as economics, political science, and sociology. Many publications apply topic modeling to elicit latent topics in text corpora and their development over…
Semantic processing is a fundamental research domain in computational linguistics. In the era of powerful pre-trained language models and large language models, the advancement of research in this domain appears to be decelerating. However,…
Automatically recognized terminology is widely used in various domain-specific text processing tasks, such as machine translation, information retrieval or sentiment analysis. However, there is still no agreement on which methods are best…
We present an approach for detecting multiword phrases in mathematical text corpora. The method is based on characteristic features of mathematical terminology. It makes use of a software tool named Lingo, which makes it possible to identify words…
We describe our ongoing research that centres on the application of natural language processing (NLP) to software engineering and systems development activities. In particular, this paper addresses the use of NLP in the requirements…
Autoformalization has emerged as a term referring to the automation of formalization, specifically the formalization of mathematics using interactive theorem provers (proof assistants). Its rapid development has been driven by progress in…