Related papers: Semantic classifier approach to document classific…
Information on different fields which are collected by users requires appropriate management and organization to be structured in a standard way and retrieved fast and more easily. Document classification is a conventional method to…
We propose to model the text classification process as a sequential decision process. In this process, an agent learns to classify documents into topics while reading the document sentences sequentially and learns to stop as soon as enough…
Comparing document semantics is one of the toughest tasks in both Natural Language Processing and Information Retrieval. To date, on one hand, the tools for this task are still rare. On the other hand, most relevant methods are devised from…
Content based Document Classification is one of the biggest challenges in the context of free text mining. Current algorithms on document classifications mostly rely on cluster analysis based on bag-of-words approach. However that method is…
Text classification is the process of classifying documents into predefined categories based on their content. Existing supervised learning algorithms to automatically classify text need sufficient documents to learn accurately. This paper…
Text alignment finds application in tasks such as citation recommendation and plagiarism detection. Existing alignment methods operate at a single, predefined level and cannot learn to align texts at, for example, sentence and document…
Different semantic interpretation tasks such as text entailment and question answering require the classification of semantic relations between terms or entities within text. However, in most cases it is not possible to assign a direct…
In this work, we formulate \textbf{T}ext \textbf{C}lassification as a \textbf{M}atching problem between the text and the labels, and propose a simple yet effective framework named TCM. Compared with previous text classification approaches,…
Developed so far, multi-document summarization has reached its bottleneck due to the lack of sufficient training data and diverse categories of documents. Text classification just makes up for these deficiencies. In this paper, we propose a…
Sentence extraction based summarization methods has some limitations as it doesn't go into the semantics of the document. Also, it lacks the capability of sentence generation which is intuitive to humans. Here we present a novel method to…
We propose in this paper a new, hybrid document embedding approach in order to address the problem of document similarities with respect to the technical content. To do so, we employ a state-of-the-art graph techniques to first extract the…
Text classification helps analyse texts for semantic meaning and relevance, by mapping the words against this hierarchy. An analysis of various types of texts is invaluable to understanding both their semantic meaning, as well as their…
Insufficient or even unavailable training data of emerging classes is a big challenge of many classification tasks, including text classification. Recognising text documents of classes that have never been seen in the learning stage,…
Automatic text categorization is a complex and useful task for many natural language processing applications. Recent approaches to text categorization focus more on algorithms than on resources involved in this operation. In contrast to…
The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize…
Text document classification is an important task for diverse natural language processing based applications. Traditional machine learning approaches mainly focused on reducing dimensionality of textual data to perform classification. This…
Text categorization is the task of assigning labels to documents written in a natural language, and it has numerous real-world applications including sentiment analysis as well as traditional topic assignment tasks. In this paper, we…
Semantic search technology has received more attention in the last years. Compared with the keyword based search, semantic search is used to excavate the latent semantics information and help users find the information items that they want…
Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement…
This study proposes a text classification algorithm based on large language models, aiming to address the limitations of traditional methods in capturing long-range dependencies, understanding contextual semantics, and handling class…