Related papers: Document classification methods

Text Classification: A Perspective of Deep Learning Methods

In recent years, with the rapid development of information on the Internet, the number of complex texts and documents has increased exponentially, which requires a deeper understanding of deep learning methods in order to accurately…

Computation and Language · Computer Science 2023-09-26 Zhongwei Wan

Classifying text using machine learning models and determining conversation drift

Text classification helps analyse texts for semantic meaning and relevance, by mapping the words against this hierarchy. An analysis of various types of texts is invaluable to understanding both their semantic meaning, as well as their…

Machine Learning · Computer Science 2022-11-16 Chaitanya Chadha, Vandit Gupta, Deepak Gupta, Ashish Khanna

Web Content Classification: A Survey

As the information contained within the web is increasing day by day, organizing this information could be a necessary requirement. The data mining process is to extract information from a data set and transform it into an understandable…

Information Retrieval · Computer Science 2014-05-22 Prabhjot Kaur

Text Classification Algorithms: A Survey

In recent years, there has been an exponential growth in the number of complex documents and texts that require a deeper understanding of machine learning methods to be able to accurately classify texts in many applications. Many machine…

Machine Learning · Computer Science 2020-05-21 Kamran Kowsari, Kiana Jafari Meimandi, Mojtaba Heidarysafa, Sanjana Mendu, Laura E. Barnes, Donald E. Brown

A Comprehensive Survey of Text Classification Techniques and Their Research Applications: Observational and Experimental Insights

The exponential growth of textual data presents substantial challenges in management and analysis, notably due to high storage and processing costs. Text classification, a vital aspect of text mining, provides robust solutions by enabling…

Computation and Language · Computer Science 2025-01-22 Kamal Taha, Paul D. Yoo, Chan Yeun, Aya Taha

Text Classification using Data Mining

Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement…

Information Retrieval · Computer Science 2010-09-28 S. M. Kamruzzaman, Farhana Haider, Ahmed Ryadh Hasan

Machine learning approach for text and document mining

Text Categorization (TC), also known as Text Classification, is the task of automatically classifying a set of text documents into different categories from a predefined set. If a document belongs to exactly one of the categories, it is a…

Information Retrieval · Computer Science 2014-06-09 Vishwanath Bijalwan, Pinki Kumari, Jordan Pascual, Vijay Bhaskar Semwal

Text Classification using Artificial Intelligence

Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement…

Information Retrieval · Computer Science 2010-09-28 S. M. Kamruzzaman

Test Model for Text Categorization and Text Summarization

Text Categorization is the task of automatically sorting a set of documents into categories from a predefined set and Text Summarization is a brief and accurate representation of input text such that the output covers the most important…

Information Retrieval · Computer Science 2013-05-14 Khushboo Thakkar, Urmila Shrawankar

The Effectiveness of Classification on Information Retrieval System (Case Study)

Large amount of unstructured designed information is difficult to deal with. Obtaining specific information is a hard mission and takes a lot of time. Information Retrieval System (IR) is a way to solve this kind of problem. IR is a good…

Information Retrieval · Computer Science 2018-04-03 Maher Abdullah, Mohammed GH. I. Al Zamil

Text Classification and Distributional features techniques in Datamining and Warehousing

Text Categorization is traditionally done by using the term frequency and inverse document frequency. This type of method is not very good because, some words which are not so important may appear in the document. The term frequency of…

Information Retrieval · Computer Science 2016-11-25 Srikanth Bethu, G Charless Babu, J Vinoda, E Priyadarshini, M Raghavendra rao

A Novel Approach to Document Classification using WordNet

Content based Document Classification is one of the biggest challenges in the context of free text mining. Current algorithms on document classifications mostly rely on cluster analysis based on bag-of-words approach. However, that method is…

Information Retrieval · Computer Science 2015-12-15 Koushiki Sarkar, Ritwika Law

Machine Learning in Automated Text Categorization

The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize…

Information Retrieval · Computer Science 2021-09-21 Fabrizio Sebastiani

A survey on phrase structure learning methods for text classification

Text classification is a task of automatic classification of text into one of the predefined categories. The problem of text classification has been widely studied in different communities like natural language processing, data mining and…

Computation and Language · Computer Science 2014-06-24 Reshma Prasad, Mary Priya Sebastian

A hybrid learning algorithm for text classification

Text classification is the process of classifying documents into predefined categories based on their content. Existing supervised learning algorithms to automatically classify text need sufficient documents to learn accurately. This paper…

Neural and Evolutionary Computing · Computer Science 2010-09-27 S. M. Kamruzzaman, Farhana Haider

Efficient Classification of Long Documents Using Transformers

Several methods have been proposed for classifying long textual documents using Transformers. However, there is a lack of consensus on a benchmark to enable a fair comparison among different approaches. In this paper, we provide a…

Computation and Language · Computer Science 2022-03-23 Hyunji Hayley Park, Yogarshi Vyas, Kashif Shah

Survey on Multi-Document Summarization: Systematic Literature Review

In this era of information technology, abundant information is available on the internet in the form of web pages and documents on any given topic. Finding the most relevant and informative content out of these huge number of documents,…

Computers and Society · Computer Science 2023-12-21 Uswa Ihsan, Humaira Ashraf, NZ Jhanjhi

A Multi-Modal Multilingual Benchmark for Document Image Classification

Document image classification is different from plain-text document classification and consists of classifying a document by understanding the content and structure of documents such as forms, emails, and other such documents. We show that…

Computation and Language · Computer Science 2023-10-26 Yoshinari Fujinuma, Siddharth Varia, Nishant Sankaran, Srikar Appalaraju, Bonan Min, Yogarshi Vyas

HDLTex: Hierarchical Deep Learning for Text Classification

The continually increasing number of documents produced each year necessitates ever improving information processing methods for searching, retrieving, and organizing text. Central to these information processing methods is document…

Machine Learning · Computer Science 2018-03-29 Kamran Kowsari, Donald E. Brown, Mojtaba Heidarysafa, Kiana Jafari Meimandi, Matthew S. Gerber, Laura E. Barnes

DOC: Deep Open Classification of Text Documents

Traditional supervised learning makes the closed-world assumption that the classes appeared in the test data must have appeared in training. This also applies to text learning or text classification. As learning is used increasingly in…

Computation and Language · Computer Science 2017-09-27 Lei Shu, Hu Xu, Bing Liu