Related papers: An Automated Text Categorization Framework based o…

A pipeline and comparative study of 12 machine learning models for text classification

Text-based communication is highly favoured as a communication method, especially in business environments. As a result, it is often abused by sending malicious messages, e.g., spam emails, to deceive users into relaying personal…

Information Retrieval · Computer Science 2022-04-14 Annalisa Occhipinti , Louis Rogers , Claudio Angione

Text Categorization via Similarity Search: An Efficient and Effective Novel Algorithm

We present a supervised learning algorithm for text categorization which has brought the team of authors the 2nd place in the text categorization division of the 2012 Cybersecurity Data Mining Competition (CDMC'2012) and a 3rd prize…

Information Retrieval · Computer Science 2013-07-11 Hubert Haoyang Duan , Vladimir Pestov , Varun Singla

Machine Learning in Automated Text Categorization

The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize…

Information Retrieval · Computer Science 2021-09-21 Fabrizio Sebastiani

Minimally Supervised Categorization of Text with Metadata

Document categorization, which aims to assign a topic label to each document, plays a fundamental role in a wide variety of applications. Despite the success of existing studies in conventional supervised document classification, they are…

Computation and Language · Computer Science 2023-10-24 Yu Zhang , Yu Meng , Jiaxin Huang , Frank F. Xu , Xuan Wang , Jiawei Han

Text Classification using Data Mining

Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement…

Information Retrieval · Computer Science 2010-09-28 S. M. Kamruzzaman , Farhana Haider , Ahmed Ryadh Hasan

Text Classification using Artificial Intelligence

Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement…

Information Retrieval · Computer Science 2010-09-28 S. M. Kamruzzaman

A survey on phrase structure learning methods for text classification

Text classification is a task of automatic classification of text into one of the predefined categories. The problem of text classification has been widely studied in different communities like natural language processing, data mining and…

Computation and Language · Computer Science 2014-06-24 Reshma Prasad , Mary Priya Sebastian

Many-Class Text Classification with Matching

In this work, we formulate \textbf{T}ext \textbf{C}lassification as a \textbf{M}atching problem between the text and the labels, and propose a simple yet effective framework named TCM. Compared with previous text classification approaches,…

Computation and Language · Computer Science 2022-05-24 Yi Song , Yuxian Gu , Minlie Huang

Text classification in shipping industry using unsupervised models and Transformer based supervised models

Obtaining labelled data in a particular context could be expensive and time consuming. Although different algorithms, including unsupervised learning, semi-supervised learning, self-learning have been adopted, the performance of text…

Computation and Language · Computer Science 2022-12-26 Ying Xie , Dongping Song

Machine learning approach for text and document mining

Text Categorization (TC), also known as Text Classification, is the task of automatically classifying a set of text documents into different categories from a predefined set. If a document belongs to exactly one of the categories, it is a…

Information Retrieval · Computer Science 2014-06-09 Vishwanath Bijalwan , Pinki Kumari , Jordan Pascual , Vijay Bhaskar Semwal

Classifying text using machine learning models and determining conversation drift

Text classification helps analyse texts for semantic meaning and relevance, by mapping the words against this hierarchy. An analysis of various types of texts is invaluable to understanding both their semantic meaning, as well as their…

Machine Learning · Computer Science 2022-11-16 Chaitanya Chadha , Vandit Gupta , Deepak Gupta , Ashish Khanna

Adaptable and Reliable Text Classification using Large Language Models

Text classification is fundamental in Natural Language Processing (NLP), and the advent of Large Language Models (LLMs) has revolutionized the field. This paper introduces an adaptable and reliable text classification paradigm, which…

Computation and Language · Computer Science 2024-12-10 Zhiqiang Wang , Yiran Pang , Yanbin Lin , Xingquan Zhu

A Comprehensive Survey of Text Classification Techniques and Their Research Applications: Observational and Experimental Insights

The exponential growth of textual data presents substantial challenges in management and analysis, notably due to high storage and processing costs. Text classification, a vital aspect of text mining, provides robust solutions by enabling…

Computation and Language · Computer Science 2025-01-22 Kamal Taha , Paul D. Yoo , Chan Yeun , Aya Taha

Low-Resource Fast Text Classification Based on Intra-Class and Inter-Class Distance Calculation

In recent years, text classification methods based on neural networks and pre-trained models have gained increasing attention and demonstrated excellent performance. However, these methods still have some limitations in practical…

Computation and Language · Computer Science 2024-12-16 Yanxu Mao , Peipei Liu , Tiehan Cui , Congying Liu , Datao You

Expanding the Text Classification Toolbox with Cross-Lingual Embeddings

Most work in text classification and Natural Language Processing (NLP) focuses on English or a handful of other languages that have text corpora of hundreds of millions of words. This is creating a new version of the digital divide: the…

Computation and Language · Computer Science 2019-03-28 Meryem M'hamdi , Robert West , Andreea Hossmann , Michael Baeriswyl , Claudiu Musat

Performance Analysis of Supervised Machine Learning Algorithms for Text Classification

The demand for text classification is growing significantly in web searching, data mining, web ranking, recommendation systems, and so many other fields of information and technology. This paper illustrates the text classification process…

Computation and Language · Computer Science 2025-09-03 Sadia Zaman Mishu , S M Rafiuddin

Substituting Data Annotation with Balanced Updates and Collective Loss in Multi-label Text Classification

Multi-label text classification (MLTC) is the task of assigning multiple labels to a given text, and has a wide range of application domains. Most existing approaches require an enormous amount of annotated data to learn a classifier and/or…

Computation and Language · Computer Science 2023-09-26 Muberra Ozmen , Joseph Cotnareanu , Mark Coates

LLM-as-classifier: Semi-Supervised, Iterative Framework for Hierarchical Text Classification using Large Language Models

The advent of Large Language Models (LLMs) has provided unprecedented capabilities for analyzing unstructured text data. However, deploying these models as reliable, robust, and scalable classifiers in production environments presents…

Computation and Language · Computer Science 2025-08-25 Doohee You , Andy Parisi , Zach Vander Velden , Lara Dantas Inojosa

On the Automated Classification of Web Sites

In this paper we discuss several issues related to automated text classification of web sites. We analyze the nature of web content and metadata in relation to requirements for text features. We find that HTML metatags are a good source of…

Information Retrieval · Computer Science 2007-05-23 John M. Pierre

Investigating the Working of Text Classifiers

Text classification is one of the most widely studied tasks in natural language processing. Motivated by the principle of compositionality, large multilayer neural network models have been employed for this task in an attempt to effectively…

Computation and Language · Computer Science 2018-08-07 Devendra Singh Sachan , Manzil Zaheer , Ruslan Salakhutdinov