Related papers: Text Ranking and Classification using Data Compres…

Embedding Compression for Text Classification Using Dictionary Screening

In this paper, we propose a dictionary screening method for embedding compression in text classification tasks. The key purpose of this method is to evaluate the importance of each keyword in the dictionary. To this end, we first train a…

Computation and Language · Computer Science 2022-11-24 Jing Zhou , Xinru Jing , Muyu Liu , Hansheng Wang

Semantic Text Compression for Classification

We study semantic compression for text where meanings contained in the text are conveyed to a source decoder, e.g., for classification. The main motivator to move to such an approach of recovering the meaning without requiring exact…

Information Theory · Computer Science 2023-09-20 Emrecan Kutay , Aylin Yener

FastText.zip: Compressing text classification models

We consider the problem of producing compact architectures for text classification, such that the full model fits in a limited amount of memory. After considering different solutions inspired by the hashing literature, we propose a method…

Computation and Language · Computer Science 2016-12-19 Armand Joulin , Edouard Grave , Piotr Bojanowski , Matthijs Douze , Hérve Jégou , Tomas Mikolov

On the role of autocorrelations in texts

The task of finding a criterion allowing to distinguish a text from an arbitrary set of words is rather relevant in itself, for instance, in the aspect of development of means for internet-content indexing or separating signals and noise in…

Computation and Language · Computer Science 2007-10-02 D. V. Lande , A. A. Snarskii

Classifying text using machine learning models and determining conversation drift

Text classification helps analyse texts for semantic meaning and relevance, by mapping the words against this hierarchy. An analysis of various types of texts is invaluable to understanding both their semantic meaning, as well as their…

Machine Learning · Computer Science 2022-11-16 Chaitanya Chadha , Vandit Gupta , Deepak Gupta , Ashish Khanna

Text Classification with Compression Algorithms

This work concerns a comparison of SVM kernel methods in text categorization tasks. In particular I define a kernel function that estimates the similarity between two objects computing by their compressed lengths. In fact, compression…

Machine Learning · Computer Science 2012-10-30 Antonio Giuliano Zippo

Investigating the Working of Text Classifiers

Text classification is one of the most widely studied tasks in natural language processing. Motivated by the principle of compositionality, large multilayer neural network models have been employed for this task in an attempt to effectively…

Computation and Language · Computer Science 2018-08-07 Devendra Singh Sachan , Manzil Zaheer , Ruslan Salakhutdinov

A Novel Approach to Compress Centralized Text Data using Indexed Dictionary

Data compression is very important feature in terms of saving the memory space. In this proposal, an indexed dictionary based compression is used for text data, where the word's reference in dictionary is used for compression. This approach…

Other Computer Science · Computer Science 2015-12-23 Vivek Dimri , Prof. Ranjit Biswas

Text Classification Algorithms: A Survey

In recent years, there has been an exponential growth in the number of complex documents and texts that require a deeper understanding of machine learning methods to be able to accurately classify texts in many applications. Many machine…

Machine Learning · Computer Science 2020-05-21 Kamran Kowsari , Kiana Jafari Meimandi , Mojtaba Heidarysafa , Sanjana Mendu , Laura E. Barnes , Donald E. Brown

Text Classification with Few Examples using Controlled Generalization

Training data for text classification is often limited in practice, especially for applications with many output classes or involving many related classification problems. This means classifiers must generalize from limited evidence, but…

Computation and Language · Computer Science 2020-05-19 Abhijit Mahabal , Jason Baldridge , Burcu Karagol Ayan , Vincent Perot , Dan Roth

Text Classification using Data Mining

Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement…

Information Retrieval · Computer Science 2010-09-28 S. M. Kamruzzaman , Farhana Haider , Ahmed Ryadh Hasan

Integrating a Lexical Database and a Training Collection for Text Categorization

Automatic text categorization is a complex and useful task for many natural language processing applications. Recent approaches to text categorization focus more on algorithms than on resources involved in this operation. In contrast to…

cmp-lg · Computer Science 2008-02-03 Jose Maria Gomez Hidalgo , Manuel de Buenaga Rodriguez

Text classification using machine learning methods

In this paper we present the results of an experiment aimed to use machine learning methods to obtain models that can be used for the automatic classification of products. In order to apply automatic classification methods, we transformed…

Computation and Language · Computer Science 2025-02-28 Bogdan Oancea

Lightweight Conceptual Dictionary Learning for Text Classification Using Information Compression

We propose a novel, lightweight supervised dictionary learning framework for text classification based on data compression and representation. This two-phase algorithm initially employs the Lempel-Ziv-Welch (LZW) algorithm to construct a…

Computation and Language · Computer Science 2024-05-06 Li Wan , Tansu Alpcan , Margreta Kuijper , Emanuele Viterbo

A Comprehensive Survey of Text Classification Techniques and Their Research Applications: Observational and Experimental Insights

The exponential growth of textual data presents substantial challenges in management and analysis, notably due to high storage and processing costs. Text classification, a vital aspect of text mining, provides robust solutions by enabling…

Computation and Language · Computer Science 2025-01-22 Kamal Taha , Paul D. Yoo , Chan Yeun , Aya Taha

Improve Text Classification Accuracy with Intent Information

Text classification, a core component of task-oriented dialogue systems, attracts continuous research from both the research and industry community, and has resulted in tremendous progress. However, existing method does not consider the use…

Computation and Language · Computer Science 2022-12-16 Yifeng Xie

Reranking with Compressed Document Representation

Reranking, the process of refining the output of a first-stage retriever, is often considered computationally expensive, especially with Large Language Models. Borrowing from recent advances in document compression for RAG, we reduce the…

Information Retrieval · Computer Science 2025-05-22 Hervé Déjean , Stéphane Clinchant

Ranking LLMs by compression

We conceptualize the process of understanding as information compression, and propose a method for ranking large language models (LLMs) based on lossless data compression. We demonstrate the equivalence of compression length under…

Artificial Intelligence · Computer Science 2024-06-21 Peijia Guo , Ziguang Li , Haibo Hu , Chao Huang , Ming Li , Rui Zhang

Text Classification using Artificial Intelligence

Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement…

Information Retrieval · Computer Science 2010-09-28 S. M. Kamruzzaman

Text Classification using Association Rule with a Hybrid Concept of Naive Bayes Classifier and Genetic Algorithm

Text classification is the automated assignment of natural language texts to predefined categories based on their content. Text classification is the primary requirement of text retrieval systems, which retrieve texts in response to a user…

Information Retrieval · Computer Science 2010-09-28 S. M. Kamruzzaman , Farhana Haider , Ahmed Ryadh Hasan