Related papers: Arabic Text Categorization Algorithm using Vector …

Arabic Text Mining

The rapid growth of the internet has increased the number of online texts. This led to the rapid growth of the number of online texts in the Arabic language. The enormous amount of text must be organized into classes to make the analysis…

Information Retrieval · Computer Science 2022-11-08 Sumaia Mohammed AL-Ghuribi, Shahrul Azman Mohd Noah

Improving Persian Document Classification Using Semantic Relations between Words

With the increase of information, document classification, as one of the methods of text mining, plays a vital role in managing and organizing information. Document classification is the process of assigning a document to one or more…

Information Retrieval · Computer Science 2014-12-30 Saeed Parseh, Ahmad Baraani

Classifying text using machine learning models and determining conversation drift

Text classification helps analyse texts for semantic meaning and relevance, by mapping the words against this hierarchy. An analysis of various types of texts is invaluable to understanding both their semantic meaning, as well as their…

Machine Learning · Computer Science 2022-11-16 Chaitanya Chadha, Vandit Gupta, Deepak Gupta, Ashish Khanna

Machine learning approach for text and document mining

Text Categorization (TC), also known as Text Classification, is the task of automatically classifying a set of text documents into different categories from a predefined set. If a document belongs to exactly one of the categories, it is a…

Information Retrieval · Computer Science 2014-06-09 Vishwanath Bijalwan, Pinki Kumari, Jordan Pascual, Vijay Bhaskar Semwal

Evaluating Various Tokenizers for Arabic Text Classification

The first step in any NLP pipeline is to split the text into individual tokens. The most obvious and straightforward approach is to use words as tokens. However, given a large text corpus, representing all the words is not efficient in…

Computation and Language · Computer Science 2021-09-30 Zaid Alyafeai, Maged S. Al-shaibani, Mustafa Ghaleb, Irfan Ahmad

Text Classification using Artificial Intelligence

Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement…

Information Retrieval · Computer Science 2010-09-28 S. M. Kamruzzaman

Arabic Text Recognition in Video Sequences

In this paper, we propose a robust approach for text extraction and recognition from Arabic news video sequences. The text included in video sequences is an important requirement for indexing and search systems. However, this text is difficult…

Multimedia · Computer Science 2013-08-16 M. Ben Halima, H. Karray, A. M. Alimi

Text Classification using Data Mining

Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement…

Information Retrieval · Computer Science 2010-09-28 S. M. Kamruzzaman, Farhana Haider, Ahmed Ryadh Hasan

Test Model for Text Categorization and Text Summarization

Text Categorization is the task of automatically sorting a set of documents into categories from a predefined set and Text Summarization is a brief and accurate representation of input text such that the output covers the most important…

Information Retrieval · Computer Science 2013-05-14 Khushboo Thakkar, Urmila Shrawankar

Effects of term weighting approach with and without stop words removing on Arabic text classification

Classifying text is a method for categorizing documents into pre-established groups. Text documents must be prepared and represented in a way that is appropriate for the algorithms used for data mining prior to classification. As a result,…

Computation and Language · Computer Science 2024-02-26 Esra'a Alhenawi, Ruba Abu Khurma, Pedro A. Castillo, Maribel G. Arenas

Text Classification with Compression Algorithms

This work concerns a comparison of SVM kernel methods in text categorization tasks. In particular I define a kernel function that estimates the similarity between two objects by computing their compressed lengths. In fact, compression…

Machine Learning · Computer Science 2012-10-30 Antonio Giuliano Zippo

Rational Kernels for Arabic Stemming and Text Classification

In this paper, we address the problems of Arabic Text Classification and stemming using Transducers and Rational Kernels. We introduce a new stemming technique based on the use of Arabic patterns (Pattern Based Stemmer). Patterns are…

Computation and Language · Computer Science 2015-02-27 Attia Nehar, Djelloul Ziadi, Hadda Cherroun

Applying Vector Space Model (VSM) Techniques in Information Retrieval for Arabic Language

Information Retrieval (IR) allows the storage, management, processing and retrieval of information, documents, websites, etc. Building an IR system for any language is imperative. This is evident through the massive conducted efforts to…

Information Retrieval · Computer Science 2018-01-16 Bilal Abu-Salih

Efficient Measuring of Readability to Improve Documents Accessibility for Arabic Language Learners

This paper presents an approach based on supervised machine learning methods to build a classifier that can identify text complexity in order to present Arabic language learners with texts suitable to their levels. The approach is based on…

Computation and Language · Computer Science 2021-09-20 Sadik Bessou, Ghozlane Chenni

Arabic Language Text Classification Using Dependency Syntax-Based Feature Selection

We study the performance of Arabic text classification combining various techniques: (a) tfidf vs. dependency syntax, for feature selection and weighting; (b) class association rules vs. support vector machines, for classification. The…

Computation and Language · Computer Science 2014-10-21 Yannis Haralambous, Yassir Elidrissi, Philippe Lenca

Text Classification: A Perspective of Deep Learning Methods

In recent years, with the rapid development of information on the Internet, the number of complex texts and documents has increased exponentially, which requires a deeper understanding of deep learning methods in order to accurately…

Computation and Language · Computer Science 2023-09-26 Zhongwei Wan

Improving Sentiment Analysis in Arabic Using Word Representation

The complexities of the Arabic language in morphology, orthography and dialects make sentiment analysis for Arabic more challenging. Also, text feature extraction from short messages like tweets, in order to gauge the sentiment, makes this…

Computation and Language · Computer Science 2018-10-17 Abdulaziz M. Alayba, Vasile Palade, Matthew England, Rahat Iqbal

Comparing Lexical and Semantic Vector Search Methods When Classifying Medical Documents

Classification is a common AI problem, and vector search is a typical solution. This transforms a given body of text into a numerical representation, known as an embedding, and modern improvements to vector search focus on optimising speed…

Information Retrieval · Computer Science 2025-06-04 Lee Harris

Automatic Arabic Dialect Identification Systems for Written Texts: A Survey

Arabic dialect identification is a specific task of natural language processing, aiming to automatically predict the Arabic dialect of a given text. Arabic dialect identification is the first step in various natural language processing…

Computation and Language · Computer Science 2020-09-29 Maha J. Althobaiti

Text Classification Algorithms: A Survey

In recent years, there has been an exponential growth in the number of complex documents and texts that require a deeper understanding of machine learning methods to be able to accurately classify texts in many applications. Many machine…

Machine Learning · Computer Science 2020-05-21 Kamran Kowsari, Kiana Jafari Meimandi, Mojtaba Heidarysafa, Sanjana Mendu, Laura E. Barnes, Donald E. Brown