Related papers: Improving Persian Document Classification Using Se…

Arabic Text Categorization Algorithm using Vector Evaluation Method

Text categorization is the process of grouping documents into categories based on their contents. This process is important to make information retrieval easier, and it became more important due to the huge textual information available…

Information Retrieval · Computer Science 2015-01-08 Ashraf Odeh, Aymen Abu-Errub, Qusai Shambour, Nidal Turab

Effects of term weighting approach with and without stop words removing on Arabic text classification

Classifying text is a method for categorizing documents into pre-established groups. Text documents must be prepared and represented in a way that is appropriate for the algorithms used for data mining prior to classification. As a result,…

Computation and Language · Computer Science 2024-02-26 Esra'a Alhenawi, Ruba Abu Khurma, Pedro A. Castillo, Maribel G. Arenas

The Challenges of Persian User-generated Textual Content: A Machine Learning-Based Approach

Over recent years a lot of research papers and studies have been published on the development of effective approaches that benefit from a large amount of user-generated content and build intelligent predictive models on top of them. This…

Computation and Language · Computer Science 2021-01-21 Mohammad Kasra Habib

Text Classification using Data Mining

Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement…

Information Retrieval · Computer Science 2010-09-28 S. M. Kamruzzaman, Farhana Haider, Ahmed Ryadh Hasan

PerSum: Novel Systems for Document Summarization in Persian

In this paper we explore the problem of document summarization in Persian language from two distinct angles. In our first approach, we modify a popular and widely cited Persian document summarization framework to see how it works on a…

Computation and Language · Computer Science 2016-06-13 Saeid Parvandeh, Shibamouli Lahiri, Fahimeh Boroumand

Automatic Real-word Error Correction in Persian Text

Automatic spelling correction stands as a pivotal challenge within the ambit of natural language processing (NLP), demanding nuanced solutions. Traditional spelling correction techniques are typically only capable of detecting and…

Computation and Language · Computer Science 2024-07-23 Seyed Mohammad Sadegh Dashti, Amid Khatibi Bardsiri, Mehdi Jafari Shahbazzadeh

A comprehensive study on Frequent Pattern Mining and Clustering categories for topic detection in Persian text stream

Topic detection is a complex process and depends on language because it somehow needs to analyze text. There have been few studies on topic detection in Persian, and the existing algorithms are not remarkable. Therefore, we aimed to study…

Computation and Language · Computer Science 2024-03-18 Elnaz Zafarani-Moattar, Mohammad Reza Kangavari, Amir Masoud Rahmani

Latent Semantic Analysis Approach for Document Summarization Based on Word Embeddings

Since the amount of information on the internet is growing rapidly, it is not easy for a user to find relevant information for his/her query. To tackle this issue, much attention has been paid to Automatic Document Summarization. The key…

Computation and Language · Computer Science 2019-02-05 Kamal Al-Sabahi, Zhang Zuping, Yang Kang

A Deep Learning-Based Approach for Measuring the Domain Similarity of Persian Texts

In this paper, we propose a novel approach for measuring the degree of similarity between categories of two pieces of Persian text, which were published as descriptions of two separate advertisements. We built an appropriate dataset for…

Computation and Language · Computer Science 2019-09-27 Hossein Keshavarz, Shohreh Tabatabayi Seifi, Mohammad Izadi

A Simple and Effective Approach for Fine Tuning Pre-trained Word Embeddings for Improved Text Classification

This work presents a new and simple approach for fine-tuning pretrained word embeddings for text classification tasks. In this approach, the class in which a term appears, acts as an additional contextual variable during the fine tuning…

Computation and Language · Computer Science 2019-12-17 Amr Al-Khatib, Samhaa R. El-Beltagy

Improving Information Retrieval Results for Persian Documents using FarsNet

In this paper, we propose a new method for query expansion, which uses FarsNet (Persian WordNet) to find similar tokens related to the query and expand the semantic meaning of the query. For this purpose, we use synonymy relations in…

Information Retrieval · Computer Science 2018-11-05 Adel Rahimi, Mohammad Bahrani

Text Classification using Artificial Intelligence

Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement…

Information Retrieval · Computer Science 2010-09-28 S. M. Kamruzzaman

Document Relevance Evaluation via Term Distribution Analysis Using Fourier Series Expansion

In addition to the frequency of terms in a document collection, the distribution of terms plays an important role in determining the relevance of documents for a given search query. In this paper, term distribution analysis using Fourier…

Information Retrieval · Computer Science 2009-07-18 Patricio Galeas, Ralph Kretschmer, Bernd Freisleben

A novel approach to sentiment analysis in Persian using discourse and external semantic information

Sentiment analysis attempts to identify, extract and quantify affective states and subjective information from various types of data such as text, audio, and video. Many approaches have been proposed to extract the sentiment of individuals…

Computation and Language · Computer Science 2020-07-21 Rahim Dehkharghani, Hojjat Emami

Text Classification: A Perspective of Deep Learning Methods

In recent years, with the rapid development of information on the Internet, the number of complex texts and documents has increased exponentially, which requires a deeper understanding of deep learning methods in order to accurately…

Computation and Language · Computer Science 2023-09-26 Zhongwei Wan

TF-CR: Weighting Embeddings for Text Classification

Text classification, as the task consisting in assigning categories to textual instances, is a very common task in information science. Methods learning distributed representations of words, such as word embeddings, have become popular in…

Computation and Language · Computer Science 2020-12-15 Arkaitz Zubiaga

Enhancing Pashto Text Classification using Language Processing Techniques for Single And Multi-Label Analysis

Text classification has become a crucial task in various fields, leading to a significant amount of research on developing automated text classification systems for national and international languages. However, there is a growing need for…

Computation and Language · Computer Science 2023-05-08 Mursal Dawodi, Jawid Ahmad Baktash

PEYMA: A Tagged Corpus for Persian Named Entities

The goal in the NER task is to classify proper nouns of a text into classes such as person, location, and organization. This is an important preprocessing step in many NLP tasks such as question-answering and summarization. Although many…

Computation and Language · Computer Science 2018-01-31 Mahsa Sadat Shahshahani, Mahdi Mohseni, Azadeh Shakery, Heshaam Faili

PESTS: Persian_English Cross Lingual Corpus for Semantic Textual Similarity

One of the components of natural language processing that has received a lot of investigation recently is semantic textual similarity. In computational linguistics and natural language processing, assessing the semantic similarity of words,…

Computation and Language · Computer Science 2024-09-06 Mohammad Abdous, Poorya Piroozfar, Behrouz Minaei Bidgoli

Comparative Study of Long Document Classification

The amount of information stored in the form of documents on the internet has been increasing rapidly. Thus it has become a necessity to organize and maintain these documents in an optimum manner. Text classification algorithms study the…

Computation and Language · Computer Science 2022-02-22 Vedangi Wagh, Snehal Khandve, Isha Joshi, Apurva Wani, Geetanjali Kale, Raviraj Joshi