Related papers: Document Classification Using Distributed Machine …
Data sizes that cannot be processed by conventional data storage and analysis systems are named as Big Data.It also refers to nex technologies developed to store, process and analyze large amounts of data. Automatic information retrieval…
Text Document classification aims in associating one or more predefined categories based on the likelihood suggested by the training set of labeled documents. Many machine learning algorithms play a vital role in training the system with…
In this study, book summaries and categories taken from book sites were classified using word embedding methods, natural language processing techniques and machine learning algorithms. In addition, one hot encoding, Word2Vec and Term…
A public dataset, with a variety of properties suitable for sentiment analysis [1], event prediction, trend detection and other text mining applications, is needed in order to be able to successfully perform analysis studies. The vast…
The readability assessment deals with estimating the level of difficulty in reading texts.Many readability tests, which do not indicate execution efficiency, have been applied on specific texts to measure the reading grade level in science…
Text classification systems will help to solve the text clustering problem in the Azerbaijani language. There are some text-classification applications for foreign languages, but we tried to build a newly developed system to solve this…
Text classification has seen an increased use in both academic and industry settings. Though rule based methods have been fairly successful, supervised machine learning has been shown to be most successful for most languages, where most…
Recently, there are unprecedented data growth originating from different online platforms which contribute to big data in terms of volume, velocity, variety and veracity (4Vs). Given this nature of big data which is unstructured, performing…
In this paper we present the results of an experiment aimed to use machine learning methods to obtain models that can be used for the automatic classification of products. In order to apply automatic classification methods, we transformed…
Text document classification is an important task for diverse natural language processing based applications. Traditional machine learning approaches mainly focused on reducing dimensionality of textual data to perform classification. This…
The demand for text classification is growing significantly in web searching, data mining, web ranking, recommendation systems, and so many other fields of information and technology. This paper illustrates the text classification process…
False news has received attention from both the general public and the scholarly world. Such false information has the ability to affect public perception, giving nefarious groups the chance to influence the results of public events like…
This article presents an evaluation of several machine learning methods applied to automated text classification, alongside the design of a demonstrative system for unbalanced document categorization and distribution. The study focuses on…
Fake news detection research is still in the early stage as this is a relatively new phenomenon in the interest raised by society. Machine learning helps to solve complex problems and to build AI systems nowadays and especially in those…
There have been many recent advances in the structure and measurement of distributed language models: those that map from words to a vector-space that is rich in information about word choice and composition. This vector-space is the…
Text classification is the process of classifying documents into predefined categories based on their content. Existing supervised learning algorithms to automatically classify text need sufficient documents to learn accurately. This paper…
Document categorization is a technique where the category of a document is determined. In this paper three well-known supervised learning techniques which are Support Vector Machine(SVM), Na\"ive Bayes(NB) and Stochastic Gradient…
Efficient distributed numerical word representation models (word embeddings) combined with modern machine learning algorithms have recently yielded considerable improvement on automatic document classification tasks. However, the…
The effective utilization at scale of complex machine learning (ML) techniques for HEP use cases poses several technological challenges, most importantly on the actual implementation of dedicated end-to-end data pipelines. A solution to…
In software industries, individuals at different levels from customer to an engineer apply diverse mechanisms to detect to which class a particular bug should be allocated. Sometimes while a simple search in Internet might help, in many…