Related papers: Scalable Feature Subset Selection for Big Data usi…

200 papers

Feature subset selection (FSS) using a wrapper approach is essentially a combinatorial optimization problem with two objective functions, namely the cardinality of the selected feature subset, which should be minimized, and the corresponding…

Neural and Evolutionary Computing · Computer Science · 2022-02-09 · Yelleti Vivek, Vadlamani Ravi, P. Radhakrishna

Feature subset selection (FSS) for classification is inherently a bi-objective optimization problem, where the task is to obtain a feature subset which yields the maximum possible area under the receiver operating characteristic curve (AUC)…

Distributed, Parallel, and Cluster Computing · Computer Science · 2022-05-20 · Yelleti Vivek, Vadlamani Ravi, P. Radha Krishna

With the advent of extremely high dimensional datasets, dimensionality reduction techniques are becoming mandatory. Among many techniques, feature selection has been growing in interest as an important tool to identify relevant features on…

Feature selection (FS) has become an indispensable task in dealing with today's highly complex pattern recognition problems with a massive number of features. In this study, we propose a new wrapper approach for FS based on binary…

Machine Learning · Statistics · 2016-03-08 · Vural Aksakalli, Milad Malekipirbazari

CFS (Correlation-Based Feature Selection) is an FS algorithm that has been successfully applied to classification problems in many domains. We describe Distributed CFS (DiCFS) as a completely redesigned, scalable, parallel and distributed…

Machine Learning · Computer Science · 2019-02-01 · Raul-Jose Palma-Mendoza, Luis de-Marcos, Daniel Rodriguez, Amparo Alonso-Betanzos

With the emergence of the big data age, the issue of how to obtain valuable knowledge from a dataset efficiently and accurately has attracted increasing attention from both academia and industry. This paper presents a Parallel Random…

Distributed, Parallel, and Cluster Computing · Computer Science · 2019-11-26 · Jianguo Chen, Kenli Li, Zhuo Tang, Kashif Bilal, Shui Yu, Chuliang Weng, Keqin Li

Recently, many evolutionary computation methods have been developed to solve the feature selection problem. However, the studies focused mainly on small-scale issues, resulting in stagnation issues in local optima and numerical instability…

Neural and Evolutionary Computing · Computer Science · 2021-10-28 · Xubin Wang, Yunhe Wang, Ka-Chun Wong, Xiangtao Li

Data collection for scientific applications is increasing exponentially and is forecasted to soon reach peta- and exabyte scales. Applications which process and analyze scientific data must be scalable and focus on execution performance to…

Instrumentation and Methods for Astrophysics · Physics · 2018-10-09 · Thomas Devine, Katerina Goseva-Popstojanova, Di Pang

Genome sequencing projects are rapidly increasing the number of high-dimensional protein sequence datasets. Clustering a high-dimensional protein sequence dataset using traditional machine learning approaches poses many challenges. Many…

Quantitative Methods · Quantitative Biology · 2022-04-27 · Preeti Jha, Aruna Tiwari, Neha Bharill, Milind Ratnaparkhe, Om Prakash Patel, Nilagiri Harshith, Mukkamalla Mounika, Neha Nagendra

Parallel batch processing machines have extensive applications in the semiconductor manufacturing process. However, the problem models in previous studies regard parallel batch processing as a fixed processing stage in the machining…

Neural and Evolutionary Computing · Computer Science · 2024-09-30 · Feige Liu, Xin Li, Chao Lu, Wenying Gong

Motivation: Alignment-free distance and similarity functions (AF functions, for short) are a well-established alternative to two and multiple sequence alignments for many genomic, metagenomic and epigenomic tasks. Due to data-intensive…

Distributed, Parallel, and Cluster Computing · Computer Science · 2021-10-26 · Umberto Ferraro Petrillo, Francesco Palini, Giuseppe Cattaneo, Raffaele Giancarlo

Detecting rare and diverse anomalies in highly imbalanced datasets, such as Advanced Persistent Threats (APTs) in cybersecurity, remains a fundamental challenge for machine learning systems. Active learning offers a promising direction by…

Machine Learning · Computer Science · 2026-02-04 · Sidahmed Benabderrahmane, Petko Valtchev, James Cheney, Talal Rahwan

Supervised learning algorithms are nowadays successfully scaling up to datasets that are very large in volume, leveraging the potential of in-memory cluster-computing Big Data frameworks. Still, massive datasets with a number of…

Machine Learning · Computer Science · 2018-05-11 · Luca Venturini, Elena Baralis, Paolo Garza

The proliferation of mobile devices, such as smartphones and Internet of Things (IoT) gadgets, results in the recent mobile big data (MBD) era. Collecting MBD is unprofitable unless suitable analytics and learning methods are utilized for…

Distributed, Parallel, and Cluster Computing · Computer Science · 2016-08-16 · Mohammad Abu Alsheikh, Dusit Niyato, Shaowei Lin, Hwee-Pink Tan, Zhu Han

In Machine Learning, the parent set identification problem is to find a set of random variables that best explain a selected variable given the data and some predefined scoring function. This problem is a critical component to structure…

Artificial Intelligence · Computer Science · 2019-01-09 · Subhadeep Karan, Jaroslaw Zola

Many evolutionary algorithms (EAs) take advantage of parallel evaluation of candidates. However, if evaluation times vary significantly, many worker nodes (i.e., compute clients) are idle much of the time, waiting for the next generation…

Neural and Evolutionary Computing · Computer Science · 2024-01-02 · Jason Liang, Hormoz Shahrzad, Risto Miikkulainen

Training deep networks is expensive and time-consuming with the training period increasing with data size and growth in model parameters. In this paper, we provide a framework for distributed training of deep networks over a cluster of CPUs…

Machine Learning · Statistics · 2017-08-22 · Disha Shrivastava, Santanu Chaudhury, Dr. Jayadeva

Learning from imbalanced data is among the most challenging areas in contemporary machine learning. This becomes even more difficult when considered in the context of big data that calls for dedicated architectures capable of high-performance…

Machine Learning · Computer Science · 2022-11-16 · William C. Sleeman, Bartosz Krawczyk

As data volumes grow across applications, analytics of large amounts of data is becoming increasingly important. Big data processing frameworks such as Apache Hadoop, Apache AsterixDB, and Apache Spark have been built to meet this demand. A…

Distributed, Parallel, and Cluster Computing · Computer Science · 2022-12-15 · Avinash Kumar

We propose PESA, a novel approach combining Particle Swarm Optimisation (PSO), Evolution Strategy (ES), and Simulated Annealing (SA) in a hybrid algorithm inspired by reinforcement learning. PESA hybridizes the three algorithms by…

Neural and Evolutionary Computing · Computer Science · 2020-09-21 · Majdi I. Radaideh, Koroush Shirvan