Related papers: CloudSTRUCTURE: infer population STRUCTURE on the …
We present a multi-purpose genetic algorithm, designed and implemented with GPGPU / CUDA parallel computing technology. The model was derived from our CPU serial implementation, named GAME (Genetic Algorithm Model Experiment). It was…
Cloud computing recently developed into a viable alternative to on-premises systems for executing high-performance computing (HPC) applications. With the emergence of new vendors and hardware options, there is now a growing need to…
Advances in genome sequencing technologies generate massive amounts of sequence data that are increasingly analyzed and shared through public repositories. On-demand infrastructure services on cloud computing platforms enable the processing…
CPU is undoubtedly the most important resource of the computer system. Recent advances in software and system architecture have increased processing complexity, as computing is now distributed and parallel. CloudSim represents the…
In this work, we propose a library that enables on a cloud the creation and management of tree data structures from a cloud client. As a proof of concept, we implement a new cloud service CloudTree. With CloudTree, users are able to…
We present a multi-purpose genetic algorithm, designed and implemented with GPGPU / CUDA parallel computing technology. The model was derived from a multi-core CPU serial implementation, named GAME, already scientifically successfully…
In this paper, we provide a novel perspective on the underlying structure of real-world data with ground-truth clusters via characterization of an abundantly observed yet often overlooked density-geometry correlation, that manifests itself…
Crowdwork often entails tackling cognitively-demanding and time-consuming tasks. Crowdsourcing can be used for complex annotation tasks, from medical imaging to geospatial data, and such data powers sensitive applications, such as health…
Clustering is a difficult and widely-studied data mining task, with many varieties of clustering algorithms proposed in the literature. Nearly all algorithms use a similarity measure such as a distance metric (e.g. Euclidean distance) to…
With the increasing number of Machine and Deep Learning applications in High Energy Physics, easy access to dedicated infrastructure represents a requirement for fast and efficient R&D. This work explores different types of cloud services…
Skyline computation is an essential database operation that has many applications in multi-criteria decision making scenarios such as recommender systems. Existing algorithms have focused on checking point domination, which lack efficiency…
With the advantages that cloud computing offers in terms of platform as a service, software as a service, and infrastructure as a service, data engineers and data scientists are able to leverage cloud computing for their ETL/ELT (extract,…
High intensive computation applications can usually take days to months to finish an execution. During this time, it is common to have variations of the available resources when considering that such hardware is usually shared among a…
Hierarchical clustering is a popular unsupervised data analysis method. For many real-world applications, we would like to exploit prior information about the data that imposes constraints on the clustering hierarchy, and is not captured by…
In the last decade the broad scope of complex networks has led to a rapid progress. In this area a particular interest has the study of community structures. The analysis of this type of structure requires the formalization of the intuitive…
This study concerns with the diagnosis of aerospace structure defects by applying a HPC parallel implementation of a novel learning algorithm, named U-BRAIN. The Soft Computing approach allows advanced multi-parameter data processing in…
A first step in exploring population structure in crop plants and other organisms is to define the number of subpopulations that exist for a given data set. The genetic marker data sets being generated have become increasingly large over…
This paper presents a novel application of Genetic Algorithms(GAs) to quantify the performance of Platform as a Service (PaaS), a cloud service model that plays a critical role in both industry and academia. While Cloud benchmarks are not…
As a kind of basic machine learning method, clustering algorithms group data points into different categories based on their similarity or distribution. We present a clustering algorithm by finding hyper-planes to distinguish the data…
Clustering algorithms are often used to find subpopulations in exploratory data analysis workflows. Not only the clusters themselves, but also their shape can represent meaningful subpopulations. In this paper, we present FLASC, an…