Related papers: BSP Sorting: An experimental Study

200 papers

We propose new sequential sorting operations by adapting techniques and methods used for designing parallel sorting algorithms. Although the norm is to parallelize a sequential algorithm to improve performance, we adapt a contrarian…

Data Structures and Algorithms · Computer Science 2016-09-01 Alexandros V. Gerbessiotis

We investigate distributed memory parallel sorting algorithms that scale to the largest available machines and are robust with respect to input size and distribution of the input elements. The main outcome is that four sorting algorithms…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-01-17 Michael Axtmann, Peter Sanders

We present and evaluate GPU Bucket Sort, a parallel deterministic sample sort algorithm for many-core GPUs. Our method is considerably faster than Thrust Merge (Satish et al., Proc. IPDPS 2009), the best comparison-based sorting algorithm…

Distributed, Parallel, and Cluster Computing · Computer Science 2010-02-25 Frank Dehne, Hamidreza Zaboli

This paper introduces a novel K-means clustering algorithm, an advancement on the conventional Big-means methodology. The proposed method efficiently integrates parallel processing, stochastic sampling, and competitive optimization to…

Machine Learning · Computer Science 2024-03-28 Rustam Mussabayev, Ravil Mussabayev

Previous parallel sorting algorithms do not scale to the largest available machines, since they either have prohibitive communication volume or prohibitive critical path length. We describe algorithms that are a viable compromise and…

Data Structures and Algorithms · Computer Science 2015-02-26 Michael Axtmann, Timo Bingmann, Peter Sanders, Christian Schulz

Most machine learning and deep neural network algorithms rely on certain iterative algorithms to optimise their utility/cost functions, e.g. Stochastic Gradient Descent. In distributed learning, the networked nodes have to work…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-10-06 Liang Wang, Ben Catterall, Richard Mortier

The bulk synchronous parallel (BSP) is a celebrated synchronization model for general-purpose parallel computing that has successfully been employed for distributed training of machine learning models. A prevalent shortcoming of the BSP is…

Machine Learning · Computer Science 2020-01-07 Xing Zhao, Manos Papagelis, Aijun An, Bao Xin Chen, Junfeng Liu, Yonggang Hu

Chance constrained program is computationally intractable due to the existence of chance constraints, which are randomly disturbed and should be satisfied with a probability. This paper proposes a two-layer randomized algorithm to address…

Optimization and Control · Mathematics 2019-11-11 Xun Shen, Jiancang Zhuang, Xingguo Zhang

Integer sorting on multicores and GPUs can be realized by a variety of approaches that include variants of distribution-based methods such as radix-sort, comparison-oriented algorithms such as deterministic regular sampling and random…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-08-31 Alexandros V. Gerbessiotis

Machine learning models, and deep neural networks in particular, are increasingly deployed in risk-sensitive domains such as healthcare, environmental forecasting, and finance, where reliable quantification of predictive uncertainty is…

Machine Learning · Computer Science 2026-04-07 Asena Karolin Özdemir, Lars H. Heyen, Arvid Weyrauch, Achim Streit, Markus Götz, Charlotte Debus

In this paper, we study randomized methods for feedback design of uncertain systems. The first contribution is to derive the sample complexity of various constrained control problems. In particular, we show the key role played by the…

Systems and Control · Computer Science 2014-07-22 T. Alamo, R. Tempo, A. Luque, D. R. Ramirez

In this paper we develop optimal algorithms in the binary-forking model for a variety of fundamental problems, including sorting, semisorting, list ranking, tree contraction, range minima, and ordered set union, intersection and difference.…

Data Structures and Algorithms · Computer Science 2020-06-26 Guy E. Blelloch, Jeremy T. Fineman, Yan Gu, Yihan Sun

We consider learning problems over training sets in which both the number of training examples and the dimension of the feature vectors are large. To solve these problems, we propose the random parallel stochastic algorithm (RAPSA). We…

Machine Learning · Computer Science 2016-06-17 Aryan Mokhtari, Alec Koppel, Alejandro Ribeiro

The goal of ranking and selection (R&S) procedures is to identify the best stochastic system from among a finite set of competing alternatives. Such procedures require constructing estimates of each system's performance, which can be…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-06-17 Eric C. Ni, Dragos F. Ciocan, Shane G. Henderson, Susan R. Hunter

We consider learning problems over training sets in which both the number of training examples and the dimension of the feature vectors are large. To solve these problems, we propose the random parallel stochastic algorithm (RAPSA). We…

Machine Learning · Computer Science 2016-03-23 Aryan Mokhtari, Alec Koppel, Alejandro Ribeiro

This work aims to improve the sample efficiency of parallel large-scale ranking and selection (R&S) problems by leveraging correlation information. We modify the commonly used "divide and conquer" framework in parallel computing by adding a…

Methodology · Statistics 2026-02-16 Zishi Zhang, Yijie Peng

Semisort is a fundamental algorithmic primitive widely used in the design and analysis of efficient parallel algorithms. It takes input as an array of records and a function extracting a key per record, and reorders them so that…

Data Structures and Algorithms · Computer Science 2023-04-21 Xiaojun Dong, Yunshu Wu, Zhongqi Wang, Laxman Dhulipala, Yan Gu, Yihan Sun

There have been many proposals for sorting integers on multicores/GPUs that include radix-sort and its variants or other approaches that exploit specialized hardware features of a particular multicore architecture. Comparison-based…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-09-01 Alexandros V. Gerbessiotis

In this paper we present a deterministic parallel algorithm solving the multiple selection problem in the congested clique model. In this problem, for a given set of elements S and a set of ranks $K = \{k_1, k_2, \ldots, k_r\}$ we are asking for…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-11-21 Krzysztof Nowicki

The bulk synchronous parallel (BSP) model struggles with irregular workloads due to rigid global communication. While fine-grained asynchronous BSP (FA-BSP) improves overlap, existing implementations typically rely on a limiting…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-26 Minyu Cheng, Jiakun Yan, Marc Snir