Related papers: MergeShuffle: A Very Fast, Parallel Random Permuta…
Shuffling is the process of rearranging a sequence of elements into a random order such that any permutation occurs with equal probability. It is an important building block in a plethora of techniques used in virtually all scientific…
Frequently, randomly organized data is needed to avoid an anomalous operation of other algorithms and computational processes. An analogy is that a deck of cards is ordered within the pack, but before a game of poker or solitaire the deck…
Linear-time algorithms that are traditionally used to shuffle data on CPUs, such as the method of Fisher-Yates, are not well suited to implementation on GPUs due to inherent sequential dependencies, and existing parallel shuffling…
In this paper, we present several improvements in the parallelization of the in-place merge algorithm, which merges two contiguous sorted arrays into one with an O(T) space complexity (where T is the number of threads). The approach divides…
Mergesort is one of the few efficient sorting algorithms and, despite being the oldest one, often still the method of choice today. In contrast to some alternative algorithms, it always runs efficiently using O(n log n) element comparisons…
Data structures for efficient sampling from a set of weighted items are an important building block of many applications. However, few parallel solutions are known. We close many of these gaps both for shared-memory and distributed-memory…
We consider the problem of sampling $n$ numbers from the range $\{1,\ldots,N\}$ without replacement on modern architectures. The main result is a simple divide-and-conquer scheme that makes sequential algorithms more cache efficient and…
We show that any permutation of ${1,2,...,N}$ can be written as the product of two involutions. As a consequence, any permutation of the elements of an array can be performed in-place in parallel in time O(1). In the case where the…
In this paper we present a random shuffling scheme to apply with adaptive sorting algorithms. Adaptive sorting algorithms utilize the presortedness present in a given sequence. We have probabilistically increased the amount of presortedness…
Random Reshuffling (RR) is an algorithm for minimizing finite-sum functions that utilizes iterative gradient descent steps in conjunction with data reshuffling. Often contrasted with its sibling Stochastic Gradient Descent (SGD), RR is…
Modern parallel computing devices, such as the graphics processing unit (GPU), have gained significant traction in scientific and statistical computing. They are particularly well-suited to data-parallel algorithms such as the particle…
Merging two sorted arrays is a prominent building block for sorting and other functions. Its efficient parallelization requires balancing the load among compute cores, minimizing the extra work brought about by parallelization, and…
The redundancy of Convolutional neural networks not only depends on weights but also depends on inputs. Shuffling is an efficient operation for mixing channel information but the shuffle order is usually pre-defined. To reduce the…
Many parallel algorithms which solve basic problems in computer science use auxiliary space linear in the input to facilitate conflict-free computation. There has been significant work on improving these parallel algorithms to be in-place,…
This note makes an observation that significantly simplifies a number of previous parallel, two-way merge algorithms based on binary search and sequential merge in parallel. First, it is shown that the additional merge step of distinguished…
We develop a novel parallel resampling algorithm for fully parallelized particle filters, which is designed with GPUs (graphics processing units) or similar parallel computing devices in mind. With our new algorithm, a full cycle of…
We reconsider a recently published algorithm (Dalkilic et al.) for merging lists by way of the perfect shuffle. The original publication gave only experimental results which, although consistent with linear execution time on the samples…
Random graphs (or networks) have gained a significant increase of interest due to its popularity in modeling and simulating many complex real-world systems. Degree sequence is one of the most important aspects of these systems. Random…
Random reshuffling, which randomly permutes the dataset each epoch, is widely adopted in model training because it yields faster convergence than with-replacement sampling. Recent studies indicate greedily chosen data orderings can further…
We present a simple, work-optimal and synchronization-free solution to the problem of stably merging in parallel two given, ordered arrays of m and n elements into an ordered array of m+n elements. The main contribution is a new, simple,…