Related papers: Cache-Oblivious Selection in Sorted X+Y Matrices
In this paper we consider sorting in the cache-oblivious model of Frigo, Leiserson, Prokop, and Ramachandran (1999). We introduce a new simple sorting algorithm in that model which has asymptotically optimal IO complexity $O(\frac{n}{B}…
We use soft heaps to obtain simpler optimal algorithms for selecting the $k$-th smallest item, and the set of $k$ smallest items, from a heap-ordered tree, from a collection of sorted lists, and from $X+Y$, where $X$ and $Y$ are two…
We present data-oblivious algorithms in the external-memory model for compaction, selection, and sorting. Motivation for such problems comes from clients who use outsourced data storage services and wish to mask their data access patterns.…
While a lot of work in theoretical computer science has gone into optimizing the runtime and space usage of data structures, such work often neglects an important component of modern computers: the cache. In doing so, very often,…
We present two cache-oblivious sorting-based convex hull algorithms in the Binary Forking Model. The first is an algorithm for a presorted set of points which achieves $O(n)$ work, $O(\log n)$ span, and $O(n/B)$ serial cache complexity,…
We propose conceptually simple oblivious sort and oblivious random permutation algorithms, called bucket oblivious sort and bucket oblivious random permutation. Bucket oblivious sort uses $6n\log n$ time (measured by the number of memory…
Given a string $S[1..N]$ and an integer $k$, the suffix selection problem is to determine the $k$th lexicographically smallest amongst the suffixes $S[i..N]$, $1 \leq i \leq N$. We study the suffix selection problem in the cache-aware…
A mesh is a graph that divides physical space into regularly-shaped regions. Mesh computations form the basis of many applications, e.g. finite-element methods, image rendering, and collision detection. In one important mesh primitive,…
We present priority queues in the cache-oblivious external memory model with block size $B$ and main memory size $M$ that support, on $N$ elements, the operation UPDATE (a combination of INSERT and DECREASEKEY) in $O…
In the multiple-selection problem one is given an unsorted array $S$ of $N$ elements and an array of $q$ query ranks $r_1<\cdots<r_q$, and the task is to return, in sorted order, the $q$ elements in $S$ of rank $r_1, \ldots, r_q$,…
Frigo et al. proposed an ideal cache model and a recursive technique to design sequential cache-efficient algorithms in a cache-oblivious fashion. Ballard et al. pointed out that it is a fundamental open problem to extend the technique to…
In many applications, it is of interest to approximate data, given by an $m \times n$ matrix $A$, by a matrix $B$ of rank at most $k$, where $k$ is much smaller than $m$ and $n$. The best approximation is given by singular value decomposition, which is too time…
In an array of N elements, M positions and M elements are "marked". We show how to permute the elements in the array so that all marked elements end in marked positions, in time O(N) (in the standard word-RAM model), deterministically, and…
We consider the problem of laying out a tree with fixed parent/child structure in hierarchical memory. The goal is to minimize the expected number of block transfers performed during a search along a root-to-leaf path, subject to a given…
We show that several versions of Floyd and Rivest's algorithm Select for finding the $k$th smallest of $n$ elements require at most $n+\min\{k,n-k\}+o(n)$ comparisons on average and with high probability. This rectifies the analysis of…
We show that several versions of Floyd and Rivest's algorithm Select for finding the $k$th smallest of $n$ elements require at most $n+\min\{k,n-k\}+o(n)$ comparisons on average and with high probability. This rectifies the analysis of…
Selection on the Cartesian product is a classic problem in computer science. Recently, an optimal algorithm for selection on $X+Y$, based on soft heaps, was introduced. By combining this approach with layer-ordered heaps (LOHs), an…
Classic cache-oblivious parallel matrix multiplication algorithms achieve optimality either in time or space, but not both, which has prompted much research on the best possible balance or tradeoff of such algorithms. We study modern…
Selection on and sorting of the Cartesian sum, $X+Y$, are classic and important problems. Here, a new algorithm is presented, which generates the top $k$ values of the form $X_i+Y_j$. The algorithm relies only on median-of-medians and is simple…
We investigate effects of ordering in blocked matrix-matrix multiplication. We find that submatrices do not have to be stored contiguously in memory to achieve near-optimal performance. Instead it is the choice of execution order of the…
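
Several of the abstracts above concern selecting the $k$ smallest values of $X+Y$. For orientation only, here is a minimal Python sketch of the textbook binary-heap baseline, which takes $O(k \log k)$ time once both inputs are sorted. It is not the soft-heap, layer-ordered-heap, median-of-medians, or cache-oblivious algorithm of any listed paper, and the function name `k_smallest_pairwise_sums` is a hypothetical name chosen for illustration.

```python
import heapq


def k_smallest_pairwise_sums(xs, ys, k):
    """Return the k smallest values xs[i] + ys[j], in nondecreasing order.

    Assumption: xs and ys are each already sorted in nondecreasing order.
    This is a plain heap-based baseline, not an optimal O(n + k) method.
    """
    if not xs or not ys or k <= 0:
        return []

    # Heap entries are (sum, i, j) for the cell (i, j) of the implicit
    # matrix xs[i] + ys[j]. Every cell has exactly one "parent":
    # (i, j-1) when j > 0, and (i-1, 0) when j == 0, so no cell is
    # pushed twice and no visited set is needed.
    heap = [(xs[0] + ys[0], 0, 0)]
    out = []
    while heap and len(out) < k:
        s, i, j = heapq.heappop(heap)
        out.append(s)
        if j + 1 < len(ys):
            heapq.heappush(heap, (xs[i] + ys[j + 1], i, j + 1))
        if j == 0 and i + 1 < len(xs):
            heapq.heappush(heap, (xs[i + 1] + ys[0], i + 1, 0))
    return out


if __name__ == "__main__":
    print(k_smallest_pairwise_sums([1, 4, 7], [2, 3, 10], 4))
```

Because each parent's sum is no larger than its child's, the heap always holds the next smallest unreported sum, and it never contains more than $k+1$ entries after $k$ pops.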