Related papers: Efficient Resource Oblivious Algorithms for Multic…
We present a deterministic sorting algorithm, SPMS (Sample, Partition, and Merge Sort), that interleaves the partitioning of a sample sort with merging. Sequentially, it sorts $n$ elements in $O(n \log n)$ time cache-obliviously with an…
This paper analyzes the cache miss cost of algorithms when scheduled using randomized work stealing (RWS) in a parallel environment, taking into account the effects of false sharing. First, prior analyses (due to Acar et al.) are extended…
In this paper we present randomized algorithms for sorting and convex hull that achieves optimal performance (for speed-up and cache misses) on the multicore model with private cache model. Our algorithms are cache oblivious and generalize…
We analyze the caching overhead incurred by a class of multithreaded algorithms when scheduled by an arbitrary scheduler. We obtain bounds that match or improve upon the well-known $O(Q+S \cdot (M/B))$ caching cost for the randomized work…
Multi-core processors improve performance, but they can create unpredictability owing to shared resources such as caches interfering. Cache partitioning is used to alleviate the Worst-Case Execution Time (WCET) estimation by isolating the…
Memory bandwidth regulation and cache partitioning are widely used techniques for achieving predictable timing in real-time computing systems. Combined with partitioned scheduling, these methods require careful co-allocation of tasks and…
The latest trends in high-performance computing systems show an increasing demand on the use of a large scale multicore systems in a efficient way, so that high compute-intensive applications can be executed reasonably well. However, the…
Major chip manufacturers have all introduced Multithreaded processors. These processors are used for running a variety of workloads. Efficient resource utilization is an important design aspect in such processors. Particularly, it is…
As secure processors such as Intel SGX (with hyperthreading) become widely adopted, there is a growing appetite for private analytics on big data. Most prior works on data-oblivious algorithms adopt the classical PRAM model to capture…
Scheduling query execution plans is a particularly complex problem in shared-nothing parallel systems, where each site consists of a collection of local time-shared (e.g., CPU(s) or disk(s)) and space-shared (e.g., memory) resources and…
Partitioning graphs into blocks of roughly equal size such that few edges run between blocks is a frequently needed operation in processing graphs. Recently, size, variety, and structural complexity of these networks has grown dramatically.…
Systems for processing big data---e.g., Hadoop, Spark, and massively parallel databases---need to run workloads on behalf of multiple tenants simultaneously. The abundant disk-based storage in these systems is usually complemented by a…
Heterogeneous multi core processors can offer diverse computing capabilities. The efficiency of Market Basket Analysis Algorithm can be improved with heterogeneous multi core processors. Market basket analysis algorithm utilises apriori…
CPU-GPU heterogeneous architectures are now commonly used in a wide variety of computing systems from mobile devices to supercomputers. Maximizing the throughput for multi-programmed workloads on such systems is indispensable as one single…
Randomized parallel algorithms for many fundamental problems achieve optimal linear work in expectation, but upgrading this guarantee to hold with high probability (whp) remains a recurring theoretical challenge. In this paper, we address…
Frigo et al. proposed an ideal cache model and a recursive technique to design sequential cache-efficient algorithms in a cache-oblivious fashion. Ballard et al. pointed out that it is a fundamental open problem to extend the technique to…
A framework is proposed for the design and analysis of \emph{network-oblivious algorithms}, namely, algorithms that can run unchanged, yet efficiently, on a variety of machines characterized by different degrees of parallelism and…
Multithreaded Multi-core processors are prevalent today and are used for solving some of the important problems in computing. Resource imbalance can negatively impact overall performance in such processors. Hence balanced resource…
The goal of ranking and selection (R&S) procedures is to identify the best stochastic system from among a finite set of competing alternatives. Such procedures require constructing estimates of each system's performance, which can be…
Balanced hypergraph partitioning is an NP-hard problem with many applications, e.g., optimizing communication in distributed data placement problems. The goal is to place all nodes across $k$ different blocks of bounded size, such that…