Related papers: The Parallel Persistent Memory Model

Emulating a large memory with a collection of small ones

Sequential computation is well understood but does not scale well with current technology. Within the next decade, systems will contain large numbers of processors with potentially thousands of processors per chip. Despite this, many…

Hardware Architecture · Computer Science 2015-11-17 James Hanlon

The Impact of Memory Models on Software Reliability in Multiprocessors

The memory consistency model is a fundamental system property characterizing a multiprocessor. The relative merits of strict versus relaxed memory models have been widely debated in terms of their impact on performance, hardware complexity…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-04-07 Alexander Jaffe, Thomas Moscibroda, Laura Effinger-Dean, Luis Ceze, Karin Strauss

Deterministic Computations on a PRAM with Static Processor and Memory Faults

We consider a Parallel Random Access Machine (PRAM) in which some processors and memory cells are faulty. The faults considered are static, i.e., once the machine starts to operate, the operational/faulty status of PRAM components does not…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-01-12 Bogdan S. Chlebus, Leszek Gasieniec, Andrzej Pelc

Embarrassingly Parallel Time Series Analysis for Large Scale Weak Memory Systems

Second order stationary models in time series analysis are based on the analysis of essential statistics whose computations follow a common pattern. In particular, with a map-reduce nomenclature, most of these operations can be modeled as…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-11-23 Francois Belletti, Evan Sparks, Michael Franklin, Alexandre M. Bayen

Deterministic Consistency: A Programming Model for Shared Memory Parallelism

The difficulty of developing reliable parallel software is generating interest in deterministic environments, where a given program and input can yield only one possible result. Languages or type systems can enforce determinism in new code,…

Operating Systems · Computer Science 2010-02-01 Amittai Aviram, Bryan Ford

Extending the Nested Parallel Model to the Nested Dataflow Model with Provably Efficient Schedulers

The nested parallel (a.k.a. fork-join) model is widely used for writing parallel programs. However, the two composition constructs, i.e. "$\parallel$" (parallel) and "$;$" (serial), are insufficient in expressing "partial dependencies" or…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-02-16 David Dinh, Harsha Vardhan Simhadri, Yuan Tang

Fast Concurrent Primitives Despite Contention

We study the problem of constructing concurrent objects in a setting where $P$ processes run in parallel and interact through a shared memory that is subject to write contention. Our goal is to transform hardware primitives that are subject…

Data Structures and Algorithms · Computer Science 2026-04-17 Michael A. Bender, Guy E. Blelloch, Martin Farach-Colton, Yang Hu, Rob Johnson, Rotem Oshman, Renfei Zhou

Efficient Distributed Quantum Computing

We provide algorithms for efficiently addressing quantum memory in parallel. These imply that the standard circuit model can be simulated with low overhead by the more realistic model of a distributed quantum computer. As a result, the…

Quantum Physics · Physics 2013-03-13 Robert Beals, Stephen Brierley, Oliver Gray, Aram Harrow, Samuel Kutin, Noah Linden, Dan Shepherd, Mark Stather

Algorithms in the Ultra-Wide Word Model

The effective use of parallel computing resources to speed up algorithms in current multi-core parallel architectures remains a difficult challenge, with ease of programming playing a key role in the eventual success of various parallel…

Data Structures and Algorithms · Computer Science 2014-12-09 Arash Farzan, Alejandro López-Ortiz, Patrick K. Nicholson, Alejandro Salinger

A sufficient condition for a linear speedup in competitive parallel computing

In competitive parallel computing, the identical copies of a code in a phase of a sequential program are assigned to processor cores and the result of the fastest core is adopted. In the literature, it is reported that a superlinear speedup…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-08-22 Naoki Yonezawa

Lectures on Parallel Computing

These lecture notes are designed to accompany an imaginary, virtual, undergraduate, one or two semester course on fundamentals of Parallel Computing as well as to serve as background and reference for graduate courses on High-Performance…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-02 Jesper Larsson Träff

Delay-Free Concurrency on Faulty Persistent Memory

Non-volatile memory (NVM) promises persistent main memory that remains correct despite loss of power. This has sparked a line of research into algorithms that can recover from a system crash. Since caches are expected to remain volatile,…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-06-22 Naama Ben-David, Guy E. Blelloch, Michal Friedman, Yuanhao Wei

Methods for Partitioning Data to Improve Parallel Execution Time for Sorting on Heterogeneous Clusters

The aim of the paper is to introduce general techniques for optimizing the parallel execution time of sorting on distributed architectures with processors of various speeds. Such an application requires a partitioning step. For…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-08-16 Christophe Cérin, Jean-Christophe Dubacq, Jean-Louis Roch, the SafeScale Collaboration

Efficient Parallel Simulations of Asynchronous Cellular Arrays

A definition for a class of asynchronous cellular arrays is proposed. An example of such asynchrony would be independent Poisson arrivals of cell iterations. The Ising model in the continuous time formulation of Glauber falls into this…

Distributed, Parallel, and Cluster Computing · Computer Science 2007-05-23 Boris D. Lubachevsky

Regional Consistency: Programmability and Performance for Non-Cache-Coherent Systems

Parallel programmers face the often irreconcilable goals of programmability and performance. HPC systems use distributed memory for scalability, thereby sacrificing the programmability advantages of shared memory programming models.…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-01-21 Bharath Ramesh, Calvin J. Ribbens, Srinidhi Varadarajan

A Problem-Specific Fault-Tolerance Mechanism for Asynchronous, Distributed Systems

The idle computers on a local area, campus area, or even wide area network represent a significant computational resource---one that is, however, also unreliable, heterogeneous, and opportunistic. This type of resource has been used…

Distributed, Parallel, and Cluster Computing · Computer Science 2007-05-23 Adriana Iamnitchi, Ian Foster

An Easy-to-use Scalable Framework for Parallel Recursive Backtracking

Supercomputers are equipped with an increasingly large number of cores to use computational power as a way of solving problems that are otherwise intractable. Unfortunately, getting serial algorithms to run in parallel to take advantage of…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-12-31 Faisal N. Abu-Khzam, Khuzaima Daudjee, Amer E. Mouawad, Naomi Nishimura

A Scalable Shared-Memory Parallel Simplex for Large-Scale Linear Programming

The Simplex tableau has been broadly used and investigated in industry and academia. With the advent of the big data era, ever larger problems are posed to be solved on ever larger machines whose architecture type did not exist in the…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-05-29 Demetrios Coutinho, Felipe O. Lins e Silva, Daniel Aloise, Samuel Xavier-de-Souza

On the Hardness of Massively Parallel Computation

We investigate whether there are inherent limits of parallelization in the (randomized) massively parallel computation (MPC) model by comparing it with the (sequential) RAM model. As our main result, we show the existence of hard functions…

Data Structures and Algorithms · Computer Science 2020-08-18 Kai-Min Chung, Kuan-Yi Ho, Xiaorui Sun

Memory Models for C/C++ Programmers

The memory model is the crux of the concurrency semantics of shared-memory systems. It defines the possible values that a read operation is allowed to return for any given set of write operations performed by a concurrent program, thereby…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-03-14 Manuel Pöter, Jesper Larsson Träff