Related papers: Amdahl's and Gustafson-Barsis laws revisited
In this work I present a generalization of Amdahl's law on the limits of a parallel implementation with many processors. In particular I establish some mathematical relations involving the number of processors and the dimension of the…
The problem of learning parallel computer performance is investigated in the context of multicore processors. Given a fixed workload, the effect of varying system configuration on performance is sought. Conventionally, the performance…
The paper explains why Amdahl's Law should be interpreted specifically for distributed parallel systems and why it generated so many debates, discussions, and abuses. We set up a general model and list many of the terms affecting parallel…
This paper reinterprets Amdahl's law in terms of execution time and applies this simple model to supercomputing. The systematic discussion yields practical formulas for calculating the expected running time using a large number of…
A parallel program can be represented as a directed acyclic graph. An important performance bound is the time to execute the critical path through the graph. We show how this performance metric is related to Amdahl speedup and the degree of…
This work analyses the effects of sequential-to-parallel synchronization and inter-core communication on multicore performance, speedup and scaling. A modification of Amdahl's law is formulated to reflect the finding that parallel speedup is…
This paper summarizes the author's research on the topologies of parallel computing systems and the tasks solved with them, including the corresponding modeling tools. The original topological model of such systems is…
Efficient engineered systems require scalability. A scalable system has increasing performance with increasing system size. In an ideal case, the increase in performance (e.g., speedup) corresponds to the number of units that are added to…
Designing concurrent data structures should follow some basic rules. By separating the algorithms into two phases, we present guidelines for scalable data structures, with an analysis model based on Amdahl's law. To the best of our…
A modification of Amdahl's law for the case of an increasing number of processor elements in a computer system is considered. The coefficient $k$ linking accelerations of parallel and parallel specialized computer systems is determined. The limiting…
In high performance computing environments, we observe an ongoing increase in the available number of cores. This development calls for re-emphasizing performance (scalability) analysis and speedup laws as suggested in the literature…
Classical Amdahl's Law conceptualized the limit of speedup for an era of fixed serial-parallel decomposition and homogeneous replication. Modern heterogeneous systems need a different conceptual framework: constrained resources must be…
With the spread of multi- and many-core processors, an increasingly typical task is to re-implement source code originally written for a single processor so that it runs on more than one core. Since it is a serious investment, it is important to…
After Amdahl's trailblazing work, many other authors proposed analytical speedup models but none have considered the limiting effect of the memory wall. These models exploited aspects such as problem-size variation, memory size,…
The universal scalability law (USL) is an analytic model used to quantify application scaling. It is universal because it subsumes Amdahl's law and Gustafson linearized scaling as special cases. Using simulation, we show: (i) that the USL…
The longitudinal tracking engine of the particle accelerator simulation application PyHEADTAIL shows strong potential for parallelisation. For basic beam circulation, the tracking functionality with the leap-frog algorithm is extracted and…
The multiprocessor effect refers to the loss of computing cycles due to processing overhead. Amdahl's law and the Multiprocessing Factor (MPF) are two scaling models used in industry and academia for estimating multiprocessor capacity in…
The paper highlights that the cooperation of the components of the computing systems receives even more focus in the coming age of exascale computing. It discovers that inherent performance limitations exist and identifies the major…
Several works have shown linear speedup is achieved by an asynchronous parallel implementation of stochastic coordinate descent so long as there is not too much parallelism. More specifically, it is known that if all updates are of similar…
Two time scale stochastic approximation algorithms emulate singularly perturbed deterministic differential equations in a certain limiting sense, i.e., the interpolated iterates on each time scale approach certain differential equations in…