Related papers: A Note on Parallel Algorithmic Speedup Bounds
In this work I present a generalization of Amdahl's law on the limits of a parallel implementation with many processors. In particular I establish some mathematical relations involving the number of processors and the dimension of the…
With the spread of multi- and many-core processors, an increasingly typical task is to re-implement source code originally written for a single processor so that it runs on more than one core. Since this is a serious investment, it is important to…
Several works have shown linear speedup is achieved by an asynchronous parallel implementation of stochastic coordinate descent so long as there is not too much parallelism. More specifically, it is known that if all updates are of similar…
After Amdahl's trailblazing work, many other authors proposed analytical speedup models but none have considered the limiting effect of the memory wall. These models exploited aspects such as problem-size variation, memory size,…
We seek tight bounds on the viable parallelism in asynchronous implementations of coordinate descent that achieve linear speedup. We focus on asynchronous coordinate descent (ACD) algorithms on convex functions which consist of the sum of…
The problem of learning parallel computer performance is investigated in the context of multicore processors. Given a fixed workload, the effect of varying system configuration on performance is sought. Conventionally, the performance…
In high performance computing environments, we observe an ongoing increase in the available numbers of cores. This development calls for re-emphasizing performance (scalability) analysis and speedup laws as suggested in the literature…
The author's research of topologies of parallel computing systems and the tasks solved with them, including the corresponding tools of their modeling, is summarized in the present paper. The original topological model of such systems is…
Evaluating how well a whole system or set of subsystems performs is one of the primary objectives of performance testing. We can tell via performance assessment if the architecture implementation meets the design objectives. Performance…
Parametric linear programming is central in polyhedral computations and in certain control applications. We propose a task-based scheme for parallelizing it, with quasi-linear speedup over large problems.
Parametric linear programming is a central operation for polyhedral computations, as well as in certain control applications. Here we propose a task-based scheme for parallelizing it, with quasi-linear speedup over large problems. This type…
To satisfy the increasing performance needs of modern cyber-physical systems, multiprocessor architectures are increasingly utilized. To efficiently exploit their potential parallelism in hard real-time systems, appropriate task models and…
The computing performance today is developing mainly using parallelized sequential computing, in many forms. The paper scrutinizes whether the performance of that type of computing has an upper limit. The simple considerations point out…
Parallel programming models can encourage performance portability by moving the responsibility for work assignment and data distribution from the programmer to a runtime system. However, analyzing the resulting implicit memory allocations,…
The modification of Amdahl's law for the case of increment of processor elements in a computer system is considered. The coefficient $k$ linking accelerations of parallel and parallel specialized computer systems is determined. The limiting…
The paper highlights that the cooperation of the components of the computing systems receives even more focus in the coming age of exascale computing. It discovers that inherent performance limitations exist and identifies the major…
In competitive parallel computing, the identical copies of a code in a phase of a sequential program are assigned to processor cores and the result of the fastest core is adopted. In the literature, it is reported that a superlinear speedup…
This work analyses the effects of sequential-to-parallel synchronization and inter-core communication on multicore performance, speedup and scaling. A modification of Amdahl's law is formulated, to reflect the finding that parallel speedup is…
Classical Amdahl's Law conceptualized the limit of speedup for an era of fixed serial-parallel decomposition and homogeneous replication. Modern heterogeneous systems need a different conceptual framework: constrained resources must be…
The need for scalable numerical solutions has motivated the development of asynchronous parallel algorithms, where a set of nodes run in parallel with little or no synchronization, thus computing with delayed information. This paper studies…