Related papers: Recent Developments in Iterative Methods for Reduc…
Motivated by large-scale optimization problems arising in the context of machine learning, there have been several advances in the study of asynchronous parallel and distributed optimization methods during the past decade. Asynchronous…
In recent years, various means of efficiently detecting changepoints in the univariate setting have been proposed, with one popular approach involving minimising a penalised cost function using dynamic programming. In some situations, these…
In view of the existing limitations of sequential computing, parallelization has emerged as an alternative in order to improve the speedup of numerical simulations. In the framework of evolutionary problems, space-time parallel methods…
Sorting is one of the most fundamental problems in the field of computer science. With the rapid development of manycore processors, it shows great importance to design efficient parallel sort algorithm on manycore architecture. This paper…
Nowadays, parallel computing is ubiquitous in several application fields, both in engineering and science. The computations rely on the floating-point arithmetic specified by the IEEE754 Standard. In this context, an elementary brick of…
The aim of the paper is to introduce general techniques in order to optimize the parallel execution time of sorting on a distributed architectures with processors of various speeds. Such an application requires a partitioning step. For…
Parallel computing has played an important role in speeding up convex optimization methods for big data analytics and large-scale machine learning (ML). However, the scalability of these optimization methods is inhibited by the cost of…
Sequential computation is well understood but does not scale well with current technology. Within the next decade, systems will contain large numbers of processors with potentially thousands of processors per chip. Despite this, many…
Asynchronous iterations are more and more investigated for both scaling and fault-resilience purpose on high performance computing platforms. While so far, they have been exclusively applied within space domain decomposition frameworks,…
Numerical solution of partial differential equations on parallel computers using domain decomposition usually requires synchronization and communication among the processors. These operations often have a significant overhead in terms of…
Arrival of multicore systems has enforced a new scenario in computing, the parallel and distributed algorithms are fast replacing the older sequential algorithms, with many challenges of these techniques. The distributed algorithms provide…
Recent years have witnessed the surge of asynchronous parallel (async-parallel) iterative algorithms due to problems involving very large-scale data and a large number of decision variables. Because of asynchrony, the iterates are computed…
Parallel dataflow systems are a central part of most analytic pipelines for big data. The iterative nature of many analysis and machine learning algorithms, however, is still a challenge for current systems. While certain types of bulk…
In the past, efforts were taken to improve the performance of a processor via frequency scaling. However, industry has reached the limits of increasing the frequency and therefore concurrent execution of instructions on multiple cores seems…
Analyzing big data in a highly dynamic environment becomes more and more critical because of the increasingly need for end-to-end processing of this data. Modern data flows are quite complex and there are not efficient, cost-based,…
We introduce and analyze different strategies for the parallel-in-time integration method PFASST to recover from hard faults and subsequent data loss. Since PFASST stores solutions at multiple time steps on different processors, information…
Two characteristics that make convex decomposition algorithms attractive are simplicity of operations and generation of parallelizable structures. In principle, these schemes require that all coordinates update at the same time, i.e., they…
Asynchronous iterations arise naturally in parallel computing if one wants to solve large problems with a minimization of the idle times. This paper presents an original model of asynchronous iterations for a time-domain decomposition…
Parallel computing is a standard approach to achieving high-performance computing (HPC). Three commonly used methods to implement parallel computing include: 1) applying multithreading technology on single-core or multi-core CPUs; 2)…
Nowadays, we are to find out solutions to huge computing problems very rapidly. It brings the idea of parallel computing in which several machines or processors work cooperatively for computational tasks. In the past decades, there are a…