Related papers: Efficient Parallel Computation of the Estimated Co…

Efficient Parallelization of Short-Range Molecular Dynamics Simulations on Many-Core Systems

This article introduces a highly parallel algorithm for molecular dynamics simulations with short-range forces on single node multi- and many-core systems. The algorithm is designed to achieve high parallel speedups for strongly…

Computational Physics · Physics 2013-11-20 R. Meyer

Accelerating Matrix Multiplication: A Performance Comparison Between Multi-Core CPU and GPU

Matrix multiplication is a foundational operation in scientific computing and machine learning, yet its computational complexity makes it a significant bottleneck for large-scale applications. The shift to parallel architectures, primarily…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-30 Mufakir Qamar Ansari, Mudabir Qamar Ansari

Automatic Parallelization of Sequential Programs

Prior work on Automatically Scalable Computation (ASC) suggests that it is possible to parallelize sequential computation by building a model of whole-program execution, using that model to predict future computations, and then…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-09-21 Peter Kraft, Amos Waterland, Daniel Y Fu, Anitha Gollamudi, Shai Szulanski, Margo Seltzer

Emulating a large memory with a collection of small ones

Sequential computation is well understood but does not scale well with current technology. Within the next decade, systems will contain large numbers of processors with potentially thousands of processors per chip. Despite this, many…

Hardware Architecture · Computer Science 2015-11-17 James Hanlon

Parallel Tiled QR Factorization for Multicore Architectures

As multicore systems continue to gain ground in the High Performance Computing world, linear algebra algorithms have to be reformulated or new algorithms have to be developed in order to take advantage of the architectural features on these…

Numerical Analysis · Mathematics 2008-08-12 Alfredo Buttari, Julien Langou, Jakub Kurzak, Jack Dongarra

Parallelizing Program Execution on Distributed Quantum Systems via Compiler/Hardware Co-Design

As quantum computers continue to improve and support larger, more complex computations, smart control hardware and compilers are needed to efficiently leverage the capabilities of these systems. This paper introduces a novel approach to…

Quantum Physics · Physics 2025-11-19 Folkert de Ronde, Alexander Knapen, Stephan Wong, Sebastian Feld

Bayesian Quantum Multiphase Estimation Algorithm

Quantum phase estimation (QPE) is the key subroutine of several quantum computing algorithms as well as a central ingredient in quantum computational chemistry and quantum simulation. While QPE strategies have focused on the estimation of a…

Quantum Physics · Physics 2021-07-26 Valentin Gebhart, Augusto Smerzi, Luca Pezzè

PaREM: A Novel Approach for Parallel Regular Expression Matching

Regular expression matching is essential for many applications, such as finding patterns in text, exploring substrings in large DNA sequences, or lexical analysis. However, sequential regular expression matching may be time-prohibitive for…

Formal Languages and Automata Theory · Computer Science 2015-06-30 Suejb Memeti, Sabri Pllana

A Scalable Shared-Memory Parallel Simplex for Large-Scale Linear Programming

The Simplex tableau has been broadly used and investigated in the industry and academia. With the advent of the big data era, ever larger problems are posed to be solved in ever larger machines whose architecture type did not exist in the…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-05-29 Demetrios Coutinho, Felipe O. Lins e Silva, Daniel Aloise, Samuel Xavier-de-Souza

Parallel Computation of functions of matrices and their action on vectors

We present a novel class of methods to compute functions of matrices or their action on vectors that are suitable for parallel programming. Solving appropriate simple linear systems of equations in parallel (or computing the inverse of…

Numerical Analysis · Mathematics 2022-10-10 Sergio Blanes

Highly Parallel Sparse Matrix-Matrix Multiplication

Generalized sparse matrix-matrix multiplication is a key primitive for many high performance graph algorithms as well as some linear solvers such as multigrid. We present the first parallel algorithms that achieve increasing speedups for an…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-08-09 Aydın Buluç, John R. Gilbert

Parallel inversion of huge covariance matrices

An extremely common bottleneck encountered in statistical learning algorithms is inversion of huge covariance matrices, examples being in evaluating Gaussian likelihoods for a large number of data points. We propose general parallel…

Methodology · Statistics 2013-12-09 Anjishnu Banerjee, Joshua Vogelstein, David Dunson

Merge Path - A Visually Intuitive Approach to Parallel Merging

Merging two sorted arrays is a prominent building block for sorting and other functions. Its efficient parallelization requires balancing the load among compute cores, minimizing the extra work brought about by parallelization, and…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-06-23 Oded Green, Saher Odeh, Yitzhak Birk

Parallel training of linear models without compromising convergence

In this paper we analyze, evaluate, and improve the performance of training generalized linear models on modern CPUs. We start with a state-of-the-art asynchronous parallel training algorithm, identify system-level performance bottlenecks,…

Machine Learning · Computer Science 2018-12-20 Nikolas Ioannou, Celestine Dünner, Kornilios Kourtis, Thomas Parnell

Etude de la Distribution de Calculs Creux sur une Grappe Multi-coeurs

Nowadays, high performance computing is becoming more and more important in different fields of research and industry, such as medical imaging and diagnostics, mathematics as well as oil exploration. It refers to intensive computing in some…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-02-07 Mouadh Ayachi

Parallelisation of a Common Changepoint Detection Method

In recent years, various means of efficiently detecting changepoints in the univariate setting have been proposed, with one popular approach involving minimising a penalised cost function using dynamic programming. In some situations, these…

Methodology · Statistics 2018-10-09 S. O. Tickle, I. A. Eckley, P. Fearnhead, K. Haynes

Fast Parallel Algorithms for Enumeration of Simple, Temporal, and Hop-Constrained Cycles

Cycles are one of the fundamental subgraph patterns and being able to enumerate them in graphs enables important applications in a wide variety of fields, including finance, biology, chemistry, and network science. However, to enable cycle…

Data Structures and Algorithms · Computer Science 2023-07-18 Jovan Blanuša, Kubilay Atasu, Paolo Ienne

Effective Parallelisation for Machine Learning

We present a novel parallelisation scheme that simplifies the adaptation of learning algorithms to growing amounts of data as well as growing needs for accurate and confident predictions in critical applications. In contrast to other…

Machine Learning · Computer Science 2018-10-09 Michael Kamp, Mario Boley, Olana Missura, Thomas Gärtner

Scalable Fine-Grained Parallel Cycle Enumeration Algorithms

Enumerating simple cycles has important applications in computational biology, network science, and financial crime analysis. In this work, we focus on parallelising the state-of-the-art simple cycle enumeration algorithms by Johnson and…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-06-01 Jovan Blanuša, Paolo Ienne, Kubilay Atasu

Parallel Adaptive Sampling with almost no Synchronization

Approximation via sampling is a widespread technique whenever exact solutions are too expensive. In this paper, we present techniques for an efficient parallelization of adaptive (a.k.a. progressive) sampling algorithms on multi-threaded…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-03-25 Alexander van der Grinten, Eugenio Angriman, Henning Meyerhenke