Related papers: MPI Implementation Profiling for Better Applicatio…

200 papers

The progression of communication in the Message Passing Interface (MPI) is not well defined, yet it is critical for application performance, particularly in achieving effective computation and communication overlap. The opaque nature of MPI…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-07-16 Hui Zhou, Robert Latham, Ken Raffenetti, Yanfei Guo, Rajeev Thakur

The Message Passing Interface (MPI) is the most commonly used application programming interface for process communication on current large-scale parallel systems. Due to the scale and complexity of modern parallel architectures, it is…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-09-05 Sascha Hunold, Alexandra Carpen-Amarie, Felix Donatus Lübbe, Jesper Larsson Träff

The large variety of production implementations of the message passing interface (MPI) each provide unique and varying underlying algorithms. Each emerging supercomputer supports one or a small number of system MPI installations, tuned for…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-09-15 Amanda Bienz, Derek Schafer, Anthony Skjellum

The Message Passing Interface (MPI) is the prevalent programming model used on today's supercomputers. Therefore, MPI library developers are looking for the best possible performance (shortest run-time) of individual MPI functions across…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-05-30 Sascha Hunold, Alexandra Carpen-Amarie

Hybrid MPI+threads programming is gaining prominence, but, in practice, applications perform slower with it compared to the MPI everywhere model. The most critical challenge to the parallel efficiency of MPI+threads applications is slow…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-06-30 Rohit Zambre, Aparna Chandramowlishwaran

MPI is the most widely used interface for high-performance computing (HPC) workloads. Its success lies in its embrace of libraries and ability to evolve while maintaining backward compatibility for older codes, enabling them to run on new…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-08-23 Jeff R. Hammond, Lisandro Dalcin, Erik Schnetter, Marc Pérache, Jean-Baptiste Besnard, Jed Brown, Gonzalo Brito Gadeschi, Joseph Schuchart, Simon Byrne, Hui Zhou

The increasing complexity of HPC architectures and the growing adoption of irregular scientific algorithms demand efficient support for asynchronous, multithreaded communication. This need is especially pronounced with Asynchronous…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-08-27 Jiakun Yan, Marc Snir, Yanfei Guo

Profiling techniques are used extensively at different parts of the computing stack to achieve many goals. One major goal is to make a piece of software execute more efficiently on a specific hardware platform, where efficiency spans…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-11-07 Chris Quackenbush, Mohamed Zahran

As HPC system architectures and the applications running on them continue to evolve, the MPI standard itself must evolve. The trend in current and future HPC systems toward powerful nodes with multiple CPU cores and multiple GPU…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-02-20 Hui Zhou, Ken Raffenetti, Yanfei Guo, Thomas Gillis, Robert Latham, Rajeev Thakur

Understanding and visualizing the full-stack performance trade-offs and interplay between HPC applications, MPI libraries, the communication fabric, and the file system is a challenging endeavor. Designing a holistic profiling and…

Graphics · Computer Science 2021-09-20 Pouya Kousha, Quentin Anthony, Hari Subramoni, Dhabaleswar K. Panda

The Message Passing Interface (MPI) has been extremely successful as a portable way to program high-performance parallel computers. This success has occurred in spite of the view of many that message passing is difficult and that other…

Distributed, Parallel, and Cluster Computing · Computer Science 2007-05-23 William D. Gropp

Offload of MPI collectives to network devices, e.g., NICs and switches, is being implemented as an effective mechanism to improve application performance by reducing inter- and intra-node communication and bypassing MPI software layers.…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-06-01 Pouya Haghi, Ryan Marshall, Po Hao Chen, Anthony Skjellum, Martin Herbordt

Composability is one of seven reasons for the long-standing and continuing success of MPI. Extending MPI by composing its operations with user-level operations provides useful integration with the progress engine and completion notification…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-09-27 Derek Schafer, Sheikh Ghafoor, Daniel Holmes, Martin Ruefenacht, Anthony Skjellum

The current trend of multicore architectures on shared memory systems underscores the need for parallelism. While there are several programming models to express parallelism, the thread programming model has become a standard to support these system…

Distributed, Parallel, and Cluster Computing · Computer Science 2010-12-13 D. T. Hasta, A. B. Mutiara

Comprehending the performance bottlenecks at the core of the intricate hardware-software interactions exhibited by highly parallel programs on HPC clusters is crucial. This paper sheds light on the issue of automatically asynchronous MPI…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-09-06 Ayesha Afzal, Georg Hager, Stefano Markidis, Gerhard Wellein

Message Passing Interface (MPI) is widely used to implement parallel programs. Although Windows-based architectures provide the facilities of parallel execution and multi-threading, little attention has been focused on using MPI on these…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-05-31 Alaa Ismail Elnashar

In this paper, we detail how two types of distributed coordinator election algorithms can be compared in terms of performance based on an evaluation on the High Performance Computing (HPC) infrastructure. An experimental approach based on…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-11-09 Filip De Turck

Data streams are a sequence of data flowing between source and destination processes. Streaming is widely used for signal, image and video processing for its efficiency in pipelining and effectiveness in reducing demand for memory. The goal…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-08-07 Ivy Bo Peng, Stefano Markidis, Roberto Gioiosa, Gokcen Kestor, Erwin Laure

Asynchronous programming models (APM) are gaining more and more traction, allowing applications to expose the available concurrency to a runtime system tasked with coordinating the execution. While MPI has long provided support for…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-12-23 Joseph Schuchart, Philipp Samfass, Christoph Niethammer, José Gracia, George Bosilca

The Message Passing Interface (MPI) is the de facto standard message-passing infrastructure for developing parallel applications. Two decades after the first version of the library specification, MPI-based applications are nowadays…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-12-11 Eduardo R. B. Marques, Francisco Martins, Vasco T. Vasconcelos, Nicholas Ng, Nuno Martins