Related papers: Debugging Tool for Localizing Faulty Processes in …
Message Passing Interface (MPI) is the most commonly used paradigm in writing parallel programs since it can be employed not only within a single processing node but also across several connected ones. Data flow analysis concepts,…
The Message Passing Interface specification (MPI) defines a portable message-passing API used to program parallel computers. MPI programs manifest a number of challenges on what concerns correctness: sent and expected values in…
Message Passing Interfaces (MPI) plays an important role in parallel computing. Many parallel applications are implemented as MPI programs. The existing methods of bug detection for MPI programs have the shortage of providing both input and…
Debugging parallel and distributed programs is a difficult activitiy due to the multiplicity of sequential bugs, the existence of malign effects like race conditions and deadlocks, and the huge amounts of data that have to be processed.…
Software debugging is a very time-consuming process, which is even worse for multi-threaded programs, due to the non-deterministic behavior of thread-scheduling algorithms. However, the debugging time may be greatly reduced, if automatic…
The Message Passing Interface (MPI) is the de facto standard message-passing infrastructure for developing parallel applications. Two decades after the first version of the library specification, MPI-based applications are nowadays…
Message Passing Interface (MPI) is a foundational technology in high-performance computing (HPC), widely used for large-scale simulations and distributed training (e.g., in machine learning frameworks such as PyTorch and TensorFlow).…
The Message Passing Interface (MPI) is the most commonly used application programming interface for process communication on current large-scale parallel systems. Due to the scale and complexity of modern parallel architectures, it is…
Message passing is widely used in industry to develop programs consisting of several distributed communicating components. Developing functionally correct message passing software is very challenging due to the concurrent nature of message…
Unlike traditional programs (such as operating systems or word processors) which have large amounts of code, machine learning tasks use programs with relatively small amounts of code (written in machine learning libraries), but voluminous…
Tracing back the instruction execution sequence to debug a multicore system can be very time-consuming because the relationships of the instructions can be very complex. For instructions that cannot be checked by the environment immediately…
Fault localization is one of the most time-consuming and error-prone parts of software debugging. There are several tools for helping developers in the fault localization process, however, they mostly target programs written in Java and…
The Message Passing Interface (MPI) is widely used in parallel, high-performance programming, yet writing bug-free software that uses MPI remains difficult. We introduce DafnyMPI, a novel, scalable approach to formally verifying MPI…
Program correctness is one of the most difficult challenges in parallel programming. Message Passing Interface MPI is widely used in writing parallel applications. Since MPI is not a compiled language, the programmer will be enfaced with…
Identifying errors in parallel MPI programs is a challenging task. Despite the growing number of verification tools, debugging parallel programs remains a significant challenge. This paper is the first to utilize embedding and deep learning…
A model of MPI synchronization communication programs is presented and its three basic simplified models are also defined. A series of theorems and methods for deciding whether deadlocks will occur among the three models are given and…
Low Density Parity Check (LDPC) codes are linear error correcting codes used in communication systems for Forward Error Correction (FEC). But, intensive computation is required for encoding and decoding of LDPC codes, making it difficult…
Software testing helps developers to identify bugs. However, awareness of bugs is only the first step. Finding and correcting the faulty program components is equally hard and essential for high-quality software. Fault localization…
Program errors can occur in any type of programming, and can manifest in a variety of ways, such as unexpected output, crashes, or performance issues. And program error diagnosis can often be too abstract or technical for developers to…
The validation process for microprocessors is a very complex task that consumes substantial engineering time during the design process. Bugs that degrade overall system performance, without affecting its functional correctness, are…