Related papers: Introducing Molly: Distributed Memory Parallelizat…

200 papers

Molly is a program that compiles cryptographic protocol roles written in a high-level notation into straight-line programs in an intermediate-level imperative language, suitable for implementation in a conventional programming language. We…

Cryptography and Security · Computer Science 2023-11-27 Daniel J. Dougherty , Joshua D. Guttman

We present a set of programming tools (classes and functions written in C++ and based on Message Passing Interface) for fast development of generic parallel (and non-parallel) lattice simulations. They are collectively called MDP 1.2. These…

High Energy Physics - Lattice · Physics 2009-10-31 Massimo Di Pierro

MLI is an Application Programming Interface designed to address the challenges of building Machine Learning algorithms in a distributed setting based on data-centric computing. Its primary goal is to simplify the development of…

Matrix Distributed Processing (MDP) is a C++ library for fast development of efficient parallel algorithms. It constitutes the core of FermiQCD. MDP enables programmers to focus on algorithms, while parallelization is dealt with…

High Energy Physics - Lattice · Physics 2007-05-23 Massimo Di Pierro

There are numerous examples of problems in symbolic algebra in which the required storage grows far beyond the limitations even of the distributed RAM of a cluster. Often this limitation determines how large a problem one can solve in…

Distributed, Parallel, and Cluster Computing · Computer Science 2010-06-11 Daniel Kunkle

We introduce Model-Distributed Inference for Large-Language Models (MDI-LLM), a novel framework designed to facilitate the deployment of state-of-the-art large-language models (LLMs) across low-power devices at the edge. This is…

Machine Learning · Computer Science 2025-05-27 Davide Macario , Hulya Seferoglu , Erdem Koyuncu

Coordination protocols help programmers of distributed systems reason about the effects of transactions on the state of the system, but they're not cheap. Coordination protocols may involve multiple rounds of communication, which can hurt…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-09-21 Rolando Garcia , Giulia Guidi

Message Passing Interface (MPI) is widely used to implement parallel programs. Although Windows-based architectures provide the facilities of parallel execution and multi-threading, little attention has been focused on using MPI on these…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-05-31 Alaa Ismail Elnashar

We present Matrix Distributed Processing, a C++ library for fast development of efficient parallel algorithms. MDP is based on MPI and consists of a collection of C++ classes and functions such as lattice, site and field. Once an algorithm…

Distributed, Parallel, and Cluster Computing · Computer Science 2007-05-23 Massimo Di Pierro

We present POLO, a C++ library for large-scale parallel optimization research that emphasizes ease-of-use, flexibility and efficiency in algorithm design. It uses multiple inheritance and template programming to decompose algorithms into…

Optimization and Control · Mathematics 2018-10-09 Arda Aytekin , Martin Biel , Mikael Johansson

Machine Learning and Data Mining (MLDM) algorithms are becoming increasingly important in analyzing large volumes of data generated by simulations, experiments and mobile devices. With increasing data volume, distributed memory systems (such…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-08-21 Abhinav Vishnu , Charles Siegel , Jeffrey Daily

As quantum computers continue to improve and support larger, more complex computations, smart control hardware and compilers are needed to efficiently leverage the capabilities of these systems. This paper introduces a novel approach to…

Quantum Physics · Physics 2025-11-19 Folkert de Ronde , Alexander Knapen , Stephan Wong , Sebastian Feld

With the rapid growth of large language models (LLMs), a wide range of methods have been developed to distribute computation and memory across hardware devices for efficient training and inference. While existing surveys provide descriptive…

Machine Learning · Computer Science 2026-02-11 Hossam Amer , Rezaul Karim , Ali Pourranjbar , Weiwei Zhang , Walid Ahmed , Boxing Chen

While modern parallel computing systems provide high performance resources, utilizing them to the highest extent requires advanced programming expertise. Programming for parallel computing systems is much more difficult than programming for…

Programming Languages · Computer Science 2017-04-06 Adrian Calvo Chozas , Suejb Memeti , Sabri Pllana

This paper presents some of our findings on the scalability of parallel 3D mesh generation on distributed memory machines. The primary objective of this study was to evaluate a distributed memory approach for implementing a 3D parallel…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-08-25 Polykarpos Thomadakis , Nikos Chrisochoides

Many reasoning, planning, and problem-solving tasks share an intrinsic algorithmic nature: correctly simulating each step is a sufficient condition to solve them correctly. We collect pairs of naturalistic and synthetic reasoning tasks to…

In this paper we describe, implement, and test the performance of distributed memory simulations of quantum circuits on the MSU Laconia Top500 supercomputer. Using OpenMP and MPI hybrid parallelization, we first use a distributed…

Quantum Physics · Physics 2018-06-25 Ryan LaRose

As the artificial intelligence community advances into the era of large models with billions of parameters, distributed training and inference have become essential. While various parallelism strategies (data, model, sequence, and…

Machine Learning · Computer Science 2025-03-13 Ruifeng She , Bowen Pang , Kai Li , Zehua Liu , Tao Zhong

What is a systematic way to efficiently apply a wide spectrum of advanced ML programs to industrial scale problems, using Big Models (up to 100s of billions of parameters) on Big Data (up to terabytes or petabytes)? Modern parallelization…

We provide a mathematically proven parallelization scheme for particle methods on distributed-memory computer systems. Particle methods are a versatile and widely used class of algorithms for computer simulations and numerical predictions…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-01-05 Johannes Pahlke , Ivo F. Sbalzarini