Related papers: Efficient Multidimensional Data Redistribution for…

ReSHAPE: A Framework for Dynamic Resizing and Scheduling of Homogeneous Applications in a Parallel Environment

Applications in science and engineering often require huge computational resources for solving problems within a reasonable time frame. Parallel supercomputers provide the computational infrastructure for solving such problems. A…

Distributed, Parallel, and Cluster Computing · Computer Science 2007-05-23 Rajesh Sudarsan , Calvin J. Ribbens

ReStore: In-Memory REplicated STORagE for Rapid Recovery in Fault-Tolerant Algorithms

Fault-tolerant distributed applications require mechanisms to recover data lost via a process failure. On modern cluster systems it is typically impractical to request replacement resources after such a failure. Therefore, applications have…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-01-26 Lukas Hübner , Demian Hespe , Peter Sanders , Alexandros Stamatakis

Memory-efficient array redistribution through portable collective communication

Modern large-scale deep learning workloads highlight the need for parallel execution across many devices in order to fit model data into hardware accelerator memories. In these settings, array redistribution may be required during a…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-11-29 Norman A. Rink , Adam Paszke , Dimitrios Vytiniotis , Georg Stefan Schmid

Optimizing Distributed Protocols with Query Rewrites [Technical Report]

Distributed protocols such as 2PC and Paxos lie at the core of many systems in the cloud, but standard implementations do not scale. New scalable distributed protocols are developed through careful analysis and rewrites, but this process is…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-05-08 David Chu , Rithvik Panchapakesan , Shadaj Laddad , Lucky Katahanas , Chris Liu , Kaushik Shivakumar , Natacha Crooks , Joseph M. Hellerstein , Heidi Howard

How to Optimally Allocate Resources for Coded Distributed Computing?

Today's data centers have an abundance of computing resources, hosting server clusters consisting of as many as tens or hundreds of thousands of machines. To execute a complex computing task over a data center, it is natural to distribute…

Information Theory · Computer Science 2017-02-24 Qian Yu , Songze Li , Mohammad Ali Maddah-Ali , A. Salman Avestimehr

Resolvable Designs for Speeding up Distributed Computing

Distributed computing frameworks such as MapReduce are often used to process large computational jobs. They operate by partitioning each job into smaller tasks executed on different servers. The servers also need to exchange intermediate…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-04-20 Konstantinos Konstantinidis , Aditya Ramamoorthy

Multi-Resource Parallel Query Scheduling and Optimization

Scheduling query execution plans is a particularly complex problem in shared-nothing parallel systems, where each site consists of a collection of local time-shared (e.g., CPU(s) or disk(s)) and space-shared (e.g., memory) resources and…

Databases · Computer Science 2014-04-01 Minos Garofalakis , Yannis Ioannidis

ReStorEdge: An edge computing system with reuse semantics

This paper investigates an edge computing system where requests are processed by a set of replicated edge servers. We investigate a class of applications where similar queries produce identical results. To reduce processing overhead on the…

Emerging Technologies · Computer Science 2024-05-29 Adrian-Cristian Nicolaescu , Spyridon Mastorakis , Md Washik Al Azad , David Griffin , Miguel Rio

Migrate when necessary: toward partitioned reclaiming for soft real-time tasks

This paper presents a new strategy for scheduling soft real-time tasks on multiple identical cores. The proposed approach is based on partitioned CPU reservations and it uses a reclaiming mechanism to reduce the number of missed deadlines.…

Operating Systems · Computer Science 2019-05-01 Houssam Eddine Zahaf , Giuseppe Lipari , Luca Abeni , Houssam-Eddine Zahaf

Adaptive Job Scheduling in Quantum Clouds Using Reinforcement Learning

Present-day quantum systems face critical bottlenecks, including limited qubit counts, brief coherence intervals, and high susceptibility to errors-all of which obstruct the execution of large and complex circuits. The advancement of…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-06-13 Waylon Luo , Jiapeng Zhao , Tong Zhan , Qiang Guan

Automatic Operator-level Parallelism Planning for Distributed Deep Learning -- A Mixed-Integer Programming Approach

As the artificial intelligence community advances into the era of large models with billions of parameters, distributed training and inference have become essential. While various parallelism strategies-data, model, sequence, and…

Machine Learning · Computer Science 2025-03-13 Ruifeng She , Bowen Pang , Kai Li , Zehua Liu , Tao Zhong

Efficiently Scheduling Parallel DAG Tasks on Identical Multiprocessors

Parallel real-time embedded applications can be modelled as directed acyclic graphs (DAGs) whose nodes model subtasks and whose edges model precedence constraints among subtasks. Efficiently scheduling such parallel tasks can be challenging…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-10-24 Shardul Lendve , Konstantinos Bletsas , Pedro F. Souto

The Reverse Cuthill-McKee Algorithm in Distributed-Memory

Ordering vertices of a graph is key to minimize fill-in and data structure size in sparse direct solvers, maximize locality in iterative solvers, and improve performance in graph algorithms. Except for naturally parallelizable ordering…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-10-27 Ariful Azad , Mathias Jacquelin , Aydin Buluc , Esmond G. Ng

Efficient Distributed-Memory Parallel Matrix-Vector Multiplication with Wide or Tall Unstructured Sparse Matrices

This paper presents an efficient technique for matrix-vector and vector-transpose-matrix multiplication in distributed-memory parallel computing environments, where the matrices are unstructured, sparse, and have a substantially larger…

Mathematical Software · Computer Science 2018-12-04 Jonathan Eckstein , Gyorgy Matyasfalvi

A distributed-memory hierarchical solver for general sparse linear systems

We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it…

Numerical Analysis · Mathematics 2017-12-21 Chao Chen , Hadi Pouransari , Sivasankaran Rajamanickam , Erik G. Boman , Eric Darve

Analysis of Distributed Algorithms for Big-data

The parallel and distributed processing are becoming de facto industry standard, and a large part of the current research is targeted on how to make computing scalable and distributed, dynamically, without allocating the resources on…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-04-10 Rajendra Purohit , K R Chowdhary , S D Purohit

Distributed Caching for Complex Querying of Raw Arrays

As applications continue to generate multi-dimensional data at exponentially increasing rates, fast analytics to extract meaningful results is becoming extremely important. The database community has developed array databases that alleviate…

Databases · Computer Science 2018-03-19 Weijie Zhao , Florin Rusu , Bin Dong , Kesheng Wu , Anna Y. Q. Ho , Peter Nugent

A 2D Parallel Triangle Counting Algorithm for Distributed-Memory Architectures

Triangle counting is a fundamental graph analytic operation that is used extensively in network science and graph mining. As the size of the graphs that needs to be analyzed continues to grow, there is a requirement in developing scalable…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-07-24 Ancy Sarah Tom , George Karypis

Parallel and sequential reclaiming in multicore real-time global scheduling

When integrating hard, soft and non-real-time tasks in general purpose operating systems, it is necessary to provide temporal isolation so that the timing properties of one task do not depend on the behaviour of the others. However, strict…

Operating Systems · Computer Science 2016-03-11 Luca Abeni , Giuseppe Lipari , Andrea Parri , Youcheng Sun

Recursive Matrix Algorithms in Commutative Domain for Cluster with Distributed Memory

We give an overview of the theoretical results for matrix block-recursive algorithms in commutative domains and present the results of experiments that we conducted with new parallel programs based on these algorithms on a supercomputer…

Symbolic Computation · Computer Science 2019-03-12 Gennadi Malaschonok , Evgeni Ilchenko