Related papers: Mapping Matters: Application Process Mapping on 3-…

Comparison of Three Job Mapping Algorithms for Supercomputer Resource Managers

The performance of a supercomputer depends on the quality of its resource manager, one of whose functions is the assignment of jobs to the nodes of clusters or MPP computers. Parts of parallel programs interact with each other with different intensity, and…

Performance · Computer Science 2022-12-26 A. V. Baranov, E. A. Kiselev, B. M. Shabanov, A. A. Sorokin, P. N. Telegin

A Novel Process Mapping Strategy in Clustered Environments

Nowadays, the number of processing cores available within the computing nodes of recent clustered environments is growing rapidly. Despite this trend, the number of available network interfaces in such computing…

Distributed, Parallel, and Cluster Computing · Computer Science 2012-07-13 Mohsen Soryani, Morteza Analoui, Ghobad Zarrinchian

Towards a decentralized algorithm for mapping network and computational resources for distributed data-flow computations

Several high-throughput distributed data-processing applications require multi-hop processing of streams of data. These applications include continual processing on data streams originating from a network of sensors, composing a multimedia…

Distributed, Parallel, and Cluster Computing · Computer Science 2009-03-26 Shah Asaduzzaman, Muthucumaru Maheswaran

Shared-Memory Hierarchical Process Mapping

Modern large-scale scientific applications consist of thousands to millions of individual tasks. These tasks involve not only computation but also communication with one another. Typically, the communication pattern between tasks is sparse…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-04-03 Christian Schulz, Henning Woydt

Resource Allocation Strategies for In-Network Stream Processing

In this paper we consider the operator mapping problem for in-network stream processing applications. In-network stream processing consists in applying a tree of operators in steady-state to multiple data objects that are continually…

Distributed, Parallel, and Cluster Computing · Computer Science 2008-07-11 Anne Benoit, Henri Casanova, Veronika Rehn-Sonigo, Yves Robert

Improving the Performance and Resilience of MPI Parallel Jobs with Topology and Fault-Aware Process Placement

HPC systems keep growing in size to meet the ever-increasing demand for performance and computational resources. Apart from increased performance, large scale systems face two challenges that hinder further growth: energy efficiency and…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-01-06 Ioannis Vardas, Manolis Ploumidis, Manolis Marazakis

A Design-Time/Run-Time Application Mapping Methodology for Predictable Execution Time in MPSoCs

Executing multiple applications on a single MPSoC brings the major challenge of satisfying multiple quality requirements regarding real-time, energy, etc. Hybrid application mapping denotes the combination of design-time analysis with…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-11-22 Andreas Weichslgartner, Stefan Wildermann, Deepak Gangadharan, Michael Glaß, Jürgen Teich

Optimizing Latency and Reliability of Pipeline Workflow Applications

Mapping applications onto heterogeneous platforms is a difficult challenge, even for simple application patterns such as pipeline graphs. The problem is even more complex when processors are subject to failure during the execution of the…

Distributed, Parallel, and Cluster Computing · Computer Science 2008-03-26 Anne Benoit, Veronika Rehn-Sonigo, Yves Robert

DFModel: Design Space Optimization of Large-Scale Systems Exploiting Dataflow Mappings

We propose DFModel, a modeling framework for mapping dataflow computation graphs onto large-scale systems. Mapping a workload to a system requires optimizing dataflow mappings at various levels, including the inter-chip (between chips)…

Hardware Architecture · Computer Science 2024-12-24 Sho Ko, Nathan Zhang, Olivia Hsu, Ardavan Pedram, Kunle Olukotun

Better Process Mapping and Sparse Quadratic Assignment

Communication and topology aware process mapping is a powerful approach to reduce communication time in parallel applications with known communication patterns on large, distributed memory systems. We address the problem as a quadratic…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-07-23 Christian Schulz, Jesper Larsson Träff, Konrad von Kirchbach

Algorithms for Mapping Parallel Processes onto Grid and Torus Architectures

Static mapping is the assignment of parallel processes to the processing elements (PEs) of a parallel system, where the assignment does not change during the application's lifetime. In our scenario we model an application's computations and…

Data Structures and Algorithms · Computer Science 2015-03-03 Roland Glantz, Henning Meyerhenke, Alexander Noe

A survey on scheduling and mapping techniques in 3D Network-on-chip

Network-on-Chips (NoCs) have been widely employed in the design of multiprocessor system-on-chips (MPSoCs) as a scalable communication solution. NoCs enable communications between on-chip Intellectual Property (IP) cores and allow those…

Hardware Architecture · Computer Science 2022-11-07 Simran Preet Kaur, Manojit Ghose, Ananya Pathak, Rutuja Patole

Measuring and Understanding Throughput of Network Topologies

High throughput is of particular interest in data center and HPC networks. Although myriad network topologies have been proposed, a broad head-to-head comparison across topologies and across traffic patterns is absent, and the right way to…

Networking and Internet Architecture · Computer Science 2016-11-16 Sangeetha Abdu Jyothi, Ankit Singla, P. Brighten Godfrey, Alexandra Kolla

A Heuristic Approach to Protocol Tuning for High Performance Data Transfers

Obtaining optimal data transfer performance is of utmost importance to today's data-intensive distributed applications and wide-area data replication services. Doing so necessitates effectively utilizing available network bandwidth and…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-08-21 Engin Arslan, Tevfik Kosar

GPU-Accelerated Algorithms for Process Mapping

Process mapping asks to assign vertices of a task graph to processing elements of a supercomputer such that the computational workload is balanced while the communication cost is minimized. Motivated by the recent success of GPU-based graph…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-16 Petr Samoldekin, Christian Schulz, Henning Woydt

PMEvo: Portable Inference of Port Mappings for Out-of-Order Processors by Evolutionary Optimization

Achieving peak performance in a computer system requires optimizations in every layer of the system, be it hardware or software. A detailed understanding of the underlying hardware, and especially the processor, is crucial to optimize…

Hardware Architecture · Computer Science 2020-04-22 Fabian Ritter, Sebastian Hack

Parallel Mapper

The construction of Mapper has emerged in the last decade as a powerful and effective topological data analysis tool that approximates and generalizes other topological summaries, such as the Reeb graph, the contour tree, split, and joint…

Computer Vision and Pattern Recognition · Computer Science 2020-09-15 Mustafa Hajij, Basem Assiri, Paul Rosen

Co-Scheduling Algorithms for High-Throughput Workload Execution

This paper investigates co-scheduling algorithms for processing a set of parallel applications. Instead of executing each application one by one, using a maximum degree of parallelism for each of them, we aim at scheduling several…

Data Structures and Algorithms · Computer Science 2013-05-01 Guillaume Aupy, Manu Shantharam, Anne Benoit, Yves Robert, Padma Raghavan

Energy-efficient Runtime Resource Management for Adaptable Multi-application Mapping

Modern embedded computing platforms consist of a large number of heterogeneous resources, which allows executing multiple applications on a single device. The number of running applications on the system varies with time and so does the…

Systems and Control · Electrical Eng. & Systems 2020-02-19 Robert Khasanov, Jeronimo Castrillon

High-Quality Hierarchical Process Mapping

Partitioning graphs into blocks of roughly equal size such that few edges run between blocks is a frequently needed operation when processing graphs on a parallel computer. When a topology of a distributed system is known an important task…

Data Structures and Algorithms · Computer Science 2020-01-23 Marcelo Fonseca Faraj, Alexander van der Grinten, Henning Meyerhenke, Jesper Larsson Träff, Christian Schulz