Related papers: Partitioning Uncertain Workflows
When partitioning workflows in realistic scenarios, the knowledge of the processing units is often vague or unknown. A naive approach to addressing this issue is to perform many controlled experiments for different workloads, each…
Orchestrating service-oriented workflows is typically based on a design model that routes both data and control through a single point - the centralised workflow engine. This causes scalability problems that include the unnecessary…
A numerical framework based on network partition and operator splitting is developed to solve nonlinear differential equations of large-scale dynamic processes encountered in physics, chemistry and biology. Under the assumption that those…
This paper is about how to partition decision variables while decomposing a large-scale optimization problem for the best performance of distributed solution methods. Solving a large-scale optimization problem sequen- tially can be…
To improve the utility of learning applications and render machine learning solutions feasible for complex applications, a substantial amount of heavy computations is needed. Thus, it is essential to delegate the computations among several…
Partitioning an input graph over a set of workers is a complex operation. Objectives are twofold: split the work evenly, so that every worker gets an equal share, and minimize edge cut to achieve a good work locality (i.e. workers can work…
The rigid gang task model is based on the idea of executing multiple threads simultaneously on a fixed number of processors to increase efficiency and performance. Although there is extensive literature on global rigid gang scheduling,…
The emerging large-scale and data-hungry algorithms require the computations to be delegated from a central server to several worker nodes. One major challenge in the distributed computations is to tackle delays and failures caused by the…
State-of-the-art data flow systems such as TensorFlow impose iterative calculations on large graphs that need to be partitioned on heterogeneous devices such as CPUs, GPUs, and TPUs. However, partitioning can not be viewed in isolation.…
Progress in science is deeply bound to the effective use of high-performance computing infrastructures and to the efficient extraction of knowledge from vast amounts of data. Such data comes from different sources that follow a cycle…
Partitioning a graph into balanced blocks such that few edges run between blocks is a key problem for large-scale distributed processing. A current trend for partitioning huge graphs are streaming algorithms, which use low computational…
Large-scale datasets in the form of knowledge graphs are often used in numerous domains, today. A knowledge graphs size often exceeds the capacity of a single computer system, especially if the graph must be stored in main memory. To…
Interactive urgent computing is a small but growing user of supercomputing resources. However there are numerous technical challenges that must be overcome to make supercomputers fully suited to the wide range of urgent workloads which…
Discrete Fracture Network models are largely used for very large scale geological flow simulations. For this reason numerical methods require an investigation of tools for efficient parallel solutions on High Performance Computing systems.…
A computational workflow, also known as workflow, consists of tasks that must be executed in a specific order to attain a specific goal. Often, in fields such as biology, chemistry, physics, and data science, among others, these workflows…
The aim of the paper is to introduce general techniques in order to optimize the parallel execution time of sorting on a distributed architectures with processors of various speeds. Such an application requires a partitioning step. For…
Analyzing large graph data is an essential part of many modern applications, such as social networks. Due to its large computational complexity, distributed processing is frequently employed. This requires graph data to be divided across…
Scheduling is an important task allowing parallel systems to perform efficiently and reliably. For modern computation systems, divisible load is a special type of data which can be divided into arbitrary sizes and independently processed in…
Distributed computing frameworks such as MapReduce are often used to process large computational jobs. They operate by partitioning each job into smaller tasks executed on different servers. The servers also need to exchange intermediate…
Distributed computing platforms provide a robust mechanism to perform large-scale computations by splitting the task and data among multiple locations, possibly located thousands of miles apart geographically. Although such distribution of…