A Generalized Performance Evaluation Framework for Parallel Systems with Output Synchronization

Performance 2016-12-19 v1

Abstract

Frameworks such as MapReduce and Hadoop are abundant nowadays. They seek to reap the benefits of parallelization, albeit subject to a synchronization constraint at the output. Fork-Join (FJ) queueing models are used to analyze such systems. Arriving jobs are split into tasks, each of which is mapped to exactly one server. A job leaves the system when all of its tasks have been executed. As a performance metric, we consider waiting times for both work-conserving and non-work-conserving server systems under a mathematical set-up general enough to take into account possible phase-type behavior of the servers and, as suggested by recent evidence, bursty arrivals. To this end, we present a Markov-additive process framework for an FJ system and provide computable bounds on the tail probabilities of steady-state waiting times, separately for each type of server. We apply our results to three scenarios: non-renewal (Markov-modulated) arrivals, servers showing phase-type behavior, and Markov-modulated arrivals and services. We compare our bounds against estimates obtained through simulations and also provide a theoretical conceptualization of provisions in FJ systems. Finally, we calibrate our model with real data traces and illustrate how our bounds can be used to devise provisions.
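
The FJ mechanics described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's method: it assumes Poisson arrivals and independent exponential service times (the paper treats more general Markov-modulated and phase-type settings), takes a job's waiting time to be the time until its last task starts service, and models the non-work-conserving case as servers that start the next job only after every task of the current job has finished. The names simulate_fork_join and tail_probability are hypothetical.

```python
import random


def simulate_fork_join(num_jobs, num_servers, arrival_rate, service_rate,
                       blocking=False, seed=0):
    """Simulate job waiting times in a fork-join system (illustrative sketch).

    Assumptions (not from the source): Poisson arrivals, i.i.d. exponential
    task service times, one task per server per job, FIFO queues.
    blocking=False: each server starts its next task as soon as it is free.
    blocking=True:  all servers wait for the slowest task before moving on.
    """
    rng = random.Random(seed)
    waits = [0.0] * num_servers  # waiting time of the current job at each server
    job_waits = []
    for _ in range(num_jobs):
        # The job waits until its last task can start service.
        job_waits.append(max(waits))
        services = [rng.expovariate(service_rate) for _ in range(num_servers)]
        gap = rng.expovariate(arrival_rate)  # time until the next job arrives
        if blocking:
            # Lindley-type recursion driven by the slowest task of the job.
            w = max(0.0, max(waits) + max(services) - gap)
            waits = [w] * num_servers
        else:
            # Independent Lindley recursion at every server.
            waits = [max(0.0, w + s - gap) for w, s in zip(waits, services)]
    return job_waits


def tail_probability(samples, threshold):
    """Empirical estimate of P(W > threshold)."""
    return sum(1 for x in samples if x > threshold) / len(samples)


if __name__ == "__main__":
    free = simulate_fork_join(20000, 4, arrival_rate=0.3, service_rate=1.0)
    blocked = simulate_fork_join(20000, 4, arrival_rate=0.3, service_rate=1.0,
                                 blocking=True)
    for label, sample in (("work-conserving", free),
                          ("non-work-conserving", blocked)):
        print(label, tail_probability(sample, 2.0))
```

Because both modes draw the same random numbers for a given seed, the blocking run can be compared job by job with the non-blocking one. Empirical tail estimates of this kind are what analytical bounds on the waiting-time tail would be checked against.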

Cite

@article{arxiv.1612.05543,
  title  = {A Generalized Performance Evaluation Framework for Parallel Systems with Output Synchronization},
  author = {Wasiur R. KhudaBukhsh and Sounak Kar and Amr Rizk and Heinz Koeppl},
  journal= {arXiv preprint arXiv:1612.05543},
  year   = {2016}
}