On the Throughput Optimization in Large-Scale Batch-Processing Systems

Performance 2020-09-22 v1

Abstract

We analyze a data-processing system with $n$ clients producing jobs which are processed in \textit{batches} by $m$ parallel servers; the system throughput critically depends on the batch size and a corresponding sub-additive speedup function. In practice, throughput optimization relies on numerical searches for the optimal batch size, a process that can take up to multiple days in existing commercial systems. In this paper, we model the system in terms of a closed queueing network; a standard Markovian analysis yields the optimal throughput in $\omega\left(n^4\right)$ time. Our main contribution is a mean-field model of the system for the regime where the system size is large. We show that the mean-field model has a unique, globally attractive stationary point which can be found in closed form and which characterizes the asymptotic throughput of the system as a function of the batch size. Using this expression we find the \textit{asymptotically} optimal throughput in $O(1)$ time. Numerical settings from a large commercial system reveal that this asymptotic optimum is accurate in practical finite regimes.

Cite

@article{arxiv.2009.09433,
  title  = {On the Throughput Optimization in Large-Scale Batch-Processing Systems},
  author = {Sounak Kar and Robin Rehrmann and Arpan Mukhopadhyay and Bastian Alt and Florin Ciucu and Heinz Koeppl and Carsten Binnig and Amr Rizk},
  journal= {arXiv preprint arXiv:2009.09433},
  year   = {2020}
}

Comments

15 pages
