StreamBed: capacity planning for stream processing

Distributed, Parallel, and Cluster Computing 2023-10-02 v4

Abstract

StreamBed is a capacity planning system for stream processing. It predicts, ahead of any production deployment, the resources that a query will require to process an incoming data rate sustainably, and the appropriate configuration of these resources. StreamBed builds a capacity planning model by piloting a series of runs of the target query in a small-scale, controlled testbed. We implement StreamBed for Flink, a popular distributed stream processing (DSP) engine. Our evaluation with large-scale queries of the Nexmark benchmark demonstrates that StreamBed can effectively and accurately predict capacity requirements for jobs spanning more than 1,000 cores using a testbed of only 48 cores.

Cite

@article{arxiv.2309.03377,
  title  = {StreamBed: capacity planning for stream processing},
  author = {Guillaume Rosinosky and Donatien Schmitz and Etienne Rivière},
  journal= {arXiv preprint arXiv:2309.03377},
  year   = {2023}
}

Comments

14 pages, 11 figures. This project has been funded by the Walloon region (Belgium) through the Win2Wal project GEPICIAD.