Aggregate-Driven Trace Visualizations for Performance Debugging

Distributed, Parallel, and Cluster Computing 2020-10-27 v1 Human-Computer Interaction

Abstract

Performance issues in cloud systems are hard to debug. Distributed tracing is a widely adopted approach that gives engineers visibility into cloud systems. Existing trace analysis approaches focus on debugging single-request correctness issues, not single-request performance issues. Diagnosing a performance issue in a given request requires comparing the performance of the offending request with the aggregate performance of typical requests. Effective and efficient debugging of such issues faces three challenges: (i) identifying the correct aggregate data for diagnosis; (ii) visualizing the aggregated data; and (iii) efficiently collecting, storing, and processing trace data. We present TraVista, a tool for debugging performance issues in a single trace that addresses these challenges. TraVista extends the popular single-trace Gantt chart visualization with three types of aggregate data (metric, temporal, and structure data) to contextualize the performance of the offending trace across all traces.
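
The comparison the abstract describes, placing one request's performance against the aggregate of all requests, can be illustrated with a minimal sketch. This is not TraVista's implementation: the trace representation (a mapping from trace ID to a list of `(operation, duration_ms)` spans), the function names, and the 95th-percentile threshold are all assumptions made for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def build_aggregates(traces):
    """Collect span durations per operation across all traces, sorted ascending."""
    by_op = defaultdict(list)
    for spans in traces.values():
        for op, duration_ms in spans:
            by_op[op].append(duration_ms)
    return {op: sorted(durations) for op, durations in by_op.items()}


def percentile_rank(sorted_durations, value):
    """Percentage of observed durations that are less than or equal to value."""
    return 100.0 * bisect_right(sorted_durations, value) / len(sorted_durations)


def contextualize(trace_id, traces, threshold=95.0):
    """Rank each span of one trace against the same operation in all traces.

    Returns (operation, duration_ms, percentile_rank, is_outlier) tuples.
    The threshold is a hypothetical cutoff, not a value from the paper.
    """
    aggregates = build_aggregates(traces)
    report = []
    for op, duration_ms in traces[trace_id]:
        rank = percentile_rank(aggregates[op], duration_ms)
        report.append((op, duration_ms, rank, rank >= threshold))
    return report
```

In this sketch the aggregate includes the offending trace itself. A span flagged as an outlier is one whose duration sits at or above the chosen percentile for its operation, which is the kind of context a single-trace Gantt chart alone does not show.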

Cite

@article{arxiv.2010.13681,
  title  = {Aggregate-Driven Trace Visualizations for Performance Debugging},
  author = {Vaastav Anand and Matheus Stolet and Thomas Davidson and Ivan Beschastnikh and Tamara Munzner and Jonathan Mace},
  journal= {arXiv preprint arXiv:2010.13681},
  year   = {2020}
}