Related papers: Enhancing Trace Visualizations for Microservices P…
Analysis of microservices' performance is a considerably challenging task due to the multifaceted nature of these systems. Each request to a microservices system might raise several Remote Procedure Calls (RPCs) to services deployed on…
The evolution of decentralized microservice-based systems is challenging. These challenges are classified into static and dynamic categories. Regarding the static perspective, documenting and visualizing the fluid application topology is…
Distributed systems are comprised of many components that communicate together to form an application. Distributed tracing gives us visibility into these complex interactions, but it can be difficult to reason about the system's behavior,…
With the evolution of microservice applications, the underlying architectures have become increasingly complex compared to their monolith counterparts. This mainly brings in the challenge of observability. By providing a deeper…
This work-in-progress report presents both the design and partial evaluation of distributed execution indexing, a technique for microservice applications that precisely identifies dynamic instances of inter-service remote procedure calls…
Performance tools for emerging heterogeneous exascale platforms must address two principal challenges when analyzing execution measurements. First, measurement of large-scale executions may record mountains of performance data. Second,…
Performance issues in cloud systems are hard to debug. Distributed tracing is a widely adopted approach that gives engineers visibility into cloud systems. Existing trace analysis approaches focus on debugging single request correctness…
Microservice system solutions are driving digital transformation; however, fundamental tools and system perspectives are missing to better observe, understand, and manage these systems, their properties, and their dependencies.…
The evolution of distributed architectures and programming paradigms for performance-oriented program development challenges the state-of-the-art technology for performance tools. The area of high performance computing is rapidly expanding…
Microservices bring various benefits to software systems. They also bring decentralization and loose coupling across self-contained system parts. Since these systems likely evolve in a decentralized manner, they need to be monitored to…
Large-scale GPU traces play a critical role in identifying performance bottlenecks within heterogeneous High-Performance Computing (HPC) architectures. However, the sheer volume and complexity of a single trace of data make performance…
The rise of microservice architectures has revolutionized application design, fostering adaptability and resilience. These architectures facilitate scaling and encourage collaborative efforts among specialized teams, streamlining deployment…
Microservices are supporting digital transformation; however, fundamental tools and system perspectives are missing to better observe, understand, and manage these systems, their properties, and their dependencies. Microservices…
With the maturity of web services, containers, and cloud computing technologies, large services in traditional systems (e.g. the computation services of machine learning and artificial intelligence) are gradually being broken down into many…
Understanding and tuning the performance of extreme-scale parallel computing systems demands a streaming approach due to the computational cost of applying offline algorithms to vast amounts of performance log data. Analyzing large…
To improve customer experience, datacenter operators offer support for simplifying application and resource management. For example, running workloads of workflows on behalf of customers is desirable, but requires increasingly more…
Executing operational processes generates event data, which contain information on the executed process activities. Process mining techniques allow one to systematically analyze event data to gain insights that are then used to optimize…
Serverless applications can be particularly difficult to troubleshoot, as these applications are often composed of various managed and partly managed services. Faults are often unpredictable and can occur at multiple points, even in simple…
Understanding the behavior of software in execution is a key step in identifying and fixing performance issues. This is especially important in high performance computing contexts where even minor performance tweaks can translate into large…
Fully understanding performance is a growing challenge when building next-generation cloud systems. Often these systems build on next-generation hardware, and evaluation in realistic physical testbeds is out of reach. Even when physical…