Related papers: Multithreaded Input-Sensitive Profiling
Embedded Systems combine one or more processor cores with dedicated logic running on an ASIC or FPGA to meet design goals at reasonable cost. It is achieved by profiling the application with variety of aspects like performance, memory…
While high-level languages come with significant readability and maintainability benefits, their performance remains difficult to predict. For example, programmers may unknowingly use language features inappropriately, which cause their…
Profiling tools (also known as profilers) play an important role in understanding program performance at runtime, such as hotspots, bottlenecks, and inefficiencies. While profilers have been proven to be useful, they give extra burden to…
Profiling is important for performance optimization by providing real-time observations and measurements of important parameters of hardware execution. Existing profiling tools for High-Level Synthesis (HLS) IPs running on FPGAs are far…
Modern real-time systems require accurate characterization of task timing behavior to ensure predictable performance, particularly on complex hardware architectures. Existing methods, such as worst-case execution time analysis, often fail…
Profiling techniques are used extensively at different parts of the computing stack to achieve many goals. One major goal is to make a piece of software execute more efficiently on a specific hardware platform, where efficiency spans…
Existing profilers for scripting languages (a.k.a. "glue" languages) like Python suffer from numerous problems that drastically limit their usefulness. They impose order-of-magnitude overheads, report information at too coarse a…
Background: We describe an informatics framework for researchers and clinical investigators to efficiently perform parameter sensitivity analysis and auto-tuning for algorithms that segment and classify image features in a large dataset of…
Instrumenting programs for performing run-time checking of properties, such as regular shapes, is a common and useful technique that helps programmers detect incorrect program behaviors. This is specially true in dynamic languages such as…
Memory profiling captures programs' dynamic memory behavior, assisting programmers in debugging, tuning, and enabling advanced compiler optimizations like speculation-based automatic parallelization. As each use case demands its unique…
Major chip manufacturers have all introduced Multithreaded processors. These processors are used for running a variety of workloads. Efficient resource utilization is an important design aspect in such processors. Depending on the workload,…
Performance engineering has become crucial for the cloud-native architecture. This architecture deploys multiple services, with each service representing an orchestration of containerized processes. OpenTelemetry is growing popular in the…
Effective performance profiling and analysis are essential for optimizing training and inference of deep learning models, especially given the growing complexity of heterogeneous computing environments. However, existing tools often lack…
High-end ARM processors are emerging in data centers and HPC systems, posing as a strong contender to x86 machines. Memory-centric profiling is an important approach for dissecting an application's bottlenecks on memory access and guiding…
In present study, in order to improve the performance and reduce the amount of power which is dissipated in heterogeneous multicore processors, the ability of detecting the program execution phases is investigated. The programs execution…
This paper proposes TASKPROF, a profiler that identifies parallelism bottlenecks in task parallel programs. It leverages the structure of a task parallel execution to perform fine-grained attribution of work to various parts of the program.…
Recent advances in probabilistic modelling have led to a large number of simulation-based inference algorithms which do not require numerical evaluation of likelihoods. However, a public benchmark with appropriate performance metrics for…
This paper introduces a new open-source tool for the dynamic analyzer Valgrind. The tool measures the amount of memory that is actively being used by a process at any given point in time. While there exist numerous tools to measure the…
Power awareness is fast becoming immensely important in computing, ranging from the traditional High Performance Computing applications, to the new generation of data centric workloads. In this work we describe our efforts towards a power…
Parallel applications are extremely challenging to achieve the optimal performance on the NUMA architecture, which necessitates the assistance of profiling tools. However, existing NUMA-profiling tools share some similar shortcomings, such…