Related papers: Static Reuse Profile Estimation for Array Applicat…

Static Estimation of Reuse Profiles for Arrays in Nested Loops

Efficient memory access patterns play a crucial role in determining the overall performance of applications by exploiting temporal and spatial locality, thus maximizing cache locality. The Reuse Distance Histogram (RDH) is a widely used…

Performance · Computer Science 2025-09-24 Abdur Razzak, Atanu Barai, Nandakishore Santhi, Abdel-Hameed A. Badawy

LLVM Static Analysis for Program Characterization and Memory Reuse Profile Estimation

Profiling various application characteristics, including the number of different arithmetic operations performed, memory footprint, etc., dynamically is time- and space-consuming. On the other hand, static analysis methods, although fast,…

Software Engineering · Computer Science 2023-11-28 Atanu Barai, Nandakishore Santhi, Abdur Razzak, Stephan Eidenbenz, Abdel-Hameed A. Badawy

Beyond Reuse Distance Analysis: Dynamic Analysis for Characterization of Data Locality Potential

Emerging computer architectures will feature drastically decreased flops/byte (ratio of peak processing rate to memory bandwidth) as highlighted by recent studies on Exascale architectural trends. Further, flops are getting cheaper while…

Other Computer Science · Computer Science 2014-01-21 Naznin Fauzia, Venmugil Elango, Mahesh Ravishankar, J. Ramanujam, Fabrice Rastello, Atanas Rountev, Louis-Noël Pouchet, P. Sadayappan

Fast Modeling L2 Cache Reuse Distance Histograms Using Combined Locality Information from Software Traces

To mitigate the performance gap between CPU and the main memory, multi-level cache architectures are widely used in modern processors. Therefore, modeling the behaviors of the downstream caches becomes a critical part of the processor…

Hardware Architecture · Computer Science 2020-10-13 Ming Ling, Jiancong Ge, Guangmin Wang

Modeling Shared Cache Performance of OpenMP Programs using Reuse Distance

Performance modeling of parallel applications on multicore computers remains a challenge in computational co-design due to the complex design of multicore processors including private and shared memory hierarchies. We present a Scalable…

Performance · Computer Science 2019-07-31 Atanu Barai, Gopinath Chennupati, Nandakishore Santhi, Abdel-Hameed A. Badawy, Stephan Eidenbenz

An Effective Early Multi-core System Shared Cache Design Method Based on Reuse-distance Analysis

In this paper, we proposed an effective and efficient multi-core shared-cache design optimization approach based on reuse-distance analysis of the data traces of target applications. Since data traces are independent of system hardware…

Performance · Computer Science 2021-09-13 Hsin-Yu Ho, Ren-Song Tsay

A Fast-and-Effective Early-Stage Multi-level Cache Optimization Method Based on Reuse-Distance Analysis

In this paper, we propose a practical and effective approach allowing designers to optimize multi-level cache size at the early system design phase. Our key contribution is to generalize the reuse distance analysis method and develop an…

Hardware Architecture · Computer Science 2021-09-13 Cheng-Lin Tsai, Ren-Song Tsay

A General Framework for Static Profiling of Parametric Resource Usage

Traditional static resource analyses estimate the total resource usage of a program, without executing it. In this paper we present a novel resource analysis whose aim is instead the static profiling of accumulated cost, i.e., to discover,…

Programming Languages · Computer Science 2016-10-18 Pedro Lopez-Garcia, Maximiliano Klemen, Umer Liqat, Manuel V. Hermenegildo

Decanting the Contribution of Instruction Types and Loop Structures in the Reuse of Traces

Reuse has been proposed as a microarchitecture-level mechanism to reduce the amount of executed instructions, collapsing dependencies and freeing resources for other instructions. Previous works have used reuse domains such as memory…

Hardware Architecture · Computer Science 2017-11-20 Andrey M. Coppieters, Sheila de Oliveira, Felipe M. G. França, Maurício L. Pilla, Amarildo T. da Costa

Statistical Program Slicing: a Hybrid Slicing Technique for Analyzing Deployed Software

Dynamic program slicing can significantly reduce the code developers need to inspect by narrowing it down to only a subset of relevant program statements. However, despite an extensive body of research showing its usefulness, dynamic…

Software Engineering · Computer Science 2022-01-04 Bogdan Alexandru Stoica, Swarup K. Sahoo, James R. Larus, Vikram S. Adve

Learning Forward Reuse Distance

Caching techniques are widely used in the era of cloud computing from applications, such as Web caches to infrastructures, Memcached and memory caches in computer architectures. Prediction of cached data can greatly help improve cache…

Machine Learning · Computer Science 2020-08-03 Pengcheng Li, Yongbin Gu

Measuring scheduling efficiency of RNNs for NLP applications

Recurrent neural networks (RNNs) have shown state of the art results for speech recognition, natural language processing, image captioning and video summarizing applications. Many of these applications run on low-power platforms, so their…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-04-09 Urmish Thakker, Ganesh Dasika, Jesse Beu, Matthew Mattina

Reuse Distance-based Copy-backs of Clean Cache Lines to Lower-level Caches

Cache plays a critical role in reducing the performance gap between CPU and main memory. A modern multi-core CPU generally employs a multi-level hierarchy of caches, through which the most recently and frequently used data are maintained in…

Hardware Architecture · Computer Science 2021-06-01 Rui Wang, Chundong Wang, Chongnan Ye

Automatic Map Density Selection for Locally-Performant Visual Place Recognition

A key challenge in translating Visual Place Recognition (VPR) from the lab to long-term deployment is ensuring a priori that a system can meet user-specified performance requirements across different parts of an environment, rather than…

Computer Vision and Pattern Recognition · Computer Science 2026-03-05 Somayeh Hussaini, Tobias Fischer, Michael Milford

LLM as an Execution Estimator: Recovering Missing Dependency for Practical Time-travelling Debugging

Determining the dynamic data dependency of a step that reads a variable $v$ is challenging. It typically requires either exhaustive instrumentation, which becomes prohibitively expensive when $v$ is defined within library calls, or repeated…

Software Engineering · Computer Science 2025-09-04 Yunrui Pei, Hongshu Wang, Wenjie Zhang, Yun Lin, Weiyu Kong, Jin Song Dong

An Approach to Static Performance Guarantees for Programs with Run-time Checks

Instrumenting programs for performing run-time checking of properties, such as regular shapes, is a common and useful technique that helps programmers detect incorrect program behaviors. This is especially true in dynamic languages such as…

Programming Languages · Computer Science 2018-04-09 Maximiliano Klemen, Nataliia Stulova, Pedro Lopez-Garcia, José F. Morales, Manuel V. Hermenegildo

Reuse Detector: Improving the Management of STT-RAM SLLCs

Various constraints of Static Random Access Memory (SRAM) are leading to consider new memory technologies as candidates for building on-chip shared last-level caches (SLLCs). Spin-Transfer Torque RAM (STT-RAM) is currently postulated as the…

Hardware Architecture · Computer Science 2024-02-02 Roberto Rodríguez-Rodríguez, Javier Díaz, Fernando Castro, Pablo Ibáñez, Daniel Chaver, Víctor Viñals, Juan Carlos Saez, Manuel Prieto-Matias, Luis Pinuel, Teresa Monreal, Jose María Llabería

Addressing Variability in Reuse Prediction for Last-Level Caches

Last-Level Cache (LLC) represents the bulk of a modern CPU processor's transistor budget and is essential for application performance as LLC enables fast access to data in contrast to much slower main memory. However, applications with…

Hardware Architecture · Computer Science 2020-06-16 Priyank Faldu

Do Not Waste Your Rollouts: Recycling Search Experience for Efficient Test-Time Scaling

Test-Time Scaling enhances the reasoning capabilities of Large Language Models by allocating additional inference compute to broaden the exploration of the solution space. However, existing search strategies typically treat rollouts as…

Computation and Language · Computer Science 2026-05-06 Xinglin Wang, Jiayi Shi, Shaoxiong Feng, Peiwen Yuan, Yiwei Li, Yueqi Zhang, Chuyi Tan, Ji Zhang, Boyuan Pan, Yao Hu, Kan Li

Beyond Profiling: Scaling Profiling Data Usage to Multiple Applications

Profiling techniques are used extensively at different parts of the computing stack to achieve many goals. One major goal is to make a piece of software execute more efficiently on a specific hardware platform, where efficiency spans…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-11-07 Chris Quackenbush, Mohamed Zahran