English
Related papers

Related papers: Static Reuse Profile Estimation for Array Applicat…

200 papers

Efficient memory access patterns play a crucial role in determining the overall performance of applications by exploiting temporal and spatial locality, thus maximizing cache locality. The Reuse Distance Histogram (RDH) is a widely used…

Performance · Computer Science 2025-09-24 Abdur Razzak , Atanu Barai , Nandakishore Santhi , Abdel-Hameed A. Badawy

Profiling various application characteristics, including the number of different arithmetic operations performed, memory footprint, etc., dynamically is time- and space-consuming. On the other hand, static analysis methods, although fast,…

Software Engineering · Computer Science 2023-11-28 Atanu Barai , Nandakishore Santhi , Abdur Razzak , Stephan Eidenbenz , Abdel-Hameed A. Badawy

Emerging computer architectures will feature drastically decreased flops/byte (ratio of peak processing rate to memory bandwidth) as highlighted by recent studies on Exascale architectural trends. Further, flops are getting cheaper while…

To mitigate the performance gap between CPU and the main memory, multi-level cache architectures are widely used in modern processors. Therefore, modeling the behaviors of the downstream caches becomes a critical part of the processor…

Hardware Architecture · Computer Science 2020-10-13 Ming Ling , Jiancong Ge , Guangmin Wang

Performance modeling of parallel applications on multicore computers remains a challenge in computational co-design due to the complex design of multicore processors including private and shared memory hierarchies. We present a Scalable…

In this paper, we proposed an effective and efficient multi-core shared-cache design optimization approach based on reuse-distance analysis of the data traces of target applications. Since data traces are independent of system hardware…

Performance · Computer Science 2021-09-13 Hsin-Yu Ho , Ren-Song Tsay

In this paper, we propose a practical and effective approach allowing designers to optimize multi-level cache size at the early system design phase. Our key contribution is to generalize the reuse distance analysis method and develop an…

Hardware Architecture · Computer Science 2021-09-13 Cheng-Lin Tsai , Ren-Song Tsay

Traditional static resource analyses estimate the total resource usage of a program, without executing it. In this paper we present a novel resource analysis whose aim is instead the static profiling of accumulated cost, i.e., to discover,…

Programming Languages · Computer Science 2016-10-18 Pedro Lopez-Garcia , Maximiliano Klemen , Umer Liqat , Manuel V. Hermenegildo

Reuse has been proposed as a microarchitecture-level mechanism to reduce the amount of executed instructions, collapsing dependencies and freeing resources for other instructions. Previous works have used reuse domains such as memory…

Hardware Architecture · Computer Science 2017-11-20 Andrey M. Coppieters , Sheila de Oliveira , Felipe M. G. França , Maurício L. Pilla , Amarildo T. da Costa

Dynamic program slicing can significantly reduce the code developers need to inspect by narrowing it down to only a subset of relevant program statements. However, despite an extensive body of research showing its usefulness, dynamic…

Software Engineering · Computer Science 2022-01-04 Bogdan Alexandru Stoica , Swarup K. Sahoo , James R. Larus , Vikram S. Adve

Caching techniques are widely used in the era of cloud computing from applications, such as Web caches to infrastructures, Memcached and memory caches in computer architectures. Prediction of cached data can greatly help improve cache…

Machine Learning · Computer Science 2020-08-03 Pengcheng Li , Yongbin Gu

Recurrent neural networks (RNNs) have shown state of the art results for speech recognition, natural language processing, image captioning and video summarizing applications. Many of these applications run on low-power platforms, so their…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-04-09 Urmish Thakker , Ganesh Dasika , Jesse Beu , Matthew Mattina

Cache plays a critical role in reducing the performance gap between CPU and main memory. A modern multi-core CPU generally employs a multi-level hierarchy of caches, through which the most recently and frequently used data are maintained in…

Hardware Architecture · Computer Science 2021-06-01 Rui Wang , Chundong Wang , Chongnan Ye

A key challenge in translating Visual Place Recognition (VPR) from the lab to long-term deployment is ensuring a priori that a system can meet user-specified performance requirements across different parts of an environment, rather than…

Computer Vision and Pattern Recognition · Computer Science 2026-03-05 Somayeh Hussaini , Tobias Fischer , Michael Milford

Determining the dynamic data dependency of a step that reads a variable $v$ is challenging. It typically requires either exhaustive instrumentation, which becomes prohibitively expensive when $v$ is defined within library calls, or repeated…

Software Engineering · Computer Science 2025-09-04 Yunrui Pei , Hongshu Wang , Wenjie Zhang , Yun Lin , Weiyu Kong , Jin song Dong

Instrumenting programs for performing run-time checking of properties, such as regular shapes, is a common and useful technique that helps programmers detect incorrect program behaviors. This is specially true in dynamic languages such as…

Programming Languages · Computer Science 2018-04-09 Maximiliano Klemen , Nataliia Stulova , Pedro Lopez-Garcia , José F. Morales , Manuel V. Hermenegildo

Various constraints of Static Random Access Memory (SRAM) are leading to consider new memory technologies as candidates for building on-chip shared last-level caches (SLLCs). Spin-Transfer Torque RAM (STT-RAM) is currently postulated as the…

Last-Level Cache (LLC) represents the bulk of a modern CPU processor's transistor budget and is essential for application performance as LLC enables fast access to data in contrast to much slower main memory. However, applications with…

Hardware Architecture · Computer Science 2020-06-16 Priyank Faldu

Test-Time Scaling enhances the reasoning capabilities of Large Language Models by allocating additional inference compute to broaden the exploration of the solution space. However, existing search strategies typically treat rollouts as…

Computation and Language · Computer Science 2026-05-06 Xinglin Wang , Jiayi Shi , Shaoxiong Feng , Peiwen Yuan , Yiwei Li , Yueqi Zhang , Chuyi Tan , Ji Zhang , Boyuan Pan , Yao Hu , Kan Li

Profiling techniques are used extensively at different parts of the computing stack to achieve many goals. One major goal is to make a piece of software execute more efficiently on a specific hardware platform, where efficiency spans…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-11-07 Chris Quackenbush , Mohamed Zahran
‹ Prev 1 2 3 10 Next ›