Related papers: LERC: Coordinated Cache Management for Data-Parall…

200 papers

Memory caches are being aggressively used in today's data-parallel systems such as Spark, Tez, and Piccolo. However, prevalent systems employ rather simple cache management policies--notably the Least Recently Used (LRU) policy--that are…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-03-27 Yinghao Yu, Wei Wang, Jun Zhang, Khaled Ben Letaief

Understanding the performance of data-parallel workloads when resource-constrained has significant practical importance but unfortunately has received only limited attention. This paper identifies, quantifies and demonstrates memory…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-02-15 Calin Iorgulescu, Florin Dinu, Aunn Raza, Wajih Ul Hassan, Willy Zwaenepoel

When compared to blocking concurrency, non-blocking concurrency can provide higher performance in parallel shared-memory contexts, especially in high contention scenarios. This paper proposes FLeeC, an application-level cache system based…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-06-17 André J. Costa, Nuno M. Preguiça, João M. Lourenço

In the era of big data and cloud computing, large amounts of data are generated from user applications and need to be processed in the datacenter. Data-parallel computing frameworks, such as Apache Spark, are widely used to perform such…

Performance · Computer Science 2018-05-09 Zhengyu Yang, Danlin Jia, Stratis Ioannidis, Ningfang Mi, Bo Sheng

Software caches optimize the performance of diverse storage systems, databases and other software systems. Existing works on software caches automatically resort to fully associative cache designs. Our work shows that limited associativity…

Hardware Architecture · Computer Science 2021-09-08 Dolev Adas, Gil Einziger, Roy Friedman

This study investigates the use of reinforcement learning to guide a general purpose cache manager decisions. Cache managers directly impact the overall performance of computer systems. They govern decisions about which objects should be…

Machine Learning · Computer Science 2019-10-01 Sami Alabed

Caches are used to reduce the speed differential between the CPU and memory to improve the performance of modern processors. However, attackers can use contention-based cache timing attacks to steal sensitive information from victim…

Cryptography and Security · Computer Science 2024-06-13 Quancheng Wang, Xige Zhang, Han Wang, Yuzhe Gu, Ming Tang

Data analytic applications built upon big data processing frameworks such as Apache Spark are an important class of applications. Many of these applications are not latency-sensitive and thus can run as batch jobs in data centers. By…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-10-03 Vicent Sanz Marco, Ben Taylor, Barry Porter, Zheng Wang

The system-level cache is a critical resource shared by processor cores and domain-specific accelerators in heterogeneous systems on chips (SoCs). The strict QoS requirements of accelerators, such as deadlines, can lead to severe…

Hardware Architecture · Computer Science 2026-05-21 Ayushi Agarwal, Anannya Mathur, Preeti Ranjan Panda

The rapid adoption of large language models (LLMs) is pushing AI accelerators toward increasingly powerful and specialized designs. Instead of further complicating software development with deeply hierarchical scratchpad memories (SPMs) and…

Hardware Architecture · Computer Science 2025-12-09 Zhongchun Zhou, Chengtao Lai, Yuhang Gu, Wei Zhang

Persistent memory provides high-performance data persistence at main memory. Memory writes need to be performed in strict order to satisfy storage consistency requirements and enable correct recovery from system crashes. Unfortunately,…

Hardware Architecture · Computer Science 2017-05-11 Youyou Lu, Jiwu Shu, Long Sun, Onur Mutlu

Cache prefetcher greatly eliminates compulsory cache misses, by fetching data from slower memory to faster cache before it is actually required by processors. Sophisticated prefetchers predict next use cache line by repeating program's…

Hardware Architecture · Computer Science 2017-12-05 Haoyuan Wang, Zhiwei Luo

Disaggregating memory from compute offers the opportunity to better utilize stranded memory in cloud data centers. It is important to cache data in the compute nodes and maintain cache coherence across multiple compute nodes. However, the…

Databases · Computer Science 2026-01-14 Ruihong Wang, Jianguo Wang, Walid G. Aref

In modern GPU inference, cache efficiency remains a major bottleneck, and heuristic policies such as LRU can perform far worse than the offline optimum. Existing learning-based caching systems improve hit rates mainly through…

Modern hardware systems are heavily underutilized when running large-scale graph applications. While many in-memory graph frameworks have made substantial progress in optimizing these applications, we show that it is still possible to…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-01-15 Yunming Zhang, Vladimir Kiriansky, Charith Mendis, Matei Zaharia, Saman Amarasinghe

As capacity and complexity of on-chip cache memory hierarchy increases, the service cost to the critical loads from Last Level Cache (LLC), which are frequently repeated, has become a major concern. The processor may stall for a…

Hardware Architecture · Computer Science 2016-08-09 Navid Khoshavi, Xunchao Chen, Jun Wang, Ronald F. DeMara

Cause-effect chains, as a widely used modeling method in real-time embedded systems, are extensively applied in various safety-critical domains. End-to-end latency, as a key real-time attribute of cause-effect chains, is crucial in many…

Systems and Control · Electrical Eng. & Systems 2026-01-29 Yixuan Zhu, Yinkang Gao, Bo Zhang, Xiaohang Gong, Binze Jiang, Lei Gong, Wenqi Lou, Teng Wang, Chao Wang, Xi Li, Xuehai Zhou

This paper presents a comprehensive comparison of distributed caching algorithms employed in modern distributed systems. We evaluate various caching strategies including Least Recently Used (LRU), Least Frequently Used (LFU), Adaptive…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-04-04 Helen Mayer, James Richards

Co-location and memory sharing between latency-critical services, such as key-value store and web search, and best-effort batch jobs is an appealing approach to improving memory utilization in multi-tenant datacenter systems. However, we…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-09-08 Aidi Pi, Junxian Zhao, Shaoqi Wang, Xiaobo Zhou

Current day processors employ multi-level cache hierarchy with one or two levels of private caches and a shared last-level cache (LLC). An efficient cache replacement policy at LLC is essential for reducing the off-chip memory transfer as…

Hardware Architecture · Computer Science 2013-07-25 Bijay Paikaray