Related papers: Multilevel Bidirectional Cache Filter

Decentralized Caching Schemes and Performance Limits in Two-layer Networks

We study the decentralized caching scheme in a two-layer network, which includes a server, multiple helpers, and multiple users. Basically, the proposed caching scheme consists of two phases, i.e., a placement phase and a delivery phase. In the…

Information Theory · Computer Science 2018-10-12 Lin Zhang, Zhao Wang, Ming Xiao, Gang Wu, Ying-Chang Liang, Shaoqian Li

The Bicameral Cache: a split cache for vector architectures

The Bicameral Cache is a cache organization proposal for a vector architecture that segregates data according to their access type, distinguishing scalar from vector references. Its aim is to prevent both types of references from interfering…

Hardware Architecture · Computer Science 2025-03-28 Susana Rebolledo, Borja Perez, Jose Luis Bosque, Peter Hsu

H2-Cache: A Novel Hierarchical Dual-Stage Cache for High-Performance Acceleration of Generative Diffusion Models

Diffusion models have emerged as state-of-the-art in image generation, but their practical deployment is hindered by the significant computational cost of their iterative denoising process. While existing caching techniques can accelerate…

Computer Vision and Pattern Recognition · Computer Science 2025-11-06 Mingyu Sung, Il-Min Kim, Sangseok Yun, Jae-Mo Kang

SwitchDelta: Asynchronous Metadata Updating for Distributed Storage with In-Network Data Visibility

Distributed storage systems typically maintain strong consistency between data nodes and metadata nodes by adopting ordered writes: 1) first installing data; 2) then updating metadata to make data visible. We propose SwitchDelta to…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-26 Junru Li, Qing Wang, Zhe Yang, Shuo Liu, Jiwu Shu, Youyou Lu

A Survey of Novel Cache Hierarchy Designs for High Workloads

Traditional on-die, three-level cache hierarchy design is very commonly used but is also prone to latency, especially at the Level 2 (L2) cache. We discuss three distinct ways of improving this design in order to have better performance.…

Hardware Architecture · Computer Science 2021-01-26 Pranjal Singh Rajput, Sonnya Dellarosa, Kanya Satis

Fletch: File-System Metadata Caching in Programmable Switches

Fast and scalable metadata management across multiple metadata servers is crucial for distributed file systems to handle numerous files and directories. Client-side caching of frequently accessed metadata can mitigate server loads, but…

Hardware Architecture · Computer Science 2026-05-06 Qingxiu Liu, Jiazhen Cai, Siyuan Sheng, Yuhui Chen, Lu Tang, Zhirong Shen, Patrick P. C. Lee

Novel Decentralized Coded Caching through Coded Prefetching

We propose a new decentralized coded caching scheme for a two-phase caching network, where the data placed in user caches in the prefetching phase are random portions of a maximal distance separable (MDS) coded version of the original…

Information Theory · Computer Science 2018-06-27 Yi-Peng Wei, Sennur Ulukus

Nemo: A Low-Write-Amplification Cache for Tiny Objects on Log-Structured Flash Devices

Modern storage systems predominantly use flash-based SSDs as a cache layer due to their favorable performance and cost efficiency. However, in tiny-object workloads, existing flash cache designs still suffer from high write amplification.…

Hardware Architecture · Computer Science 2026-03-11 Xufeng Yang, Tingting Tan, Jingxin Hu, Congming Gao, Mingyang Liu, Tianyang Jiang, Jian Chen, Linbo Long, Yina Lv, Jiwu Shu

Reducing Load Latency with Cache Level Prediction

High load latency that results from deep cache hierarchies and relatively slow main memory is an important limiter of single-thread performance. Data prefetch helps reduce this latency by fetching data up the hierarchy before it is…

Hardware Architecture · Computer Science 2021-03-30 Majid Jalili, Mattan Erez

DSPatch: Dual Spatial Pattern Prefetcher

High main memory latency continues to limit performance of modern high-performance out-of-order cores. While DRAM latency has remained nearly the same over many generations, DRAM bandwidth has grown significantly due to higher frequencies,…

Hardware Architecture · Computer Science 2019-10-09 Rahul Bera, Anant V. Nori, Onur Mutlu, Sreenivas Subramoney

HACache: Leveraging Read Performance with Cache in a Heterogeneous Array

In cost-sensitive deployments, RAID arrays may combine SSDs with different performance levels. Such heterogeneity arises when aging SSDs degrade yet remain usable, or when failed drives are replaced with new devices of explicitly better…

Operating Systems · Computer Science 2026-04-03 Jialin Liu, Liang Shi, Dingcui Yu

Relative Performance of a Multi-level Cache with Last-Level Cache Replacement: An Analytic Review

Present-day processors employ a multi-level cache hierarchy with one or two levels of private caches and a shared last-level cache (LLC). An efficient cache replacement policy at LLC is essential for reducing the off-chip memory transfer as…

Hardware Architecture · Computer Science 2013-07-25 Bijay Paikaray

Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching

Diffusion Transformers have recently demonstrated unprecedented generative capabilities for various tasks. The encouraging results, however, come with the cost of slow inference, since each denoising step requires inference on a transformer…

Machine Learning · Computer Science 2024-11-19 Xinyin Ma, Gongfan Fang, Michael Bi Mi, Xinchao Wang

Flashield: a Key-value Cache that Minimizes Writes to Flash

As its price per bit drops, SSD is increasingly becoming the default storage medium for cloud application databases. However, it has not become the preferred storage medium for key-value caches, even though SSD offers more than 10x lower…

Operating Systems · Computer Science 2017-02-10 Assaf Eisenman, Asaf Cidon, Evgenya Pergament, Or Haimovich, Ryan Stutsman, Mohammad Alizadeh, Sachin Katti

Optimizing SSD Caches for Cloud Block Storage Systems Using Machine Learning Approaches

The growing demand for efficient cloud storage solutions has led to the widespread adoption of Solid-State Drives (SSDs) for caching in cloud block storage systems. The management of data writes to SSD caches plays a crucial role in…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-01-30 Chiyu Cheng, Chang Zhou, Yang Zhao, Jin Cao

Distributed Wear levelling of Flash Memories

For large-scale distributed storage systems, flash memories are an excellent choice because they consume less power, take less floor space for a target throughput, and provide faster access to data. In a traditional distributed…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-02-26 Srimugunthan, K. Gopinath

MAC: a novel systematically multilevel cache replacement policy for PCM memory

The rapid development of multi-core systems and the increase in data-intensive applications in recent years call for larger main memory. Traditional DRAM memory can increase its capacity by reducing the feature size of the storage cell. Now further…

Hardware Architecture · Computer Science 2016-06-13 Shenchen Ruan, Haixia Wang, Dongsheng Wang

Improved Approximation of Storage-Rate Tradeoff for Caching with Multiple Demands

Caching at the network edge has emerged as a viable solution for alleviating the severe capacity crunch in modern content centric wireless networks by leveraging network load-balancing in the form of localized content storage and delivery.…

Information Theory · Computer Science 2016-06-15 Avik Sengupta, Ravi Tandon

Satisfying Increasing Performance Requirements with Caching at the Application Level

Application-level caching is a form of caching that has been increasingly adopted to satisfy performance and throughput requirements. The key idea is to store the results of a computation, to improve performance by reusing instead of…

Software Engineering · Computer Science 2020-10-27 Jhonny Mertz, Ingrid Nunes, Luca Della Toffola, Marija Selakovic, Michael Pradel

Efficient Two-Level Scheduling for Concurrent Graph Processing

With the rapidly growing demand for graph processing in real-world scenarios, systems have to efficiently handle massive numbers of concurrent jobs. Although existing work can efficiently handle a single graph processing job, there are plenty of memory…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-06-05 Jin Zhao