Related papers: On Algorithmic Cache Optimization

200 papers

Accurate simulation techniques are indispensable to efficiently propose new memory or architectural organizations. As implementing new hardware concepts in real systems is often not feasible, cycle-accurate simulators employed together with…

Hardware Architecture · Computer Science 2024-02-02 Nicolas Bueno, Fernando Castro, Luis Pinuel, Jose Ignacio Gomez-Perez, Francky Catthoor

Cache eviction algorithms are used widely in operating systems, databases and other systems that use caches to speed up execution by caching data that is used by the application. There are many policies such as MRU (Most Recently Used), MFU…

Data Structures and Algorithms · Computer Science 2021-10-25 Dhruv Matani, Ketan Shah, Anirban Mitra

Classic cache-oblivious parallel matrix multiplication algorithms achieve optimality either in time or space, but not both, which promotes lots of research on the best possible balance or tradeoff of such algorithms. We study modern…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-11-14 Yuan Tang

While the cost of computation is an easy to understand local property, the cost of data movement on cached architectures depends on global state, does not compose, and is hard to predict. As a result, programmers often fail to consider the…

Performance · Computer Science 2020-01-07 Tobias Gysi, Tobias Grosser, Laurin Brandner, Torsten Hoefler

This paper presents a comprehensive comparison of distributed caching algorithms employed in modern distributed systems. We evaluate various caching strategies including Least Recently Used (LRU), Least Frequently Used (LFU), Adaptive…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-04-04 Helen Mayer, James Richards

Modern computer architectures share physical resources between different programs in order to increase area-, energy-, and cost-efficiency. Unfortunately, sharing often gives rise to side channels that can be exploited for extracting or…

Cryptography and Security · Computer Science 2017-01-24 Pablo Cañones, Boris Köpf, Jan Reineke

Modern processors use cache memory: a memory access that "hits" the cache returns early, while a "miss" takes more time. Given a memory access in a program, cache analysis consists in deciding whether this access is always a hit, always a…

Programming Languages · Computer Science 2019-09-24 David Monniaux, Valentin Touzeau

A memory hierarchy is used to keep pace with the processor's speed. Cache memory is fast memory used to bridge the speed difference between memory and the processor. The access patterns of Level 1 cache (L1) and Level 2 cache (L2) are different,…

Operating Systems · Computer Science 2010-03-23 Richa Gupta, Sanjiv Tokekar

The multiplication of matrices is an important arithmetic operation in computational mathematics. In the context of hierarchical matrices, this operation can be realized by the multiplication of structured block-wise low-rank matrices,…

Numerical Analysis · Mathematics 2018-05-24 Jürgen Dölz, Helmut Harbrecht, Michael D. Multerer

Matrix multiplication is one of the key operations in various engineering applications. Outsourcing large-scale matrix multiplication tasks to multiple distributed servers or cloud is desirable to speed up computation. However, security…

Information Theory · Computer Science 2018-06-04 Wei-Ting Chang, Ravi Tandon

We describe a model that enables us to analyze the running time of an algorithm in a computer with a memory hierarchy with limited associativity, in terms of various cache parameters. Our model, an extension of Aggarwal and Vitter's I/O…

Hardware Architecture · Computer Science 2007-05-23 Sandeep Sen, Siddhartha Chatterjee, Neeraj Dumir

In this paper, we present algorithms to solve matrix multiplication problems in the MPC model. In particular, we consider the problem under various processor/memory constraints in the MPC model and prove the following results. 1.…

Computational Complexity · Computer Science 2025-09-30 Lakshya Joshi, Arya Deshmukh, Atharv Chhabra, Chetan Gupta

In modern GPU inference, cache efficiency remains a major bottleneck, and heuristic policies such as LRU can perform far worse than the offline optimum. Existing learning-based caching systems improve hit rates mainly through…

The biggest cost of computing with large matrices in any modern computer is related to memory latency and bandwidth. The average latency of modern RAM reads is 150 times greater than a clock step of the processor. Throughput is a little…

Data Structures and Algorithms · Computer Science 2013-03-04 Crysttian Arantes Paixão, Flávio Codeço Coelho

Designing problems using matrices is very important in Computer Science. Fields like graph computer, graph theory, and machine learning often use matrices to solve their problems. The most common matrix operation is the…

Performance · Computer Science 2019-05-10 Andre G. C. Pacheco

This article introduces a novel family of decentralised caching policies, applicable to wireless networks with finite storage at the edge-nodes (stations). These policies, that are based on the Least-Recently-Used replacement principle, are…

Networking and Internet Architecture · Computer Science 2016-12-14 Anastasios Giovanidis, Apostolos Avranas

Large Language Models (LLMs) and other large foundation models have achieved noteworthy success, but their size exacerbates existing resource consumption and latency challenges. In particular, the large-scale deployment of these models is…

Machine Learning · Computer Science 2023-08-30 Banghua Zhu, Ying Sheng, Lianmin Zheng, Clark Barrett, Michael I. Jordan, Jiantao Jiao

Computationally efficient matrix multiplication is a fundamental requirement in various fields, including and particularly in data analytics. To do so, the computation task of a large-scale matrix multiplication is typically outsourced to…

Information Theory · Computer Science 2018-11-01 Jaber Kakar, Seyedhamed Ebadifar, Aydin Sezgin

The effective management of large amounts of data processed or required by today's cloud or edge computing systems remains a fundamental challenge. This paper focuses on cache management for applications where data objects can be stored in…

Networking and Internet Architecture · Computer Science 2025-04-03 Agrim Bari, Gustavo de Veciana, George Kesidis

We study superfast algorithms that compute a low-rank approximation of a matrix (hereafter referred to as LRA) using far fewer memory cells and arithmetic operations than the input matrix has entries. We first specify a family of 2mn…

Numerical Analysis · Mathematics 2018-06-08 Victor Y. Pan, Qi Luan, John Svadlenka, Liang Zhao