Related papers: Host-Based Allocators for Device Memory
Resource constraints can fundamentally change both learning and decision-making. We explore how memory constraints influence an agent's performance when navigating unknown environments using standard reinforcement learning algorithms.…
Programmers using native languages such as C, C++, or Rust can implement custom memory allocation strategies to improve execution time. In their paper titled "Reconsidering Custom Memory Allocation" almost 25 years ago, Berger et al. showed…
Applications making excessive use of single-object based data structures (such as linked lists, trees, etc...) can see a drop in efficiency over a period of time due to the randomization of nodes in memory. This slow down is due to the…
Despite the fact that Solid State Disk (SSD) data storage media had offered a revolutionary property storages community, but the unavailability of a comprehensive allocation strategy in SSDs storage media, leads to consuming the available…
Memory latency, bandwidth, capacity, and energy increasingly limit performance. In this paper, we reconsider proposed system architectures that consist of huge (many-terabyte to petabyte scale) memories shared among large numbers of CPUs.…
Using custom memory allocators is an efficient performance optimization technique. However, dependency on a custom allocator can introduce several maintenance-related issues. We present lessons learned from the industry and provide critical…
In this paper we consider distributed allocation problems with memory constraint limits. Firstly, we propose a tractable relaxation to the problem of optimal symmetric allocations from [1]. The approximated problem is based on the Q-error…
The allocation of scarce donor organs constitutes one of the most consequential algorithmic challenges in healthcare. While the field is rapidly transitioning from rigid, rule-based systems to machine learning and data-driven optimization,…
Databases need to allocate and free blocks of storage on disk. Freed blocks introduce holes where no data is stored. Allocation systems attempt to reuse such deallocated regions in order to minimize the footprint on disk. If previously…
Memory disaggregation addresses memory imbalance in a cluster by decoupling CPU and memory allocations of applications while also increasing the effective memory capacity for (memory-intensive) applications beyond the local memory limit…
For the last thirty years, a large variety of memory allocators have been proposed. Since performance, memory usage and energy consumption of each memory allocator differs, software engineers often face difficult choices in selecting the…
Designing neural network architectures is a task that lies somewhere between science and art. For a given task, some architectures are eventually preferred over others, based on a mix of intuition, experience, experimentation and luck. For…
The primary function of memory allocators is to allocate and deallocate chunks of memory primarily through the malloc API. Many memory allocators also implement other API extensions, such as deriving the size of an allocated object from the…
The recent advent of programmable switches makes distributed algorithms readily deployable in real-world datacenter networks. However, there are still gaps between theory and practice that prevent the smooth adaptation of CONGEST algorithms…
Disaggregated memory is promising for improving memory utilization in computer clusters in which memory demands significantly vary across computer nodes under utilization. It allows applications with high memory demands to use memory in…
In many applications such as rationing medical care and supplies, university admissions, and the assignment of public housing, the decision of who receives an allocation can be justified by various normative criteria. Such settings have…
A content-addressable-memory compares an input search word against all rows of stored words in an array in a highly parallel manner. While supplying a very powerful functionality for many applications in pattern matching and search, it…
Designing bounded-memory algorithms is becoming increasingly important nowadays. Previous works studying bounded-memory algorithms focused on proving impossibility results, while the design of bounded-memory algorithms was left relatively…
Deep learning-based models are utilized to achieve state-of-the-art performance for recommendation systems. A key challenge for these models is to work with millions of categorical classes or tokens. The standard approach is to learn…
AI clusters today are one of the major uses of High Bandwidth Memory (HBM). However, HBM is suboptimal for AI workloads for several reasons. Analysis shows HBM is overprovisioned on write performance, but underprovisioned on density and…