Related papers: Futureproof Static Memory Planning

Dynamic Stochastic Approximation for Multi-stage Stochastic Optimization

In this paper, we consider multi-stage stochastic optimization problems with convex objectives and conic constraints at each stage. We present a new stochastic first-order method, namely the dynamic stochastic approximation (DSA) algorithm,…

Optimization and Control · Mathematics 2019-08-22 Guanghui Lan , Zhiqiang Zhou

On Distributed Storage Allocations for Memory-Limited Systems

In this paper we consider distributed allocation problems with memory constraint limits. Firstly, we propose a tractable relaxation to the problem of optimal symmetric allocations from [1]. The approximated problem is based on the Q-error…

Information Theory · Computer Science 2015-04-17 Iryna Andriyanova , Pablo M. Olmos

Storage Allocation for Multi-Class Distributed Data Storage Systems

Distributed storage systems (DSSs) provide a scalable solution for reliably storing massive amounts of data coming from various sources. Heterogeneity of these data sources often means different data classes (types) exist in a DSS, each…

Information Theory · Computer Science 2017-01-24 Koosha Pourtahmasi Roshandeh , Moslem Noori , Masoud Ardakani , Chintha Tellambura

Dynamic Sparse Attention: Access Patterns and Architecture

Dynamic sparse attention (DSA) reduces the per-token attention bandwidth by restricting computation to a top-k subset of cached key-value (KV) entries, but its token-dependent selection pattern introduces a system-level challenge: the KV…

Hardware Architecture · Computer Science 2026-03-17 Noam Levy

Neural Simulated Annealing

Simulated annealing (SA) is a stochastic global optimisation technique applicable to a wide range of discrete and continuous variable problems. Despite its simplicity, the development of an effective SA optimiser for a given problem hinges…

Machine Learning · Computer Science 2024-06-27 Alvaro H. C. Correia , Daniel E. Worrall , Roberto Bondesan

ONE-SA: Enabling Nonlinear Operations in Systolic Arrays for Efficient and Flexible Neural Network Inference

The computation and memory-intensive nature of DNNs limits their use in many mobile and embedded contexts. Application-specific integrated circuit (ASIC) hardware accelerators employ matrix multiplication units (such as the systolic arrays)…

Hardware Architecture · Computer Science 2024-02-02 Ruiqi Sun , Yinchen Ni , Xin He , Jie Zhao , An Zou

SARA: A Stall-Aware Memory Allocation Strategy for Mixed-Criticality Systems

The memory capacity in edge devices is often limited due to constraints on cost, size, and power. Consequently, memory competition leads to inevitable page swapping in memory-constrained mixed-criticality edge devices, causing slow storage…

Operating Systems · Computer Science 2025-11-26 Meng-Chia Lee , Wen Sheng Lim , Yuan-Hao Chang , Tei-Wei Kuo

Probabilistic Scheduling of Dynamic I/O Requests via Application Clustering for Burst-Buffer Equipped HPC

Burst-Buffering is a promising storage solution that introduces an intermediate high-throughput storage buffer layer to mitigate the I/O bottleneck problem that the current High-Performance Computing (HPC) platforms suffer. The existing…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-10-17 Benbo Zha , Hong Shen

DSA: Decentralized Double Stochastic Averaging Gradient Algorithm

This paper considers convex optimization problems where nodes of a network have access to summands of a global objective. Each of these local objectives is further assumed to be an average of a finite set of functions. The motivation for…

Optimization and Control · Mathematics 2015-06-16 Aryan Mokhtari , Alejandro Ribeiro

PIM-malloc: A Fast and Scalable Dynamic Memory Allocator for Processing-In-Memory (PIM) Architectures

The ability to dynamically allocate memory is fundamental in modern programming languages. However, this feature is not adequately supported in current general-purpose PIM devices. To identify key design principles that PIM must consider,…

Hardware Architecture · Computer Science 2026-01-28 Dongjae Lee , Bongjoon Hyun , Youngjin Kwon , Minsoo Rhu

BitStopper: An Efficient Transformer Attention Accelerator via Stage-fusion and Early Termination

Attention-based large language models (LLMs) have transformed modern AI applications, but the quadratic cost of self-attention imposes significant compute and memory overhead. Dynamic sparsity (DS) attention mitigates this, yet its hardware…

Machine Learning · Computer Science 2025-12-09 Huizheng Wang , Hongbin Wang , Shaojun Wei , Yang Hu , Shouyi Yin

The Unexpected Efficiency of Bin Packing Algorithms for Dynamic Storage Allocation in the Wild: An Intellectual Abstract

Recent work has shown that viewing allocators as black-box 2DBP solvers bears meaning. For instance, there exists a 2DBP-based fragmentation metric which often correlates monotonically with maximum resident set size (RSS). Given the field's…

Programming Languages · Computer Science 2023-05-03 Christos P. Lamprakos , Sotirios Xydis , Francky Catthoor , Dimitrios Soudris

Memory-Efficient FPGA Implementation of Stochastic Simulated Annealing

Simulated annealing (SA) is a well-known algorithm for solving combinatorial optimization problems. However, the computation time of SA increases rapidly, as the size of the problem grows. Recently, a stochastic simulated annealing (SSA)…

Hardware Architecture · Computer Science 2026-01-27 Duckgyu Shin , Naoya Onizawa , Warren J. Gross , Takahiro Hanyu

Dynamic Buffers: Cost-Efficient Planning for Tabletop Rearrangement with Stacking

Rearranging objects in cluttered tabletop environments remains a long-standing challenge in robotics. Classical planners often generate inefficient, high-cost plans by shuffling objects individually and using fixed buffers--temporary spaces…

Robotics · Computer Science 2025-09-30 Arman Barghi , Hamed Hosseini , Seraj Ghasemi , Mehdi Tale Masouleh , Ahmad Kalhor

FPSA: A Full System Stack Solution for Reconfigurable ReRAM-based NN Accelerator Architecture

Neural Network (NN) accelerators with emerging ReRAM (resistive random access memory) technologies have been investigated as one of the promising solutions to address the "memory wall" challenge, due to the unique capability of…

Emerging Technologies · Computer Science 2019-01-30 Yu Ji , Youyang Zhang , Xinfeng Xie , Shuangchen Li , Peiqi Wang , Xing Hu , Youhui Zhang , Yuan Xie

Dynamic Self-Attention: Computing Attention over Words Dynamically for Sentence Embedding

In this paper, we propose Dynamic Self-Attention (DSA), a new self-attention mechanism for sentence embedding. We design DSA by modifying dynamic routing in capsule network (Sabour et al., 2017) for natural language processing. DSA attends to…

Machine Learning · Computer Science 2018-08-23 Deunsol Yoon , Dongbok Lee , SangKeun Lee

Fast, Multicore-Scalable, Low-Fragmentation Memory Allocation through Large Virtual Memory and Global Data Structures

We demonstrate that general-purpose memory allocation involving many threads on many cores can be done with high performance, multicore scalability, and low memory consumption. For this purpose, we have designed and implemented scalloc, a…

Programming Languages · Computer Science 2015-08-26 Martin Aigner , Christoph M. Kirsch , Michael Lippautz , Ana Sokolova

Offloading Artificial Intelligence Workloads across the Computing Continuum by means of Active Storage Systems

The increasing demand for artificial intelligence (AI) workloads across diverse computing environments has driven the need for more efficient data management strategies. Traditional cloud-based architectures struggle to handle the sheer…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-12-03 Alex Barceló , Sebastián A. Cajas Ordoñez , Jaydeep Samanta , Andrés L. Suárez-Cetrulo , Romila Ghosh , Ricardo Simón Carbajo , Anna Queralt

DSA: More Efficient Budgeted Pruning via Differentiable Sparsity Allocation

Budgeted pruning is the problem of pruning under resource constraints. In budgeted pruning, how to distribute the resources across layers (i.e., sparsity allocation) is the key problem. Traditional methods solve it by discretely searching…

Computer Vision and Pattern Recognition · Computer Science 2020-07-21 Xuefei Ning , Tianchen Zhao , Wenshuo Li , Peng Lei , Yu Wang , Huazhong Yang

Optimizing Access Mechanisms for QoS Provisioning in Hardware Constrained Dynamic Spectrum Access

One of the major challenges in Dynamic Spectrum Access (DSA) systems is to guarantee a required level of Quality of Service (QoS) to secondary users of the spectrum. In this paper, we propose efficient algorithms for deriving optimal…

Information Theory · Computer Science 2016-07-08 Spyridon Vassilaras , George C. Alexandropoulos