Related papers: On Transformations of Load-Store Maurer Instructio…

On the operating unit size of load/store architectures

We introduce a strict version of the concept of a load/store instruction set architecture in the setting of Maurer machines. We take the view that transformations on the states of a Maurer machine are achieved by applying threads as…

Hardware Architecture · Computer Science 2010-05-12 J. A. Bergstra, C. A. Middelburg

Data Scaling Laws in NMT: The Effect of Noise and Architecture

In this work, we study the effect of varying the architecture and training data quality on the data scaling properties of Neural Machine Translation (NMT). First, we establish that the test loss of encoder-decoder transformer models scales…

Machine Learning · Computer Science 2022-02-07 Yamini Bansal, Behrooz Ghorbani, Ankush Garg, Biao Zhang, Maxim Krikun, Colin Cherry, Behnam Neyshabur, Orhan Firat

Capacity Matters: a Proof-of-Concept for Transformer Memorization on Real-World Data

This paper studies how the model architecture and data configurations influence the empirical memorization capacity of generative transformers. The models are trained using synthetic text datasets derived from the Systematized Nomenclature…

Computation and Language · Computer Science 2025-06-18 Anton Changalidis, Aki Härmä

A high-level operational semantics for hardware weak memory models

Modern processors deploy a variety of weak memory models, which for efficiency reasons may execute instructions in an order different to that specified by the program text. The consequences of instruction reordering can be complex and…

Logic in Computer Science · Computer Science 2018-12-05 Robert J. Colvin, Graeme Smith

Transfer Learning for Performance Modeling of Configurable Systems: An Exploratory Analysis

Modern software systems provide many configuration options which significantly influence their non-functional properties. To understand and predict the effect of configuration options, several sampling and learning strategies have been…

Machine Learning · Statistics 2017-09-08 Pooyan Jamshidi, Norbert Siegmund, Miguel Velez, Christian Kästner, Akshay Patel, Yuvraj Agarwal

A Load-Buffer Semantics for Total Store Ordering

We address the problem of verifying safety properties of concurrent programs running over the Total Store Order (TSO) memory model. Known decision procedures for this model are based on complex encodings of store buffers as lossy channels.…

Formal Languages and Automata Theory · Computer Science 2023-06-22 Parosh Aziz Abdulla, Mohamed Faouzi Atig, Ahmed Bouajjani, Tuan Phong Ngo

Memory Allocation in Resource-Constrained Reinforcement Learning

Resource constraints can fundamentally change both learning and decision-making. We explore how memory constraints influence an agent's performance when navigating unknown environments using standard reinforcement learning algorithms.…

Machine Learning · Computer Science 2025-06-24 Massimiliano Tamborski, David Abel

Changing Model Behavior at Test-Time Using Reinforcement Learning

Machine learning models are often used at test-time subject to constraints and trade-offs not present at training-time. For example, a computer vision model operating on an embedded device may need to perform real-time inference, or a…

Machine Learning · Statistics 2017-02-28 Augustus Odena, Dieterich Lawson, Christopher Olah

Memory Determines Learning Direction: A Theory of Gradient-Based Optimization in State Space Models

State space models (SSMs) have gained attention by showing potential to outperform Transformers. However, previous studies have not sufficiently addressed the mechanisms underlying their high performance owing to a lack of theoretical…

Machine Learning · Computer Science 2025-10-02 JingChuan Guan, Tomoyuki Kubota, Yasuo Kuniyoshi, Kohei Nakajima

Rare switching events in non-stationary systems

Physical systems with many degrees of freedom can often be understood in terms of transitions between a small number of metastable states. For time-homogeneous systems with short-term memory these transitions are fully characterized by a…

Molecular Networks · Quantitative Biology 2015-06-03 Nils B. Becker, Pieter Rein ten Wolde

Learning by training: emergent return-point memory from cyclically tuning disordered sphere packings

Many living and artificial systems improve their fitness or performance by adapting to changing environments or diverse training data. However, it remains unclear how such environmental variation influences adaptation, what is learned in…

Computational Physics · Physics 2026-04-09 Mengjie Zu, Carl P. Goodrich

Storehouse: a Reinforcement Learning Environment for Optimizing Warehouse Management

Warehouse Management Systems have been evolving and improving thanks to new Data Intelligence techniques. However, many current optimizations have been applied to specific cases or are in great need of manual interaction. Here is where…

Machine Learning · Computer Science 2022-07-22 Julen Cestero, Marco Quartulli, Alberto Maria Metelli, Marcello Restelli

Memory-Augmented Transformers: A Systematic Review from Neuroscience Principles to Enhanced Model Architectures

Memory is fundamental to intelligence, enabling learning, reasoning, and adaptability across biological and artificial systems. While Transformer architectures excel at sequence modeling, they face critical limitations in long-range context…

Machine Learning · Computer Science 2025-08-19 Parsa Omidi, Xingshuai Huang, Axel Laborieux, Bahareh Nikpour, Tianyu Shi, Armaghan Eshaghi

Birth of a Transformer: A Memory Viewpoint

Large language models based on transformers have achieved great empirical successes. However, as they are deployed more widely, there is a growing need to better understand their internal mechanisms in order to make them more reliable.…

Machine Learning · Statistics 2023-11-08 Alberto Bietti, Vivien Cabannes, Diane Bouchacourt, Herve Jegou, Leon Bottou

Dephasing and Metal-Insulator Transition

The metal-insulator transition (MIT) observed in two-dimensional (2D) systems is apparently contradictory to the well known scaling theory of localization. By investigating the conductance of disordered one-dimensional systems with a finite…

Strongly Correlated Electrons · Physics 2009-10-31 Junren Shi, X. C. Xie

How memory architecture affects learning in a simple POMDP: the two-hypothesis testing problem

Reinforcement learning is generally difficult for partially observable Markov decision processes (POMDPs), which occurs when the agent's observation is partial or noisy. To seek good performance in POMDPs, one strategy is to endow the agent…

Machine Learning · Computer Science 2021-11-19 Mario Geiger, Christophe Eloy, Matthieu Wyart

A wide-spectrum language for verification of programs on weak memory models

Modern processors deploy a variety of weak memory models, which for efficiency reasons may (appear to) execute instructions in an order different to that specified by the program text. The consequences of instruction reordering can be…

Programming Languages · Computer Science 2018-12-04 Robert J. Colvin, Graeme Smith

Transfer Learning for a Class of Cascade Dynamical Systems

This work considers the problem of transfer learning in the context of reinforcement learning. Specifically, we consider training a policy in a reduced order system and deploying it in the full state system. The motivation for this training…

Machine Learning · Computer Science 2024-10-10 Shima Rabiei, Sandipan Mishra, Santiago Paternain

Resource-Efficient Transformer Architecture: Optimizing Memory and Execution Time for Real-Time Applications

This paper describes a memory-efficient transformer model designed to reduce memory usage and execution time by substantial orders of magnitude while keeping performance near that of the original model.…

Machine Learning · Computer Science 2025-01-03 Krisvarish V, Priyadarshini T, K P Abhishek Sri Saai, Vaidehi Vijayakumar

Iterated Belief Revision Under Resource Constraints: Logic as Geometry

We propose a variant of iterated belief revision designed for settings with limited computational resources, such as mobile autonomous robots. The proposed memory architecture, called the universal memory architecture…

Artificial Intelligence · Computer Science 2018-12-21 Dan P. Guralnik, Daniel E. Koditschek