Related papers: Continuity Laws for Sequential Models

Block-Biased Mamba for Long-Range Sequence Processing

Mamba extends earlier state space models (SSMs) by introducing input-dependent dynamics, and has demonstrated strong empirical performance across a range of domains, including language modeling, computer vision, and foundation models.…

Machine Learning · Computer Science 2025-05-15 Annan Yu, N. Benjamin Erichson

Demystifying the Token Dynamics of Deep Selective State Space Models

Selective state space models (SSM), such as Mamba, have gained prominence for their effectiveness in modeling sequential data. Despite their outstanding empirical performance, a comprehensive theoretical understanding of deep selective SSM…

Machine Learning · Computer Science 2025-03-10 Thieu N Vo, Tung D. Pham, Xin T. Tong, Tan Minh Nguyen

Decision Mamba: Reinforcement Learning via Sequence Modeling with Selective State Spaces

Decision Transformer, a promising approach that applies Transformer architectures to reinforcement learning, relies on causal self-attention to model sequences of states, actions, and rewards. While this method has shown competitive…

Machine Learning · Computer Science 2024-04-01 Toshihiro Ota

Temporal Task Diversity: Inductive Biases Under Non-Stationarity in Synthetic Sequence Modelling

Modern deep learning science often assumes that neural networks learn from a fixed data distribution. However, many practically important learning problems involve data distributions that change throughout training. How does such…

Machine Learning · Computer Science 2026-05-19 Afiq Abdillah Effiezal Aswadi, Oliver Britton, Ross Baker, Matthew Farrugia-Roberts

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Foundation models, now powering most of the exciting applications in deep learning, are almost universally based on the Transformer architecture and its core attention module. Many subquadratic-time architectures such as linear attention,…

Machine Learning · Computer Science 2024-06-03 Albert Gu, Tri Dao

Mamba4Rec: Towards Efficient Sequential Recommendation with Selective State Space Models

Sequential recommendation aims to estimate the dynamic user preferences and sequential dependencies among historical user behaviors. Although Transformer-based models have proven to be effective for sequential recommendation, they suffer…

Information Retrieval · Computer Science 2024-07-02 Chengkai Liu, Jianghao Lin, Jianling Wang, Hanzhou Liu, James Caverlee

Model-Based Reinforcement Learning for Control under Time-Varying Dynamics

Learning-based control methods typically assume stationary system dynamics, an assumption often violated in real-world systems due to drift, wear, or changing operating conditions. We study reinforcement learning for control under…

Machine Learning · Computer Science 2026-04-03 Klemens Iten, Bruce Lee, Chenhao Li, Lenart Treven, Andreas Krause, Bhavya Sukhija

Emergence of Primacy and Recency Effect in Mamba: A Mechanistic Point of View

We study memory in state-space language models using primacy and recency effects as behavioral tools to uncover how information is retained and forgotten over time. Applying structured recall tasks to the Mamba architecture, we observe a…

Computation and Language · Computer Science 2025-06-19 Muhammad Cendekia Airlangga, Hilal AlQuabeh, Munachiso S Nwadike, Kentaro Inui

A Continuous-time Tractable Model for Present-biased Agents

Present bias, the tendency to overvalue immediate rewards while undervaluing future ones, is a well-known barrier to achieving long-term goals. As artificial intelligence and behavioral economics increasingly focus on this phenomenon, the…

Computer Science and Game Theory · Computer Science 2024-09-18 Yasunori Akagi, Hideaki Kim, Takeshi Kurashima

Aligning Inductive Bias for Data-Efficient Generalization in State Space Models

The remarkable success of modern AI has been closely tied to scaling laws, yet the finite supply of high-quality data makes data efficiency--learning more from less--an increasingly important frontier. A model's inductive bias is a critical…

Machine Learning · Computer Science 2026-05-06 Qiyu Chen, Guozhang Chen

Learning Human Motion with Temporally Conditional Mamba

Learning human motion based on a time-dependent input signal presents a challenging yet impactful task with various applications. The goal of this task is to generate or estimate human movement that consistently reflects the temporal…

Computer Vision and Pattern Recognition · Computer Science 2025-10-15 Quang Nguyen, Tri Le, Baoru Huang, Minh Nhat Vu, Ngan Le, Thieu Vo, Anh Nguyen

Inductive biases and Self Supervised Learning in modelling a physical heating system

Model Predictive Controllers (MPC) require a good model for the controlled process. In this paper I infer inductive biases about a physical system. I use these biases to derive a new neural network architecture that can model this real…

Machine Learning · Computer Science 2021-04-26 Cristian Vicas

HMamba: Hyperbolic Mamba for Sequential Recommendation

Sequential recommendation systems have become a cornerstone of personalized services, adept at modeling the temporal evolution of user preferences by capturing dynamic interaction sequences. Existing approaches predominantly rely on…

Information Retrieval · Computer Science 2025-05-15 Qianru Zhang, Honggang Wen, Wei Yuan, Crystal Chen, Menglin Yang, Siu-Ming Yiu, Hongzhi Yin

TIDES: Implicit Time-Awareness in Selective State Space Models

Selective state space models (SSMs), such as Mamba, achieve strong per-token expressivity by making the time discretization step $\tilde{\Delta}$ a learned function of the input. However, in doing so, $\tilde{\Delta}$ ceases to represent a…

Machine Learning · Computer Science 2026-05-12 Taylan Soydan, Miguel A. Bessa, Dirk Mohr, Rui Barreira

Time-Varying Priority Queuing Models for Human Dynamics

Queuing models provide insight into the temporal inhomogeneity of human dynamics, characterized by the broad distribution of waiting times of individuals performing tasks. We study the queuing model of an agent trying to execute a task of…

Physics and Society · Physics 2012-06-05 Hang-Hyun Jo, Raj Kumar Pan, Kimmo Kaski

Mamba Modulation: On the Length Generalization of Mamba

The quadratic complexity of the attention mechanism in Transformer models has motivated the development of alternative architectures with sub-quadratic scaling, such as state-space models. Among these, Mamba has emerged as a leading…

Machine Learning · Computer Science 2025-12-16 Peng Lu, Jerry Huang, Qiuhao Zeng, Xinyu Wang, Boxing Chen, Philippe Langlais, Yufei Cui

Uncovering Selective State Space Model's Capabilities in Lifelong Sequential Recommendation

Sequential Recommenders have been widely applied in various online services, aiming to model users' dynamic interests from their sequential interactions. With users increasingly engaging with online platforms, vast amounts of lifelong user…

Information Retrieval · Computer Science 2024-03-26 Jiyuan Yang, Yuanzi Li, Jingyu Zhao, Hanbing Wang, Muyang Ma, Jun Ma, Zhaochun Ren, Mengqi Zhang, Xin Xin, Zhumin Chen, Pengjie Ren

Numerical Investigation of Sequence Modeling Theory using Controllable Memory Functions

The evolution of sequence modeling architectures, from recurrent neural networks and convolutional models to Transformers and structured state-space models, reflects ongoing efforts to address the diverse temporal dependencies inherent in…

Machine Learning · Computer Science 2025-06-10 Haotian Jiang, Zeyu Bao, Shida Wang, Qianxiao Li

Understanding the Implicit Biases of Design Choices for Time Series Foundation Models

Time series foundation models (TSFMs) are a class of potentially powerful, general-purpose tools for time series forecasting and related temporal tasks, but their behavior is strongly shaped by subtle inductive biases in their design.…

Machine Learning · Computer Science 2025-10-23 Annan Yu, Danielle C. Maddix, Boran Han, Xiyuan Zhang, Abdul Fatir Ansari, Oleksandr Shchur, Christos Faloutsos, Andrew Gordon Wilson, Michael W. Mahoney, Yuyang Wang

EchoMamba4Rec: Harmonizing Bidirectional State Space Models with Spectral Filtering for Advanced Sequential Recommendation

Predicting user preferences and sequential dependencies based on historical behavior is the core goal of sequential recommendation. Although attention-based models have shown effectiveness in this field, they often struggle with inference…

Machine Learning · Computer Science 2024-06-11 Yuda Wang, Xuxin He, Shengxin Zhu