Related papers: A Linear Dynamical System Model for Text

Filtering Beats Fine Tuning: A Bayesian Kalman View of In Context Learning in LLMs

We present a theory-first framework that interprets inference-time adaptation in large language models (LLMs) as online Bayesian state estimation. Rather than modeling rapid adaptation as implicit optimization or meta-learning, we formulate…

Machine Learning · Computer Science 2026-01-13 Andrew Kiruluta

Dynamic Word Embeddings

We present a probabilistic language model for time-stamped text data which tracks the semantic evolution of individual words over time. The model represents words and contexts by latent trajectories in an embedding space. At each moment in…

Machine Learning · Statistics 2017-07-19 Robert Bamler, Stephan Mandt

Recurrent Kalman Networks: Factorized Inference in High-Dimensional Deep Feature Spaces

In order to integrate uncertainty estimates into deep time-series modelling, Kalman Filters (KFs) (Kalman et al., 1960) have been integrated with deep learning models; however, such approaches typically rely on approximate inference…

Machine Learning · Computer Science 2019-05-20 Philipp Becker, Harit Pandya, Gregor Gebhardt, Cheng Zhao, James Taylor, Gerhard Neumann

Consistent Alignment of Word Embedding Models

Word embedding models offer continuous vector representations that can capture rich contextual semantics based on their word co-occurrence patterns. While these word vectors can provide very effective features used in many NLP tasks such as…

Computation and Language · Computer Science 2017-02-27 Cem Safak Sahin, Rajmonda S. Caceres, Brandon Oselio, William M. Campbell

Deep Dictionary-Free Method for Identifying Linear Model of Nonlinear System with Input Delay

Nonlinear dynamical systems with input delays pose significant challenges for prediction, estimation, and control due to their inherent complexity and the impact of delays on system behavior. Traditional linear control techniques often fail…

Systems and Control · Electrical Eng. & Systems 2025-11-07 Patrik Valábek, Marek Wadinger, Michal Kvasnica, Martin Klaučo

Uncertainty Representations in State-Space Layers for Deep Reinforcement Learning under Partial Observability

Optimal decision-making under partial observability requires reasoning about the uncertainty of the environment's hidden state. However, most reinforcement learning architectures handle partial observability with sequence models that have…

Machine Learning · Computer Science 2025-02-20 Carlos E. Luis, Alessandro G. Bottero, Julia Vinogradska, Felix Berkenkamp, Jan Peters

Transformers as Implicit State Estimators: In-Context Learning in Dynamical Systems

Predicting the behavior of a dynamical system from noisy observations of its past outputs is a classical problem encountered across engineering and science. For linear systems with Gaussian inputs, the Kalman filter -- the best linear…

Machine Learning · Computer Science 2026-03-10 Usman Akram, Haris Vikalo

Inference on high-dimensional implicit dynamic models using a guided intermediate resampling filter

We propose a method for inference on moderately high-dimensional, nonlinear, non-Gaussian, partially observed Markov process models for which the transition density is not analytically tractable. Markov processes with intractable transition…

Methodology · Statistics 2020-04-02 Joonha Park, Edward L. Ionides

On-Line Learning of Linear Dynamical Systems: Exponential Forgetting in Kalman Filters

Kalman filter is a key tool for time-series forecasting and analysis. We show that the dependence of a prediction of Kalman filter on the past is decaying exponentially, whenever the process noise is non-degenerate. Therefore, Kalman filter…

Statistics Theory · Mathematics 2019-09-24 Mark Kozdoba, Jakub Marecek, Tigran Tchrakian, Shie Mannor

Probabilistic Transformer: A Probabilistic Dependency Model for Contextual Word Representation

Syntactic structures used to play a vital role in natural language processing (NLP), but since the deep learning revolution, NLP has been gradually dominated by neural models that do not consider syntactic structures in their design. One…

Computation and Language · Computer Science 2023-11-28 Haoyi Wu, Kewei Tu

In-Context Iterative Policy Improvement for Dynamic Manipulation

Attention-based architectures trained on internet-scale language data have demonstrated state of the art reasoning ability for various language-based tasks, such as logic problems and textual reasoning. Additionally, these Large Language…

Robotics · Computer Science 2025-08-22 Mark Van der Merwe, Devesh Jha

Dynamic latent space relational event model

Dynamic relational processes, such as e-mail exchanges, bank loans and scientific citations, are important examples of dynamic networks, in which the relational events constitute time-stamped edges. There are contexts where the network…

Computation · Statistics 2023-04-24 Igor Artico, Ernst C. Wit

Linear dynamical neural population models through nonlinear embeddings

A body of recent work in modeling neural activity focuses on recovering low-dimensional latent features that capture the statistical structure of large-scale neural populations. Most such approaches have focused on linear generative models,…

Neurons and Cognition · Quantitative Biology 2016-10-26 Yuanjun Gao, Evan Archer, Liam Paninski, John P. Cunningham

An efficient framework for learning sentence representations

In this work we propose a simple and efficient framework for learning sentence representations from unlabelled data. Drawing inspiration from the distributional hypothesis and recent work on learning sentence representations, we reformulate…

Computation and Language · Computer Science 2018-03-09 Lajanugen Logeswaran, Honglak Lee

Compressing Neural Language Models by Sparse Word Representations

Neural networks are among the state-of-the-art techniques for language modeling. Existing neural language models typically map discrete words to distributed, dense vector representations. After information processing of the preceding…

Computation and Language · Computer Science 2016-10-14 Yunchuan Chen, Lili Mou, Yan Xu, Ge Li, Zhi Jin

Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning

In recent years, pre-trained large language models (LLMs) have demonstrated remarkable efficiency in achieving an inference-time few-shot learning capability known as in-context learning. However, existing literature has highlighted the…

Computation and Language · Computer Science 2024-02-14 Xinyi Wang, Wanrong Zhu, Michael Saxon, Mark Steyvers, William Yang Wang

A Novel Neural Filter to Improve Accuracy of Neural Network Models of Dynamic Systems

The application of neural networks in modeling dynamic systems has become prominent due to their ability to estimate complex nonlinear functions. Despite their effectiveness, neural networks face challenges in long-term predictions, where…

Machine Learning · Computer Science 2025-06-10 Parham Oveissi, Turibius Rozario, Ankit Goel

Learning Molecular Dynamics with Simple Language Model built upon Long Short-Term Memory Neural Network

Recurrent neural networks (RNNs) have led to breakthroughs in natural language processing and speech recognition, wherein hundreds of millions of people use such tools on a daily basis through smartphones, email servers and other avenues.…

Disordered Systems and Neural Networks · Physics 2020-12-02 Sun-Ting Tsai, En-Jui Kuo, Pratyush Tiwary

Dynamic Meta-Embeddings for Improved Sentence Representations

While one of the first steps in many NLP systems is selecting what pre-trained word embeddings to use, we argue that such a step is better left for neural networks to figure out by themselves. To that end, we introduce dynamic…

Computation and Language · Computer Science 2018-09-06 Douwe Kiela, Changhan Wang, Kyunghyun Cho

Predefined Sparseness in Recurrent Sequence Models

Inducing sparseness while training neural networks has been shown to yield models with a lower memory footprint but similar effectiveness to dense models. However, sparseness is typically induced starting from a dense model, and thus this…

Machine Learning · Computer Science 2022-03-30 Thomas Demeester, Johannes Deleu, Fréderic Godin, Chris Develder