Related papers: Predictive State Temporal Difference Learning

200 papers

TD($\lambda$) with function approximation has proved empirically successful for some complex reinforcement learning problems. For linear approximation, TD($\lambda$) has been shown to minimise the squared error between the approximate value…

Machine Learning · Computer Science 2025-12-24 Lex Weaver, Jonathan Baxter

Temporal difference learning (TD) is a foundational concept in reinforcement learning (RL), aimed at efficiently assessing a policy's value function. TD($\lambda$), a potent variant, incorporates a memory trace to distribute the prediction…

Machine Learning · Computer Science 2024-02-13 Jianfei Ma

Reinforcement learning (RL) with continuous time and state/action spaces is often data-intensive and brittle under nuisance variability and shift, motivating methods that exploit value-preserving structures to stabilize and improve…

Machine Learning · Computer Science 2026-05-08 Zuyuan Zhang, Fei Xu Yu, Tian Lan

Feature selection in reinforcement learning (RL), i.e. choosing basis functions such that useful approximations of the unknown value function can be obtained, is one of the main challenges in scaling RL to real-world applications. Here we…

Artificial Intelligence · Computer Science 2012-02-01 Tobias Jung, Peter Stone

Reinforcement learning (RL) algorithms allow artificial agents to improve their selection of actions to increase rewarding experiences in their environments. Temporal Difference (TD) Learning -- a model-free RL method -- is a leading…

Machine Learning · Computer Science 2019-09-05 Jacob Rafati, David C. Noelle

Reinforcement learning (RL) tackles sequential decision-making problems by creating agents that interact with their environment. However, existing algorithms often view these problems as static, focusing on point estimates for model…

Machine Learning · Statistics 2024-03-21 Frank Shih, Faming Liang

Deep reinforcement learning (RL) agents that exist in high-dimensional state spaces, such as those composed of images, have interconnected learning burdens. Agents must learn an action-selection policy that completes their given task, which…

Machine Learning · Computer Science 2021-10-12 Trevor McInroe, Lukas Schäfer, Stefano V. Albrecht

Using insight from numerical approximation of ODEs and the problem formulation and solution methodology of TD learning through a Galerkin relaxation, I propose a new class of TD learning algorithms. After applying the improved numerical…

Machine Learning · Computer Science 2021-04-21 Caleb Bowyer

With the decreasing cost of data collection, the space of variables or features that can be used to characterize a particular predictor of interest continues to grow exponentially. Therefore, identifying the most characterizing features…

Machine Learning · Computer Science 2021-01-26 Sali Rasoul, Sodiq Adewole, Alphonse Akakpo

Temporal difference learning (TD) is a simple iterative algorithm used to estimate the value function corresponding to a given policy in a Markov decision process. Although TD is one of the most widely used algorithms in reinforcement…

Machine Learning · Computer Science 2018-11-07 Jalaj Bhandari, Daniel Russo, Raghav Singal

In reinforcement learning (RL), state representations are key to dealing with large or continuous state spaces. While one of the promises of deep learning algorithms is to automatically construct features well-tuned for the task they try to…

Machine Learning · Computer Science 2023-06-21 Charline Le Lan, Stephen Tu, Mark Rowland, Anna Harutyunyan, Rishabh Agarwal, Marc G. Bellemare, Will Dabney

Temporal-Difference (TD) learning is a general and very useful tool for estimating the value function of a given policy, which in turn is required to find good policies. Generally speaking, TD learning updates states whenever they are…

Machine Learning · Computer Science 2021-08-24 Nishanth Anand, Doina Precup

Relative temporal-difference (TD) learning was introduced to mitigate the slow convergence of TD methods when the discount factor approaches one by subtracting a baseline from the temporal-difference update. While this idea has been studied…

Machine Learning · Computer Science 2026-04-08 Masoud S. Sakha, Rushikesh Kamalapurkar, Sean Meyn

Temporal difference (TD) learning is a foundational algorithm in reinforcement learning (RL). For nearly forty years, TD learning has served as a workhorse for applied RL as well as a building block for more complex and specialized…

Machine Learning · Computer Science 2025-06-24 Hwanwoo Kim, Panos Toulis, Eric Laber

Temporal difference (TD) learning is a fundamental technique in reinforcement learning that updates value estimates for states or state-action pairs using a TD target. This target represents an improved estimate of the true value by…

Machine Learning · Computer Science 2024-08-05 Wuhao Wang, Zhiyong Chen, Lepeng Zhang

Unsupervised skill discovery in reinforcement learning (RL) aims to learn diverse behaviors without relying on external rewards. However, current methods often overlook the periodic nature of learned skills, focusing instead on increasing…

Machine Learning · Computer Science 2025-12-01 Jonghae Park, Daesol Cho, Jusuk Lee, Dongseok Shim, Inkyu Jang, H. Jin Kim

In large-scale distributed machine learning, recent works have studied the effects of compressing gradients in stochastic optimization to alleviate the communication bottleneck. These works have collectively revealed that stochastic…

Machine Learning · Computer Science 2024-06-05 Aritra Mitra, George J. Pappas, Hamed Hassani

We explore fixed-horizon temporal difference (TD) methods, reinforcement learning algorithms for a new kind of value function that predicts the sum of rewards over a $\textit{fixed}$ number of future time steps. To learn the value function…

Machine Learning · Computer Science 2020-02-12 Kristopher De Asis, Alan Chan, Silviu Pitis, Richard S. Sutton, Daniel Graves

Value function approximation is a crucial module for policy evaluation in reinforcement learning when the state space is large or continuous. The present paper takes a generative perspective on policy evaluation via temporal-difference (TD)…

Machine Learning · Statistics 2021-12-03 Qin Lu, Georgios B. Giannakis

The effectiveness of Reinforcement Learning (RL) depends on an animal's ability to assign credit for rewards to the appropriate preceding stimuli. One aspect of understanding the neural underpinnings of this process involves understanding…

Neurons and Cognition · Quantitative Biology 2019-10-08 Jesse P. Geerts, Kimberly L. Stachenfeld, Neil Burgess