Related papers: Sample Complexity and Overparameterization Bounds …

200 papers

Temporal difference (TD) learning algorithms with neural network function parameterization have well-established empirical success in many practical large-scale reinforcement learning tasks. However, theoretical understanding of these…

Machine Learning · Computer Science · 2024-05-08 · Zhifa Ke, Zaiwen Wen, Junyu Zhang

Neural Temporal Difference (TD) Learning is an approximate temporal difference method for policy evaluation that uses a neural network for function approximation. Analysis of Neural TD Learning has proven to be challenging. In this paper we…

Machine Learning · Computer Science · 2023-12-12 · Haoxing Tian, Ioannis Ch. Paschalidis, Alex Olshevsky

We study the finite-time behaviour of the popular temporal difference (TD) learning algorithm when combined with tail-averaging. We derive finite time bounds on the parameter error of the tail-averaged TD iterate under a step-size choice…

Machine Learning · Computer Science · 2024-09-20 · Gandharv Patil, Prashanth L. A., Dheeraj Nagaraj, Doina Precup

In this paper we consider the problem of obtaining sharp bounds for the performance of temporal difference (TD) methods with linear function approximation for policy evaluation in discounted Markov decision processes. We show that a simple…

Machine Learning · Statistics · 2024-06-18 · Sergey Samsonov, Daniil Tiapkin, Alexey Naumov, Eric Moulines

Recently, optimal time variable learning in deep neural networks (DNNs) was introduced in arXiv:2204.08528. In this manuscript we extend the concept by introducing a regularization term that directly relates to the time horizon in discrete…

Machine Learning · Computer Science · 2023-12-07 · Evelyn Herberg, Roland Herzog, Frederik Köhne

We discuss the approximation of the value function for infinite-horizon discounted Markov Reward Processes (MRP) with nonlinear functions trained with the Temporal-Difference (TD) learning algorithm. We first consider this problem under a…

Machine Learning · Computer Science · 2024-02-05 · Andrea Agazzi, Jianfeng Lu

Temporal-difference learning (TD), coupled with neural networks, is among the most fundamental building blocks of deep reinforcement learning. However, due to the nonlinearity in value function approximation, such a coupling leads to…

Machine Learning · Computer Science · 2020-04-16 · Qi Cai, Zhuoran Yang, Jason D. Lee, Zhaoran Wang

In this work, we describe a new approach that uses deep neural networks (DNN) to obtain regularization parameters for solving inverse problems. We consider a supervised learning approach, where a network is trained to approximate the…

Numerical Analysis · Mathematics · 2021-04-15 · Babak Maboudi Afkham, Julianne Chung, Matthias Chung

Temporal difference learning (TD) is a simple iterative algorithm used to estimate the value function corresponding to a given policy in a Markov decision process. Although TD is one of the most widely used algorithms in reinforcement…

Machine Learning · Computer Science · 2018-11-07 · Jalaj Bhandari, Daniel Russo, Raghav Singal

In this paper, we study the finite-sample statistical rates of distributional temporal difference (TD) learning with linear function approximation. The aim of distributional TD learning is to estimate the return distribution of a discounted…

Machine Learning · Statistics · 2025-05-14 · Yang Peng, Kaicheng Jin, Liangyu Zhang, Zhihua Zhang

The effectiveness of non-parametric, kernel-based methods for function estimation comes at the price of high computational complexity, which hinders their applicability in adaptive, model-based control. Motivated by approximation techniques…

Statistics Theory · Mathematics · 2023-03-17 · Anna Scampicchio, Elena Arcari, Melanie N. Zeilinger

Temporal difference (TD) learning is a fundamental algorithm for estimating value functions in reinforcement learning. Recent finite-time analyses of TD with linear function approximation quantify its theoretical convergence rate. However,…

Machine Learning · Computer Science · 2026-03-04 · Yunxiang Li, Mark Schmidt, Reza Babanezhad, Sharan Vaswani

In this paper, we study the Temporal Difference (TD) learning with linear value function approximation. It is well known that most TD learning algorithms are unstable with linear function approximation and off-policy learning. Recent…

Artificial Intelligence · Computer Science · 2016-10-06 · Dominik Meyer, Hao Shen, Klaus Diepold

We investigate the statistical properties of Temporal Difference (TD) learning with Polyak-Ruppert averaging, arguably one of the most widely used algorithms in reinforcement learning, for the task of estimating the parameters of the…

Machine Learning · Statistics · 2026-02-25 · Weichen Wu, Gen Li, Yuting Wei, Alessandro Rinaldo

Relative temporal-difference (TD) learning was introduced to mitigate the slow convergence of TD methods when the discount factor approaches one by subtracting a baseline from the temporal-difference update. While this idea has been studied…

Machine Learning · Computer Science · 2026-04-08 · Masoud S. Sakha, Rushikesh Kamalapurkar, Sean Meyn

When fine-tuning Deep Neural Networks (DNNs) to new data, DNNs are prone to overwriting network parameters required for task-specific functionality on previously learned tasks, resulting in a loss of performance on those tasks. We propose…

Machine Learning · Computer Science · 2025-01-22 · Christopher Angelini, Nidhal Bouaynaya

Deep learning achieves remarkable generalization capability with an overwhelming number of model parameters. Theoretical understanding of deep learning generalization has received recent attention yet remains not fully explored. This paper…

Machine Learning · Computer Science · 2017-11-22 · Guanhua Zheng, Jitao Sang, Changsheng Xu

We study regularization in the context of small sample-size learning with over-parameterized neural networks. Specifically, we shift focus from architectural properties, such as norms on the network weights, to properties of the internal…

Machine Learning · Computer Science · 2021-05-18 · Christoph D. Hofer, Florian Graf, Marc Niethammer, Roland Kwitt

In this paper, we study the finite-sample statistical rates of distributional temporal difference (TD) learning with linear function approximation. The purpose of distributional TD learning is to estimate the return distribution of a…

Machine Learning · Statistics · 2025-11-18 · Kaicheng Jin, Yang Peng, Jiansheng Yang, Zhihua Zhang

This paper is concerned with the problem of policy evaluation with linear function approximation in discounted infinite horizon Markov decision processes. We investigate the sample complexities required to guarantee a predefined estimation…

Machine Learning · Statistics · 2024-05-03 · Gen Li, Weichen Wu, Yuejie Chi, Cong Ma, Alessandro Rinaldo, Yuting Wei