Related papers: Gauss-Newton Temporal Difference Learning with Non…

200 papers

First-order methods such as stochastic gradient descent (SGD) are currently the standard algorithm for training deep neural networks. Second-order methods, despite their better convergence rate, are rarely used in practice due to the…

Machine Learning · Computer Science 2019-09-26 Tianle Cai, Ruiqi Gao, Jikai Hou, Siyu Chen, Dong Wang, Di He, Zhihua Zhang, Liwei Wang

We devise a distributional variant of gradient temporal-difference (TD) learning. Distributional reinforcement learning has been demonstrated to outperform regular reinforcement learning in a recent study (Bellemare et al., 2017). In the…

Machine Learning · Computer Science 2019-04-04 Chao Qu, Shie Mannor, Huan Xu

Multi-step temporal-difference (TD) learning, where the update targets contain information from multiple time steps ahead, is one of the most popular forms of TD learning for linear function approximation. The reason is that multi-step…

Artificial Intelligence · Computer Science 2016-08-19 Harm van Seijen

Temporal difference (TD) learning algorithms with neural network function parameterization have well-established empirical success in many practical large-scale reinforcement learning tasks. However, theoretical understanding of these…

Machine Learning · Computer Science 2024-05-08 Zhifa Ke, Zaiwen Wen, Junyu Zhang

A q-Gauss-Newton algorithm is an iterative procedure that solves nonlinear unconstrained optimization problems by minimizing the sum of squared errors of the objective function residuals. The main advantage of the algorithm is that it…

Optimization and Control · Mathematics 2021-05-28 Danijela Protic, Miomir Stankovic

Temporal-difference learning (TD), coupled with neural networks, is among the most fundamental building blocks of deep reinforcement learning. However, due to the nonlinearity in value function approximation, such a coupling leads to…

Machine Learning · Computer Science 2020-04-16 Qi Cai, Zhuoran Yang, Jason D. Lee, Zhaoran Wang

The generalized Gauss-Newton (GGN) optimization method incorporates curvature estimates into its solution steps, and provides a good approximation to the Newton method for large-scale optimization problems. GGN has been found particularly…

Machine Learning · Computer Science 2024-04-24 Adeyemi D. Adeoye, Philipp Christian Petersen, Alberto Bemporad

Temporal-difference (TD) learning is highly effective at controlling and evaluating an agent's long-term outcomes. Most approaches in this paradigm implement a semi-gradient update to boost the learning speed, which consists of ignoring the…

Machine Learning · Computer Science 2026-05-15 Théo Vincent, Kevin Gerhardt, Yogesh Tripathi, Habib Maraqten, Adam White, Martha White, Jan Peters, Carlo D'Eramo

In this paper, we introduce proximal gradient temporal difference learning, which provides a principled way of designing and analyzing true stochastic gradient temporal difference learning algorithms. We show how gradient TD (GTD)…

Machine Learning · Computer Science 2020-06-09 Bo Liu, Ian Gemp, Mohammad Ghavamzadeh, Ji Liu, Sridhar Mahadevan, Marek Petrik

This work studies the global convergence and implicit bias of the Gauss-Newton (GN) method when optimizing over-parameterized one-hidden-layer networks in the mean-field regime. We first establish a global convergence result for GN in the…

Machine Learning · Computer Science 2023-12-13 Michael Arbel, Romain Menegaux, Pierre Wolinski

Gradient descent or its variants are popular in training neural networks. However, in deep Q-learning with neural network approximation, a type of reinforcement learning, gradient descent (also known as Residual Gradient (RG)) is barely…

Machine Learning · Computer Science 2022-11-15 Shuyu Yin, Tao Luo, Peilin Liu, Zhi-Qin John Xu

Neural Temporal Difference (TD) Learning is an approximate temporal difference method for policy evaluation that uses a neural network for function approximation. Analysis of Neural TD Learning has proven to be challenging. In this paper we…

Machine Learning · Computer Science 2023-12-12 Haoxing Tian, Ioannis Ch. Paschalidis, Alex Olshevsky

In this paper, we investigate the convergence performance of a cooperative diffusion Gauss-Newton (GN) method, which is widely used to solve nonlinear least squares (NLLS) problems due to its low computational cost compared with Newton's…

Optimization and Control · Mathematics 2019-03-06 Mou Wu, Naixue Xiong, Liansheng Tan

Temporal difference learning (TD) is a simple iterative algorithm used to estimate the value function corresponding to a given policy in a Markov decision process. Although TD is one of the most widely used algorithms in reinforcement…

Machine Learning · Computer Science 2018-11-07 Jalaj Bhandari, Daniel Russo, Raghav Singal

Following early work on Hessian-free methods for deep learning, we study a stochastic generalized Gauss-Newton method (SGN) for training DNNs. SGN is a second-order optimization method, with efficient iterations, that we demonstrate to…

Machine Learning · Computer Science 2020-06-11 Matilde Gargiani, Andrea Zanelli, Moritz Diehl, Frank Hutter

Though quasi-Newton methods have been extensively studied in the literature, they either suffer from local convergence or use a series of line searches for global convergence which is not acceptable in the distributed setting. In this work,…

Optimization and Control · Mathematics 2023-12-01 Yubo Du, Keyou You

Optimization techniques in deep learning are predominantly led by first-order gradient methodologies, such as SGD. However, neural network training can greatly benefit from the rapid convergence characteristics of second-order optimization.…

Quantum Physics · Physics 2025-04-30 Pingzhi Li, Junyu Liu, Hanrui Wang, Tianlong Chen

Stochastic gradient updates are widely used for their efficiency and scalability, but their effective step sizes can depend strongly on feature scaling and local model sensitivity. Gauss-Newton methods address such scale effects through…

Machine Learning · Computer Science 2026-05-27 Mikalai Korbit, Mario Zanon

We consider off-policy temporal-difference (TD) learning methods for policy evaluation in Markov decision processes with finite spaces and discounted reward criteria, and we present a collection of convergence results for several…

Machine Learning · Computer Science 2018-03-30 Huizhen Yu

In this paper, we study Temporal Difference (TD) learning with linear value function approximation. It is well known that most TD learning algorithms are unstable with linear function approximation and off-policy learning. Recent…

Artificial Intelligence · Computer Science 2016-10-06 Dominik Meyer, Hao Shen, Klaus Diepold