Related papers: Proximal Algorithms and Temporal Differences for L…

200 papers

In this paper, we introduce proximal gradient temporal difference learning, which provides a principled way of designing and analyzing true stochastic gradient temporal difference learning algorithms. We show how gradient TD (GTD)…

Machine Learning · Computer Science 2020-06-09 Bo Liu, Ian Gemp, Mohammad Ghavamzadeh, Ji Liu, Sridhar Mahadevan, Marek Petrik

Temporal difference (TD) learning is a foundational algorithm in reinforcement learning (RL). For nearly forty years, TD learning has served as a workhorse for applied RL as well as a building block for more complex and specialized…

Machine Learning · Computer Science 2025-06-24 Hwanwoo Kim, Panos Toulis, Eric Laber

Several problems in modeling and control of stochastically-driven dynamical systems can be cast as regularized semi-definite programs. We examine two such representative problems and show that they can be formulated in a similar manner. The…

Optimization and Control · Mathematics 2019-12-30 Armin Zare, Hesameddin Mohammadi, Neil K. Dhingra, Tryphon T. Georgiou, Mihailo R. Jovanović

Multi-step temporal-difference (TD) learning, where the update targets contain information from multiple time steps ahead, is one of the most popular forms of TD learning for linear function approximation. The reason is that multi-step…

Artificial Intelligence · Computer Science 2016-08-19 Harm van Seijen

In this paper we develop proximal methods for statistical learning. Proximal point algorithms are useful in statistics and machine learning for obtaining optimization solutions for composite functions. Our approach exploits closed-form…

Machine Learning · Statistics 2015-06-02 Nicholas G. Polson, James G. Scott, Brandon T. Willard

Many applications using large datasets require efficient methods for minimizing a proximable convex function subject to satisfying a set of linear constraints within a specified tolerance. For this task, we present a proximal projection…

Optimization and Control · Mathematics 2024-12-10 Howard Heaton

Temporal difference learning (TD) is a simple iterative algorithm used to estimate the value function corresponding to a given policy in a Markov decision process. Although TD is one of the most widely used algorithms in reinforcement…

Machine Learning · Computer Science 2018-11-07 Jalaj Bhandari, Daniel Russo, Raghav Singal

The latent variable proximal point (LVPP) algorithm is a framework for solving infinite-dimensional variational problems with pointwise inequality constraints. The algorithm is a saddle point reformulation of the Bregman proximal point…

Optimization and Control · Mathematics 2025-07-01 Jørgen S. Dokken, Patrick E. Farrell, Brendan Keith, Ioannis P. A. Papadopoulos, Thomas M. Surowiec

Gradient temporal difference (Gradient TD) algorithms are a popular class of stochastic approximation (SA) algorithms used for policy evaluation in reinforcement learning. Here, we consider Gradient TD algorithms with an additional heavy…

Machine Learning · Computer Science 2021-11-23 Rohan Deb, Shalabh Bhatnagar

In this paper, we set forth a new vision of reinforcement learning developed by us over the past few years, one that yields mathematically rigorous solutions to longstanding important questions that have remained unresolved: (i) how to…

Machine Learning · Computer Science 2014-05-28 Sridhar Mahadevan, Bo Liu, Philip Thomas, Will Dabney, Steve Giguere, Nicholas Jacek, Ian Gemp, Ji Liu

We analyze several generic proximal splitting algorithms well suited for large-scale convex nonsmooth optimization. We derive sublinear and linear convergence results with new rates on the function value suboptimality or distance to the…

Optimization and Control · Mathematics 2022-01-28 Laurent Condat, Grigory Malinovsky, Peter Richtárik

Large sectors of the recent optimization literature focused in the last decade on the development of optimal stochastic first order schemes for constrained convex models under progressively relaxed assumptions. Stochastic proximal point is…

Optimization and Control · Mathematics 2020-05-05 Andrei Patrascu

This paper considers the robust phase retrieval problem, which can be cast as a nonsmooth and nonconvex optimization problem. We propose a new inexact proximal linear algorithm with the subproblem being solved inexactly. Our contributions…

Optimization and Control · Mathematics 2024-02-12 Zhong Zheng, Shiqian Ma, Lingzhou Xue

In contrast with many other convex optimization classes, state-of-the-art semidefinite programming solvers are yet unable to efficiently solve large scale instances. This work aims to reduce this scalability gap by proposing a novel…

Optimization and Control · Mathematics 2018-12-20 Mario Souto, Joaquim D. Garcia, Alvaro Veiga

The proximal gradient method is a generic technique introduced to tackle the non-smoothness in optimization problems, wherein the objective function is expressed as the sum of a differentiable convex part and a non-differentiable…

Numerical Analysis · Mathematics 2024-01-19 Abdeslem Hafid Bentbib, Khalide Jbilou, Ridwane Tahiri

The proximal gradient algorithm for minimizing the sum of a smooth and a nonsmooth convex function often converges linearly even without strong convexity. One common reason is that a multiple of the step length at each iteration may…

Optimization and Control · Mathematics 2016-06-29 Dmitriy Drusvyatskiy, Adrian S. Lewis

Temporal difference (TD) learning algorithms with neural network function parameterization have well-established empirical success in many practical large-scale reinforcement learning tasks. However, theoretical understanding of these…

Machine Learning · Computer Science 2024-05-08 Zhifa Ke, Zaiwen Wen, Junyu Zhang

Proximal algorithms have gained popularity in recent years in large-scale and distributed optimization problems. One such problem is the phase retrieval problem, for which proximal operators have been proposed recently. The phase retrieval…

Optimization and Control · Mathematics 2018-08-16 Biel Roig-Solvas, Lee Makowski, Dana H. Brooks

Decentralized optimization is a powerful paradigm that finds applications in engineering and learning design. This work studies decentralized composite optimization problems with non-smooth regularization terms. Most existing gradient-based…

Optimization and Control · Mathematics 2019-10-29 Sulaiman A. Alghunaim, Kun Yuan, Ali H. Sayed

Temporal difference (TD) learning with linear function approximation (linear TD) is a classic and powerful prediction algorithm in reinforcement learning. While it is well-understood that linear TD converges almost surely to a unique point,…

Machine Learning · Computer Science 2026-03-25 Jiuqi Wang, Shangtong Zhang