Related papers: Proximal Algorithms and Temporal Differences for L…

Proximal Gradient Temporal Difference Learning: Stable Reinforcement Learning with Polynomial Sample Complexity

In this paper, we introduce proximal gradient temporal difference learning, which provides a principled way of designing and analyzing true stochastic gradient temporal difference learning algorithms. We show how gradient TD (GTD)…

Machine Learning · Computer Science 2020-06-09 Bo Liu, Ian Gemp, Mohammad Ghavamzadeh, Ji Liu, Sridhar Mahadevan, Marek Petrik

Stabilizing Temporal Difference Learning via Implicit Stochastic Recursion

Temporal difference (TD) learning is a foundational algorithm in reinforcement learning (RL). For nearly forty years, TD learning has served as a workhorse for applied RL as well as a building block for more complex and specialized…

Machine Learning · Computer Science 2025-06-24 Hwanwoo Kim, Panos Toulis, Eric Laber

Proximal algorithms for large-scale statistical modeling and sensor/actuator selection

Several problems in modeling and control of stochastically-driven dynamical systems can be cast as regularized semi-definite programs. We examine two such representative problems and show that they can be formulated in a similar manner. The…

Optimization and Control · Mathematics 2019-12-30 Armin Zare, Hesameddin Mohammadi, Neil K. Dhingra, Tryphon T. Georgiou, Mihailo R. Jovanović

Effective Multi-step Temporal-Difference Learning for Non-Linear Function Approximation

Multi-step temporal-difference (TD) learning, where the update targets contain information from multiple time steps ahead, is one of the most popular forms of TD learning for linear function approximation. The reason is that multi-step…

Artificial Intelligence · Computer Science 2016-08-19 Harm van Seijen

Proximal Algorithms in Statistics and Machine Learning

In this paper we develop proximal methods for statistical learning. Proximal point algorithms are useful in statistics and machine learning for obtaining optimization solutions for composite functions. Our approach exploits closed-form…

Machine Learning · Statistics 2015-06-02 Nicholas G. Polson, James G. Scott, Brandon T. Willard

Proximal Projection Method for Stable Linearly Constrained Optimization

Many applications using large datasets require efficient methods for minimizing a proximable convex function subject to satisfying a set of linear constraints within a specified tolerance. For this task, we present a proximal projection…

Optimization and Control · Mathematics 2024-12-10 Howard Heaton

A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation

Temporal difference learning (TD) is a simple iterative algorithm used to estimate the value function corresponding to a given policy in a Markov decision process. Although TD is one of the most widely used algorithms in reinforcement…

Machine Learning · Computer Science 2018-11-07 Jalaj Bhandari, Daniel Russo, Raghav Singal

The latent variable proximal point algorithm for variational problems with inequality constraints

The latent variable proximal point (LVPP) algorithm is a framework for solving infinite-dimensional variational problems with pointwise inequality constraints. The algorithm is a saddle point reformulation of the Bregman proximal point…

Optimization and Control · Mathematics 2025-07-01 Jørgen S. Dokken, Patrick E. Farrell, Brendan Keith, Ioannis P. A. Papadopoulos, Thomas M. Surowiec

Gradient Temporal Difference with Momentum: Stability and Convergence

Gradient temporal difference (Gradient TD) algorithms are a popular class of stochastic approximation (SA) algorithms used for policy evaluation in reinforcement learning. Here, we consider Gradient TD algorithms with an additional heavy…

Machine Learning · Computer Science 2021-11-23 Rohan Deb, Shalabh Bhatnagar

Proximal Reinforcement Learning: A New Theory of Sequential Decision Making in Primal-Dual Spaces

In this paper, we set forth a new vision of reinforcement learning developed by us over the past few years, one that yields mathematically rigorous solutions to longstanding important questions that have remained unresolved: (i) how to…

Machine Learning · Computer Science 2014-05-28 Sridhar Mahadevan, Bo Liu, Philip Thomas, Will Dabney, Steve Giguere, Nicholas Jacek, Ian Gemp, Ji Liu

Distributed Proximal Splitting Algorithms with Rates and Acceleration

We analyze several generic proximal splitting algorithms well suited for large-scale convex nonsmooth optimization. We derive sublinear and linear convergence results with new rates on the function value suboptimality or distance to the…

Optimization and Control · Mathematics 2022-01-28 Laurent Condat, Grigory Malinovsky, Peter Richtárik

New nonasymptotic convergence rates of stochastic proximal point algorithm for convex optimization problems

Large sectors of the recent optimization literature focused in the last decade on the development of optimal stochastic first order schemes for constrained convex models under progressively relaxed assumptions. Stochastic proximal point is…

Optimization and Control · Mathematics 2020-05-05 Andrei Patrascu

A New Inexact Proximal Linear Algorithm with Adaptive Stopping Criteria for Robust Phase Retrieval

This paper considers the robust phase retrieval problem, which can be cast as a nonsmooth and nonconvex optimization problem. We propose a new inexact proximal linear algorithm with the subproblem being solved inexactly. Our contributions…

Optimization and Control · Mathematics 2024-02-12 Zhong Zheng, Shiqian Ma, Lingzhou Xue

Exploiting Low-Rank Structure in Semidefinite Programming by Approximate Operator Splitting

In contrast with many other convex optimization classes, state-of-the-art semidefinite programming solvers are yet unable to efficiently solve large scale instances. This work aims to reduce this scalability gap by proposing a novel…

Optimization and Control · Mathematics 2018-12-20 Mario Souto, Joaquim D. Garcia, Alvaro Veiga

Multidimensional extrapolated global proximal gradient and applications for image processing

The proximal gradient method is a generic technique introduced to tackle the non-smoothness in optimization problems, wherein the objective function is expressed as the sum of a differentiable convex part and a non-differentiable…

Numerical Analysis · Mathematics 2024-01-19 Abdeslem Hafid Bentbib, Khalide Jbilou, Ridwane Tahiri

Error bounds, quadratic growth, and linear convergence of proximal methods

The proximal gradient algorithm for minimizing the sum of a smooth and a nonsmooth convex function often converges linearly even without strong convexity. One common reason is that a multiple of the step length at each iteration may…

Optimization and Control · Mathematics 2016-06-29 Dmitriy Drusvyatskiy, Adrian S. Lewis

An Improved Finite-time Analysis of Temporal Difference Learning with Deep Neural Networks

Temporal difference (TD) learning algorithms with neural network function parameterization have well-established empirical success in many practical large-scale reinforcement learning tasks. However, theoretical understanding of these…

Machine Learning · Computer Science 2024-05-08 Zhifa Ke, Zaiwen Wen, Junyu Zhang

A Proximal Operator for Multispectral Phase Retrieval Problems

Proximal algorithms have gained popularity in recent years in large-scale and distributed optimization problems. One such problem is the phase retrieval problem, for which proximal operators have been proposed recently. The phase retrieval…

Optimization and Control · Mathematics 2018-08-16 Biel Roig-Solvas, Lee Makowski, Dana H. Brooks

A Linearly Convergent Proximal Gradient Algorithm for Decentralized Optimization

Decentralized optimization is a powerful paradigm that finds applications in engineering and learning design. This work studies decentralized composite optimization problems with non-smooth regularization terms. Most existing gradient-based…

Optimization and Control · Mathematics 2019-10-29 Sulaiman A. Alghunaim, Kun Yuan, Ali H. Sayed

Almost Sure Convergence of Linear Temporal Difference Learning with Arbitrary Features

Temporal difference (TD) learning with linear function approximation (linear TD) is a classic and powerful prediction algorithm in reinforcement learning. While it is well-understood that linear TD converges almost surely to a unique point,…

Machine Learning · Computer Science 2026-03-25 Jiuqi Wang, Shangtong Zhang