Related papers: Policy Gradient-based Algorithms for Continuous-ti…

LQR through the Lens of First Order Methods: Discrete-time Case

We consider the Linear-Quadratic-Regulator (LQR) problem in terms of optimizing a real-valued matrix function over the set of feedback gains. Such a setup facilitates examining the implications of a natural initial-state independent…

Systems and Control · Electrical Eng. & Systems 2019-07-31 Jingjing Bu, Afshin Mesbahi, Maryam Fazel, Mehran Mesbahi

Policy Gradient Methods for the Noisy Linear Quadratic Regulator over a Finite Horizon

We explore reinforcement learning methods for finding the optimal policy in the linear quadratic regulator (LQR) problem. In particular, we consider the convergence of policy gradient methods in the setting of known and unknown parameters.…

Machine Learning · Computer Science 2021-06-25 Ben Hambly, Renyuan Xu, Huining Yang

Sample-Efficient Model-Free Policy Gradient Methods for Stochastic LQR via Robust Linear Regression

Policy gradient algorithms are widely used in reinforcement learning and belong to the class of approximate dynamic programming methods. This paper studies two key policy gradient algorithms, the Natural Policy Gradient and the Gauss-Newton…

Systems and Control · Electrical Eng. & Systems 2026-05-11 Bowen Song, Sebastien Gros, Andrea Iannelli

Global Convergence of Policy Gradient Algorithms for Indefinite Least Squares Stationary Optimal Control

We consider policy gradient algorithms for the indefinite least squares stationary optimal control, e.g., linear-quadratic-regulator (LQR) with indefinite state and input penalization matrices. Such a setup has important applications in…

Optimization and Control · Mathematics 2020-02-13 Jingjing Bu, Mehran Mesbahi

LQR for Systems with Probabilistic Parametric Uncertainties: A Gradient Method

A gradient-based method is proposed for solving the linear quadratic regulator (LQR) problem for linear systems with nonlinear dependence on time-invariant probabilistic parametric uncertainties. The approach explicitly accounts for model…

Systems and Control · Electrical Eng. & Systems 2026-03-30 Leilei Cui, Richard D. Braatz

Second-Order Policy Gradient Methods for the Linear Quadratic Regulator

Policy gradient methods are a powerful family of reinforcement learning algorithms for continuous control that optimize a policy directly. However, standard first-order methods often converge slowly. Second-order methods can accelerate…

Systems and Control · Electrical Eng. & Systems 2025-11-05 Amirreza Valaei, Arash Bahari Kordabad, Sadegh Soudjani

Gradient Dominance in the Linear Quadratic Regulator: A Unified Analysis for Continuous-Time and Discrete-Time Systems

Despite its nonconvexity, policy optimization for the Linear Quadratic Regulator (LQR) admits a favorable structural property known as gradient dominance, which facilitates linear convergence of policy gradient methods to the globally…

Optimization and Control · Mathematics 2026-02-27 Yuto Watanabe, Yang Zheng

On the (almost) Global Exponential Convergence of the Overparameterized Policy Optimization for the LQR Problem

In this work we study the convergence of gradient methods for nonconvex optimization problems -- specifically the effect of the problem formulation to the convergence behavior of the solution of a gradient flow. We show through a simple…

Optimization and Control · Mathematics 2025-10-03 Moh Kamalul Wafi, Arthur Castello B. de Oliveira, Eduardo D. Sontag

Revisiting Strong Duality, Hidden Convexity, and Gradient Dominance in the Linear Quadratic Regulator

The Linear Quadratic Regulator (LQR) is a cornerstone of optimal control theory, widely studied in both model-based and model-free approaches. Despite its well-established nature, certain foundational aspects remain subtle. In this paper,…

Optimization and Control · Mathematics 2025-03-17 Yuto Watanabe, Yang Zheng

Power-Constrained Policy Gradient Methods for LQR

Consider a discrete-time Linear Quadratic Regulator (LQR) problem solved using policy gradient descent when the system matrices are unknown. The gradient is transmitted across a noisy channel over a finite time horizon using analog…

Optimization and Control · Mathematics 2025-07-22 Ashwin Verma, Aritra Mitra, Lintao Ye, Vijay Gupta

On Global and Local Convergence of Iterative Linear Quadratic Optimization Algorithms for Discrete Time Nonlinear Control

A classical approach for solving discrete time nonlinear control on a finite horizon consists in repeatedly minimizing linear quadratic approximations of the original problem around current candidate solutions. While widely popular in many…

Optimization and Control · Mathematics 2025-07-08 Vincent Roulet, Siddhartha Srinivasa, Maryam Fazel, Zaid Harchaoui

Policy Gradient Adaptive Control for the LQR: Indirect and Direct Approaches

Motivated by recent advances of reinforcement learning and direct data-driven control, we propose policy gradient adaptive control (PGAC) for the linear quadratic regulator (LQR), which uses online closed-loop data to improve the control…

Optimization and Control · Mathematics 2025-06-16 Feiran Zhao, Alessandro Chiuso, Florian Dörfler

Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator

Direct policy gradient methods for reinforcement learning and continuous control problems are a popular approach for a variety of reasons: 1) they are easy to implement without explicit knowledge of the underlying model 2) they are an…

Machine Learning · Computer Science 2019-03-26 Maryam Fazel, Rong Ge, Sham M. Kakade, Mehran Mesbahi

Convergence Analysis of Gradient Flow for Overparameterized LQR Formulations

Motivated by the growing use of artificial intelligence (AI) tools in control design, this paper analyses the intersection between results from gradient methods for the model-free linear quadratic regulator (LQR), and linear feedforward…

Systems and Control · Electrical Eng. & Systems 2025-05-27 Arthur Castello B. de Oliveira, Milad Siami, Eduardo D. Sontag

Convergence of Policy Gradient for Stochastic Linear-Quadratic Control Problem in Infinite Horizon

Given the outstanding performance of the policy gradient (PG) method in reinforcement learning, its convergence theory has attracted growing interest recently. Meanwhile, the significant importance and abundant theoretical…

Optimization and Control · Mathematics 2024-04-19 Xinpei Zhang, Guangyan Jia

On the Optimization Landscape of Dynamic Output Feedback Linear Quadratic Control

The convergence of policy gradient algorithms hinges on the optimization landscape of the underlying optimal control problem. Theoretical insights into these algorithms can often be acquired from analyzing those of linear quadratic control.…

Optimization and Control · Mathematics 2023-11-02 Jingliang Duan, Wenhan Cao, Yang Zheng, Lin Zhao

Learning Stabilizing Controllers of Linear Systems via Discount Policy Gradient

Stability is one of the most fundamental requirements for systems synthesis. In this paper, we address the stabilization problem for unknown linear systems via policy gradient (PG) methods. We leverage a key feature of PG for Linear…

Optimization and Control · Mathematics 2021-12-20 Feiran Zhao, Xingyun Fu, Keyou You

Policy Gradient Converges to the Globally Optimal Policy for Nearly Linear-Quadratic Regulators

Nonlinear control systems with partial information to the decision maker are prevalent in a variety of applications. As a step toward studying such nonlinear systems, this work explores reinforcement learning methods for finding the optimal…

Machine Learning · Computer Science 2025-04-11 Yinbin Han, Meisam Razaviyayn, Renyuan Xu

Convergence of Flow-Policy Gradient Learning for Linear Quadratic Regulator Problems

Flow $Q$-learning has recently been introduced to integrate learning from expert demonstrations into an actor-critic structure. Central to this innovation is the "one-step policy" network, which is optimized through a $Q$-function…

Systems and Control · Electrical Eng. & Systems 2025-11-17 Farnaz Adib Yaghmaie, Arunava Naha

Convergence Guarantees of Model-free Policy Gradient Methods for LQR with Stochastic Data

Policy gradient (PG) methods are the backbone of many reinforcement learning algorithms due to their good performance in policy optimization problems. As a gradient-based approach, PG methods typically rely on knowledge of the system…

Systems and Control · Electrical Eng. & Systems 2026-04-02 Bowen Song, Andrea Iannelli