
Related papers: Policy Gradient-based Algorithms for Continuous-ti…

200 papers

We consider the Linear-Quadratic-Regulator (LQR) problem in terms of optimizing a real-valued matrix function over the set of feedback gains. Such a setup facilitates examining the implications of a natural initial-state independent…

Systems and Control · Electrical Eng. & Systems 2019-07-31 Jingjing Bu, Afshin Mesbahi, Maryam Fazel, Mehran Mesbahi

We explore reinforcement learning methods for finding the optimal policy in the linear quadratic regulator (LQR) problem. In particular, we consider the convergence of policy gradient methods in the setting of known and unknown parameters.…

Machine Learning · Computer Science 2021-06-25 Ben Hambly, Renyuan Xu, Huining Yang

Policy gradient algorithms are widely used in reinforcement learning and belong to the class of approximate dynamic programming methods. This paper studies two key policy gradient algorithms, the Natural Policy Gradient and the Gauss-Newton…

Systems and Control · Electrical Eng. & Systems 2026-05-11 Bowen Song, Sebastien Gros, Andrea Iannelli

We consider policy gradient algorithms for the indefinite least squares stationary optimal control, e.g., linear-quadratic-regulator (LQR) with indefinite state and input penalization matrices. Such a setup has important applications in…

Optimization and Control · Mathematics 2020-02-13 Jingjing Bu, Mehran Mesbahi

A gradient-based method is proposed for solving the linear quadratic regulator (LQR) problem for linear systems with nonlinear dependence on time-invariant probabilistic parametric uncertainties. The approach explicitly accounts for model…

Systems and Control · Electrical Eng. & Systems 2026-03-30 Leilei Cui, Richard D. Braatz

Policy gradient methods are a powerful family of reinforcement learning algorithms for continuous control that optimize a policy directly. However, standard first-order methods often converge slowly. Second-order methods can accelerate…

Systems and Control · Electrical Eng. & Systems 2025-11-05 Amirreza Valaei, Arash Bahari Kordabad, Sadegh Soudjani

Despite its nonconvexity, policy optimization for the Linear Quadratic Regulator (LQR) admits a favorable structural property known as gradient dominance, which facilitates linear convergence of policy gradient methods to the globally…

Optimization and Control · Mathematics 2026-02-27 Yuto Watanabe, Yang Zheng

In this work we study the convergence of gradient methods for nonconvex optimization problems, specifically the effect of the problem formulation on the convergence behavior of the solution of a gradient flow. We show through a simple…

Optimization and Control · Mathematics 2025-10-03 Moh Kamalul Wafi, Arthur Castello B. de Oliveira, Eduardo D. Sontag

The Linear Quadratic Regulator (LQR) is a cornerstone of optimal control theory, widely studied in both model-based and model-free approaches. Despite its well-established nature, certain foundational aspects remain subtle. In this paper,…

Optimization and Control · Mathematics 2025-03-17 Yuto Watanabe, Yang Zheng

Consider a discrete-time Linear Quadratic Regulator (LQR) problem solved using policy gradient descent when the system matrices are unknown. The gradient is transmitted across a noisy channel over a finite time horizon using analog…

Optimization and Control · Mathematics 2025-07-22 Ashwin Verma, Aritra Mitra, Lintao Ye, Vijay Gupta

A classical approach for solving discrete-time nonlinear control on a finite horizon consists of repeatedly minimizing linear quadratic approximations of the original problem around current candidate solutions. While widely popular in many…

Optimization and Control · Mathematics 2025-07-08 Vincent Roulet, Siddhartha Srinivasa, Maryam Fazel, Zaid Harchaoui

Motivated by recent advances of reinforcement learning and direct data-driven control, we propose policy gradient adaptive control (PGAC) for the linear quadratic regulator (LQR), which uses online closed-loop data to improve the control…

Optimization and Control · Mathematics 2025-06-16 Feiran Zhao, Alessandro Chiuso, Florian Dörfler

Direct policy gradient methods for reinforcement learning and continuous control problems are a popular approach for a variety of reasons: 1) they are easy to implement without explicit knowledge of the underlying model; 2) they are an…

Machine Learning · Computer Science 2019-03-26 Maryam Fazel, Rong Ge, Sham M. Kakade, Mehran Mesbahi

Motivated by the growing use of artificial intelligence (AI) tools in control design, this paper analyses the intersection between results from gradient methods for the model-free linear quadratic regulator (LQR), and linear feedforward…

Systems and Control · Electrical Eng. & Systems 2025-05-27 Arthur Castello B. de Oliveira, Milad Siami, Eduardo D. Sontag

Given the outstanding performance of the policy gradient (PG) method in reinforcement learning, its convergence theory has attracted growing interest recently. Meanwhile, the significant importance and abundant theoretical…

Optimization and Control · Mathematics 2024-04-19 Xinpei Zhang, Guangyan Jia

The convergence of policy gradient algorithms hinges on the optimization landscape of the underlying optimal control problem. Theoretical insights into these algorithms can often be acquired from analyzing those of linear quadratic control.…

Optimization and Control · Mathematics 2023-11-02 Jingliang Duan, Wenhan Cao, Yang Zheng, Lin Zhao

Stability is one of the most fundamental requirements for systems synthesis. In this paper, we address the stabilization problem for unknown linear systems via policy gradient (PG) methods. We leverage a key feature of PG for Linear…

Optimization and Control · Mathematics 2021-12-20 Feiran Zhao, Xingyun Fu, Keyou You

Nonlinear control systems with partial information to the decision maker are prevalent in a variety of applications. As a step toward studying such nonlinear systems, this work explores reinforcement learning methods for finding the optimal…

Machine Learning · Computer Science 2025-04-11 Yinbin Han, Meisam Razaviyayn, Renyuan Xu

Flow $Q$-learning has recently been introduced to integrate learning from expert demonstrations into an actor-critic structure. Central to this innovation is the "one-step policy" network, which is optimized through a $Q$-function…

Systems and Control · Electrical Eng. & Systems 2025-11-17 Farnaz Adib Yaghmaie, Arunava Naha

Policy gradient (PG) methods are the backbone of many reinforcement learning algorithms due to their good performance in policy optimization problems. As a gradient-based approach, PG methods typically rely on knowledge of the system…

Systems and Control · Electrical Eng. & Systems 2026-04-02 Bowen Song, Andrea Iannelli