Related papers: A Policy Optimization Method Towards Optimal-time …

An Actor-Critic Method for Simulation-Based Optimization

We focus on a simulation-based optimization problem of choosing the best design from the feasible space. Although the simulation model can be queried with finite samples, its internal processing rule cannot be utilized in the optimization…

Machine Learning · Computer Science 2021-11-02 Kuo Li, Qing-Shan Jia, Jiaqi Yan

Actor-Critic Reinforcement Learning for Control with Stability Guarantee

Reinforcement Learning (RL) and its integration with deep learning have achieved impressive performance in various robotic control tasks, ranging from motion planning and navigation to end-to-end visual manipulation. However, stability is…

Robotics · Computer Science 2020-07-16 Minghao Han, Lixian Zhang, Jun Wang, Wei Pan

Optimal Actor-Critic Policy with Optimized Training Datasets

Actor-critic (AC) algorithms are known for their efficacy and high performance in solving reinforcement learning problems, but they also suffer from low sampling efficiency. An AC based policy optimization process is iterative and needs to…

Machine Learning · Computer Science 2021-12-02 Chayan Banerjee, Zhiyong Chen, Nasimul Noman, Mohsen Zamani

A Stochastic-Optimization-Based Adaptive-Sampling Scheme for Data-Driven Stability Analysis of Switched Linear Systems

We introduce a novel approach based on stochastic optimization to find the optimal sampling distribution for the data-driven stability analysis of switched linear systems. Our goal is to address limitations of existing approaches, in…

Optimization and Control · Mathematics 2025-09-01 Alexis Vuille, Guillaume O. Berger, Raphaël M. Jungers

Model-Based Actor-Critic with Chance Constraint for Stochastic System

Safety is essential for reinforcement learning (RL) applied in real-world situations. Chance constraints are suitable to represent the safety requirements in stochastic systems. Previous chance-constrained RL methods usually have a low…

Machine Learning · Computer Science 2021-03-17 Baiyu Peng, Yao Mu, Yang Guan, Shengbo Eben Li, Yuming Yin, Jianyu Chen

Relaxed Actor-Critic with Convergence Guarantees for Continuous-Time Optimal Control of Nonlinear Systems

This paper presents the Relaxed Continuous-Time Actor-critic (RCTAC) algorithm, a method for finding the nearly optimal policy for nonlinear continuous-time (CT) systems with known dynamics and infinite horizon, such as the path-tracking…

Systems and Control · Electrical Eng. & Systems 2023-03-31 Jingliang Duan, Jie Li, Qiang Ge, Shengbo Eben Li, Monimoy Bujarbaruah, Fei Ma, Dezhao Zhang

Neural Lyapunov and Optimal Control

Despite impressive results, reinforcement learning (RL) suffers from slow convergence and requires a large variety of tuning strategies. In this paper, we investigate the ability of RL algorithms on simple continuous control tasks. We show…

Robotics · Computer Science 2024-02-16 Daniel Layeghi, Steve Tonneau, Michael Mistry

Stabilizing Neural Control Using Self-Learned Almost Lyapunov Critics

The lack of stability guarantee restricts the practical use of learning-based methods in core control problems in robotics. We develop new methods for learning neural control policies and neural Lyapunov critic functions in the model-free…

Robotics · Computer Science 2021-07-13 Ya-Chien Chang, Sicun Gao

Actor-Critic based Improper Reinforcement Learning

We consider an improper reinforcement learning setting where a learner is given $M$ base controllers for an unknown Markov decision process, and wishes to combine them optimally to produce a potentially new controller that can outperform…

Machine Learning · Computer Science 2022-07-20 Mohammadi Zaki, Avinash Mohan, Aditya Gopalan, Shie Mannor

Soft-Robust Actor-Critic Policy-Gradient

Robust Reinforcement Learning aims to derive optimal behavior that accounts for model uncertainty in dynamical systems. However, previous studies have shown that by considering the worst case scenario, robust policies can be overly…

Machine Learning · Computer Science 2018-10-25 Esther Derman, Daniel J. Mankowitz, Timothy A. Mann, Shie Mannor

Solving Time-Continuous Stochastic Optimal Control Problems: Algorithm Design and Convergence Analysis of Actor-Critic Flow

We propose an actor-critic framework to solve the time-continuous stochastic optimal control problem. A least square temporal difference method is applied to compute the value function for the critic. The policy gradient method is…

Optimization and Control · Mathematics 2025-01-27 Mo Zhou, Jianfeng Lu

Constrained Policy Optimization for Stochastic Optimal Control under Nonstationary Uncertainties

This article presents a constrained policy optimization approach for the optimal control of systems under nonstationary uncertainties. We introduce an assumption that we call Markov embeddability that allows us to cast the stochastic…

Optimization and Control · Mathematics 2026-05-11 Sungho Shin, François Pacaud, Emil Contantinescu, Mihai Anitescu

Stable Reinforcement Learning with Unbounded State Space

We consider the problem of reinforcement learning (RL) with unbounded state space motivated by the classical problem of scheduling in a queueing network. Traditional policies as well as error metrics that are designed for finite, bounded or…

Machine Learning · Computer Science 2020-06-09 Devavrat Shah, Qiaomin Xie, Zhi Xu

Optimization via a Control-Centric Framework

Optimization plays a central role in intelligent systems and cyber-physical technologies, where speed and reliability of convergence directly impact performance. In control theory, optimization-centric methods are standard: controllers are…

Optimization and Control · Mathematics 2026-03-23 Liraz Mudrik, Isaac Kaminer, Sean Kragelund, Abram H. Clark

A Lyapunov Function for the Combined System-Optimizer Dynamics in Inexact Model Predictive Control

In this paper, an asymptotic stability proof for a class of methods for inexact nonlinear model predictive control is presented. General Q-linearly convergent online optimization methods are considered and an asymptotic stability result is…

Optimization and Control · Mathematics 2021-12-01 Andrea Zanelli, Quoc Tran Dinh, Moritz Diehl

Online Algorithms and Policies Using Adaptive and Machine Learning Approaches

This paper considers the problem of real-time control and learning in dynamic systems subjected to parametric uncertainties. We propose a combination of a Reinforcement Learning (RL) based policy in the outer loop suitably chosen to ensure…

Machine Learning · Computer Science 2023-06-13 Anuradha M. Annaswamy, Anubhav Guha, Yingnan Cui, Sunbochen Tang, Peter A. Fisher, Joseph E. Gaudio

Lyapunov-based Safe Policy Optimization for Continuous Control

We study continuous action reinforcement learning problems in which it is crucial that the agent interacts with the environment only through safe policies, i.e., policies that do not take the agent to undesirable situations. We formulate…

Machine Learning · Computer Science 2019-02-13 Yinlam Chow, Ofir Nachum, Aleksandra Faust, Edgar Duenez-Guzman, Mohammad Ghavamzadeh

Nearly Optimal Policy Optimization with Stable at Any Time Guarantee

Policy optimization methods are one of the most widely used classes of Reinforcement Learning (RL) algorithms. However, theoretical understanding of these methods remains insufficient. Even in the episodic (time-inhomogeneous) tabular…

Machine Learning · Computer Science 2022-12-06 Tianhao Wu, Yunchang Yang, Han Zhong, Liwei Wang, Simon S. Du, Jiantao Jiao

Actor-Critic Policy Optimization in Partially Observable Multiagent Environments

Optimization of parameterized policies for reinforcement learning (RL) is an important and challenging problem in artificial intelligence. Among the most common approaches are algorithms based on gradient ascent of a score function…

Machine Learning · Computer Science 2020-06-15 Sriram Srinivasan, Marc Lanctot, Vinicius Zambaldi, Julien Perolat, Karl Tuyls, Remi Munos, Michael Bowling

State and Input Constrained Output-Feedback Adaptive Optimal Control of Affine Nonlinear Systems

In this paper, a novel online, output-feedback, critic-only, model-based reinforcement learning framework is developed for safety-critical control systems operating in complex environments. The developed framework ensures system stability…

Systems and Control · Electrical Eng. & Systems 2024-06-28 Tochukwu Elijah Ogri, Muzaffar Qureshi, Zachary I. Bell, Rushikesh Kamalapurkar