Related papers: A Policy Optimization Method Towards Optimal-time …
We focus on a simulation-based optimization problem of choosing the best design from the feasible space. Although the simulation model can be queried with finite samples, its internal processing rule cannot be utilized in the optimization…
Reinforcement Learning (RL) and its integration with deep learning have achieved impressive performance in various robotic control tasks, ranging from motion planning and navigation to end-to-end visual manipulation. However, stability is…
Actor-critic (AC) algorithms are known for their efficacy and high performance in solving reinforcement learning problems, but they also suffer from low sampling efficiency. An AC-based policy optimization process is iterative and needs to…
We introduce a novel approach based on stochastic optimization to find the optimal sampling distribution for the data-driven stability analysis of switched linear systems. Our goal is to address limitations of existing approaches, in…
Safety is essential for reinforcement learning (RL) applied in real-world situations. Chance constraints are suitable to represent the safety requirements in stochastic systems. Previous chance-constrained RL methods usually have a low…
This paper presents the Relaxed Continuous-Time Actor-critic (RCTAC) algorithm, a method for finding the nearly optimal policy for nonlinear continuous-time (CT) systems with known dynamics and infinite horizon, such as the path-tracking…
Despite impressive results, reinforcement learning (RL) suffers from slow convergence and requires a large variety of tuning strategies. In this paper, we investigate the ability of RL algorithms on simple continuous control tasks. We show…
The lack of stability guarantees restricts the practical use of learning-based methods in core control problems in robotics. We develop new methods for learning neural control policies and neural Lyapunov critic functions in the model-free…
We consider an improper reinforcement learning setting where a learner is given $M$ base controllers for an unknown Markov decision process, and wishes to combine them optimally to produce a potentially new controller that can outperform…
Robust Reinforcement Learning aims to derive optimal behavior that accounts for model uncertainty in dynamical systems. However, previous studies have shown that by considering the worst case scenario, robust policies can be overly…
We propose an actor-critic framework to solve the time-continuous stochastic optimal control problem. A least-squares temporal difference method is applied to compute the value function for the critic. The policy gradient method is…
This article presents a constrained policy optimization approach for the optimal control of systems under nonstationary uncertainties. We introduce an assumption that we call Markov embeddability that allows us to cast the stochastic…
We consider the problem of reinforcement learning (RL) with an unbounded state space, motivated by the classical problem of scheduling in a queueing network. Traditional policies, as well as error metrics that are designed for finite, bounded or…
Optimization plays a central role in intelligent systems and cyber-physical technologies, where speed and reliability of convergence directly impact performance. In control theory, optimization-centric methods are standard: controllers are…
In this paper, an asymptotic stability proof for a class of methods for inexact nonlinear model predictive control is presented. General Q-linearly convergent online optimization methods are considered and an asymptotic stability result is…
This paper considers the problem of real-time control and learning in dynamic systems subjected to parametric uncertainties. We propose a combination of a Reinforcement Learning (RL) based policy in the outer loop suitably chosen to ensure…
We study continuous action reinforcement learning problems in which it is crucial that the agent interacts with the environment only through safe policies, i.e., policies that do not take the agent to undesirable situations. We formulate…
Policy optimization methods are one of the most widely used classes of Reinforcement Learning (RL) algorithms. However, theoretical understanding of these methods remains insufficient. Even in the episodic (time-inhomogeneous) tabular…
Optimization of parameterized policies for reinforcement learning (RL) is an important and challenging problem in artificial intelligence. Among the most common approaches are algorithms based on gradient ascent of a score function…
In this paper, a novel online, output-feedback, critic-only, model-based reinforcement learning framework is developed for safety-critical control systems operating in complex environments. The developed framework ensures system stability…