Related papers: Gaining efficiency in deep policy gradient method …
This paper studies an optimal control problem for continuous-time stochastic systems subject to reachability objectives specified in a subclass of metric interval temporal logic specifications, a temporal logic with real-time constraints.…
This paper focuses on optimal control problem for a class of discrete-time nonlinear systems. In practical applications, computation time is a crucial consideration when solving nonlinear optimal control problems, especially under real-time…
In this paper, we consider a class of continuous-time, continuous-space stochastic optimal control problems. Building upon recent advances in Markov chain approximation methods and sampling-based algorithms for deterministic path planning,…
This paper is concerned with the computing efficiency of model predictive control (MPC) problems for dynamical systems with both rate and amplitude constraints on the inputs. Instead of augmenting the decision variables of the underlying…
In this article, we discuss two algorithms tailored to discrete-time deterministic finite-horizon nonlinear optimal control problems or so-called deterministic trajectory optimization problems. Both algorithms can be derived from an…
This paper studies an infinite horizon optimal control problem for discrete-time linear system and quadratic criteria, both with random parameters which are independent and identically distributed with respect to time. In this general…
We consider policy gradient methods for stochastic optimal control problem in continuous time. In particular, we analyze the gradient flow for the control, viewed as a continuous time limit of the policy gradient method. We prove the global…
In recent years, deep learning has been connected with optimal control as a way to define a notion of a continuous underlying learning problem. In this view, neural networks can be interpreted as a discretization of a parametric Ordinary…
The article discusses the gradient discretisation method (GDM) for distributed optimal control problems governed by diffusion equation with pure Neumann boundary condition. Using the GDM framework enables to develop an analysis that…
We prove convergence of the proximal policy gradient method for a class of constrained stochastic control problems with control in both the drift and diffusion of the state process. The problem requires either the running or terminal cost…
We study a Q learning algorithm for continuous time stochastic control problems. The proposed algorithm uses the sampled state process by discretizing the state and control action spaces under piece-wise constant control processes. We show…
We study the estimation of policy gradients for continuous-time systems with known dynamics. By reframing policy learning in continuous-time, we show that it is possible construct a more efficient and accurate gradient estimator. The…
We study the global linear convergence of policy gradient (PG) methods for finite-horizon continuous-time exploratory linear-quadratic control (LQC) problems. The setting includes stochastic LQC problems with indefinite costs and allows…
We develop policy gradients methods for stochastic control with exit time in a model-free setting. We propose two types of algorithms for learning either directly the optimal policy or by learning alternately the value function (critic) and…
We propose a comprehensive framework for policy gradient methods tailored to continuous time reinforcement learning. This is based on the connection between stochastic control problems and randomised problems, enabling applications across…
A new method for stochastic control based on neural networks and using randomisation of discrete random variables is proposed and applied to optimal stopping time problems. The method models directly the policy and does not need the…
We present a tree structure algorithm for optimal control problems with state constraints. We prove a convergence result for a discrete time approximation of the value function based on a novel formulation of the constrained problem. Then…
The ability of Gaussian processes (GPs) to predict the behavior of dynamical systems as a more sample-efficient alternative to parametric models seems promising for real-world robotics research. However, the computational complexity of GPs…
In this paper, we propose a class of discrete-time approximation schemes for stochastic optimal control problems under the $G$-expectation framework. The proposed schemes are constructed recursively based on piecewise constant policy. We…
We establish a variety of results extending the well-known Pontryagin maximum principle of optimal control to discrete-time optimal control problems posed on smooth manifolds. These results are organized around a new theorem on critical and…