Related papers: Dynamical System Optimization
Direct optimization is an appealing framework that replaces integration with optimization of a random objective for approximating gradients in models with discrete random variables. A$^\star$ sampling is a framework for optimizing such…
Policy iteration is one of the classical frameworks of reinforcement learning, which requires a known initial stabilizing control. However, finding the initial stabilizing control depends on the known system model. To relax this requirement…
Direct policy optimization in reinforcement learning is usually solved with policy-gradient algorithms, which optimize policy parameters via stochastic gradient ascent. This paper provides a new theoretical interpretation and justification…
In the theory of dynamic programming, an optimal policy is a policy whose lifetime value dominates that of all other policies from every possible initial condition in the state space. This raises a natural question: when does optimality…
We address the design and synthesis of optimal control strategies for high-dimensional stochastic dynamical systems. Such systems may be deterministic nonlinear systems evolving from random initial states, or systems driven by random…
We focus on developing efficient and reliable policy optimization strategies for robot learning with real-world data. In recent years, policy gradient methods have emerged as a promising paradigm for training control policies in simulation.…
Reinforcement learning is a promising approach to learning robotics controllers. It has recently been shown that algorithms based on finite-difference estimates of the policy gradient are competitive with algorithms based on the policy…
This paper presents a model-free approximation for the Hessian of the performance of deterministic policies to use in the context of Reinforcement Learning based on Quasi-Newton steps in the policy parameters. We show that the approximate…
Evolutionary algorithms, such as Differential Evolution, excel in solving real-parameter optimization challenges. However, the effectiveness of a single algorithm varies across different problem instances, necessitating considerable efforts…
The performance of an algorithm often critically depends on its parameter configuration. While a variety of automated algorithm configuration methods have been proposed to relieve users from the tedious and error-prone task of manually…
Optimization algorithms can be interpreted through the lens of dynamical systems as the interconnection of linear systems and a set of subgradient nonlinearities. This dynamical systems formulation allows for the analysis and synthesis of…
A model among many may only be best under certain states of the world. Switching from a model to another can also be costly. Finding a procedure to dynamically choose a model in these circumstances requires to solve a complex estimation…
Policy gradient methods are powerful reinforcement learning algorithms and have been demonstrated to solve many complex tasks. However, these methods are also data-inefficient, afflicted with high variance gradient estimates, and frequently…
Attempts from different disciplines to provide a fundamental understanding of deep learning have advanced rapidly in recent years, yet a unified framework remains relatively limited. In this article, we provide one possible way to align…
Guided policy search algorithms have been proven to work with incredible accuracy for not only controlling a complicated dynamical system, but also learning optimal policies from various unseen instances. One assumes true nature of the…
Reinforcement learning algorithms require a large amount of samples; this often limits their real-world applications on even simple tasks. Such a challenge is more outstanding in multi-agent tasks, as each step of operation is more costly…
We consider the problem of optimizing time averages in systems with independent and identically distributed behavior over renewal frames. This includes scheduling and task processing to maximize utility in stochastic networks with variable…
In this work the problem of path planning for an autonomous vehicle that moves on a freeway is considered. The most common approaches that are used to address this problem are based on optimal control methods, which make assumptions about…
This paper proposes the first-ever algorithmic framework for tuning hyper-parameters of stochastic optimization algorithm based on reinforcement learning. Hyper-parameters impose significant influences on the performance of stochastic…
Despite their apparent diversity, modern machine learning methods can be reduced to a remarkably simple core principle: learning is achieved by continuously optimizing parameters to minimize or maximize a scalar objective function. This…