Related papers: Differentiable Dynamic Programming for Structured …
Dynamic programming (DP) is an algorithmic design paradigm for the efficient, exact solution of otherwise intractable, combinatorial problems. However, DP algorithm design is often presented in an ad-hoc manner. It is sometimes difficult to…
Differential Dynamic Programming (DDP) has become a well established method for unconstrained trajectory optimization. Despite its several applications in robotics and controls however, a widely successful constrained version of the…
Dynamic Programming (DP) provides standard algorithms to solve Markov Decision Processes. However, these algorithms generally do not optimize a scalar objective function. In this paper, we draw connections between DP and (constrained)…
Differential Dynamic Programming is an optimal control technique often used for trajectory generation. Many variations of this algorithm have been developed in the literature, including algorithms for stochastic dynamics or state and input…
Routing problems are a class of combinatorial problems with many practical applications. Recently, end-to-end deep learning methods have been proposed to learn approximate solution heuristics for such problems. In contrast, classical…
Differential dynamic programming (DDP) is a popular technique for solving nonlinear optimal control problems with locally quadratic approximations. However, existing DDP methods are not designed for stochastic systems with unknown…
Differentiable programming is the combination of classical neural networks modules with algorithmic ones in an end-to-end differentiable model. These new models, that use automatic differentiation to calculate gradients, have new learning…
We consider convex optimization problems formulated using dynamic programming equations. Such problems can be solved using the Dual Dynamic Programming algorithm combined with the Level 1 cut selection strategy or the Territory algorithm to…
Differential Dynamic Programming (DDP) is an efficient computational tool for solving nonlinear optimal control problems. It was originally designed as a single shooting method and thus is sensitive to the initial guess supplied. This work…
Trajectory optimization considers the problem of deciding how to control a dynamical system to move along a trajectory which minimizes some cost function. Differential Dynamic Programming (DDP) is an optimal control method which utilizes a…
In this article, we address a class of non convex, integer, non linear mathematical programs using dynamic programming. The mathematical program considered, whose properties are studied in this article, may be used to model the optimal…
Connections between Deep Neural Networks (DNNs) training and optimal control theory has attracted considerable attention as a principled tool of algorithmic design. Differential Dynamic Programming (DDP) neural optimizer is a recently…
Autonomous agents are limited in their ability to observe the world state. Partially observable Markov decision processes (POMDPs) formally model the problem of planning under world state uncertainty, but POMDPs with continuous actions and…
Trajectory optimization is an efficient approach for solving optimal control problems for complex robotic systems. It relies on two key components: first the transcription into a sparse nonlinear program, and second the corresponding solver…
Interpretation of Deep Neural Networks (DNNs) training as an optimal control problem with nonlinear dynamical systems has received considerable attention recently, yet the algorithmic development remains relatively limited. In this work, we…
Differential Dynamic Programming (DDP) is an efficient trajectory optimization algorithm relying on second-order approximations of a system's dynamics and cost function, and has recently been applied to optimize systems with time-invariant…
Safe operation of systems such as robots requires them to plan and execute trajectories subject to safety constraints. When those systems are subject to uncertainties in their dynamics, it is challenging to ensure that the constraints are…
The top-k operator returns a sparse vector, where the non-zero values correspond to the k largest values of the input. Unfortunately, because it is a discontinuous function, it is difficult to incorporate in neural networks trained…
In dynamic programming (DP) and reinforcement learning (RL), an agent learns to act optimally in terms of expected long-term return by sequentially interacting with its environment modeled by a Markov decision process (MDP). More generally…
Robot design optimization, imitation learning and system identification share a common problem which requires optimization over robot or task parameters at the same time as optimizing the robot motion. To solve these problems, we can use…