Related papers: Accelerated Point-wise Maximum Approach to Approxi…
In this paper, a convex optimization-based method is proposed for numerically solving dynamic programs in continuous state and action spaces. The key idea is to approximate the output of the Bellman operator at a particular state by the…
We describe an approximate dynamic programming method for stochastic control problems on infinite state and input spaces. The optimal value function is approximated by a linear combination of basis functions with coefficients as decision…
We consider dynamic programming problems with finite, discrete-time horizons and prohibitively high-dimensional, discrete state-spaces for direct computation of the value function from the Bellman equation. For the case that the value…
We consider stochastic dynamic programming problems with high-dimensional, discrete state-spaces and finite, discrete-time horizons that prohibit direct computation of the value function from a given Bellman equation for all states and time…
Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose a new approximate bilinear programming formulation of value function approximation,…
We consider least squares approximation of a function of one variable by a continuous, piecewise-linear approximand that has a small number of breakpoints. This problem was notably considered by Bellman who proposed an approximate algorithm…
In this paper, we present a discretization algorithm for finite horizon risk constrained dynamic programming algorithm in [Chow_Pavone_13]. Although in a theoretical standpoint, Bellman's recursion provides a systematic way to find optimal…
This paper considers an optimization problem for a dynamical system whose evolution depends on a collection of binary decision variables. We develop scalable approximation algorithms with provable suboptimality bounds to provide…
The solutions to many sequential decision-making problems are characterized by dynamic programming and Bellman's principle of optimality. However, due to the inherent complexity of solving Bellman's equation exactly, there has been…
This paper considers non-smooth optimization problems where we seek to minimize the pointwise maximum of a continuously parameterized family of functions. Since the objective function is given as the solution to a maximization problem,…
One often encounters the curse of dimensionality in the application of dynamic programming to determine optimal policies for controlled Markov chains. In this paper, we provide a method to construct sub-optimal policies along with a bound…
We describe a nonlinear generalization of dual dynamic programming theory and its application to value function estimation for deterministic control problems over continuous state and action spaces, in a discrete-time infinite horizon…
We study the dynamic programming approach to revenue management in the context of attended home delivery. We draw on results from dynamic programming theory for Markov decision problems to show that the underlying Bellman operator has a…
A very simple example of an algorithmic problem solvable by dynamic programming is to maximize, over sets A in {1,2,...,n}, the objective function |A| - \sum_i \xi_i 1(i \in A,i+1 \in A) for given \xi_i > 0. This problem, with random…
Approximate dynamic programming is a popular method for solving large Markov decision processes. This paper describes a new class of approximate dynamic programming (ADP) methods- distributionally robust ADP-that address the curse of…
In this paper, we propose a proximal gradient method and an accelerated proximal gradient method for solving composite optimization problems, where the objective function is the sum of a smooth and a convex, possibly nonsmooth, function. We…
Many sequential decision problems can be formulated as Markov Decision Processes (MDPs) where the optimal value function (or cost-to-go function) can be shown to satisfy a monotone structure in some or all of its dimensions. When the state…
Bilevel optimization has been developed for many machine learning tasks with large-scale and high-dimensional data. This paper considers a constrained bilevel optimization problem, where the lower-level optimization problem is convex with…
We develop a new Approximate Dynamic Programming (ADP) method for infinite horizon discounted reward Markov Decision Processes (MDP) based on projection onto a subsemimodule. We approximate the value function in terms of a $(\min,+)$ linear…
We propose a method of approximating multivariate Gaussian probabilities using dynamic programming. We show that solving the optimization problem associated with a class of discrete-time finite horizon Markov decision processes with…