Related papers: Global Optimization for Value Function Approximati…
We describe an approximate dynamic programming approach to compute lower bounds on the optimal value function for a discrete time, continuous space, infinite horizon setting. The approach iteratively constructs a family of lower bounding…
This paper describes an approximate method for global optimization of polynomial programming problems with bounded variables. The method uses a reformulation and linearization technique to transform the original polynomial optimization…
One of the most natural approaches to reinforcement learning (RL) with function approximation is value iteration, which inductively generates approximations to the optimal value function by solving a sequence of regression problems. To…
We study both the value function and Q-function formulation of the Linear Programming approach to Approximate Dynamic Programming. The approach is model-based and optimizes over a restricted function space to approximate the value function…
We consider the global minimization of smooth functions based solely on function evaluations. Algorithms that achieve the optimal number of function evaluations for a given precision level typically rely on explicitly constructing an…
In this paper, a convex optimization-based method is proposed for numerically solving dynamic programs in continuous state and action spaces. The key idea is to approximate the output of the Bellman operator at a particular state by the…
The paper is about developing a solver for maximizing a real-valued function of binary variables. The solver relies on an algorithm that estimates the optimal objective-function value of instances from the underlying distribution of…
Markov decision problems are most commonly solved via dynamic programming. Another approach is Bellman residual minimization, which directly minimizes the squared Bellman residual objective function. However, compared to dynamic…
This paper considers an optimization problem for a dynamical system whose evolution depends on a collection of binary decision variables. We develop scalable approximation algorithms with provable suboptimality bounds to provide…
We consider challenging dynamic programming models where the associated Bellman equation, and the value and policy iteration algorithms commonly exhibit complex and even pathological behavior. Our analysis is based on the new notion of…
Value function learning plays a central role in many state-of-the-art reinforcement-learning algorithms. Many popular algorithms like Q-learning do not optimize any objective function, but are fixed-point iterations of some variant of…
We describe an approximate dynamic programming method for stochastic control problems on infinite state and input spaces. The optimal value function is approximated by a linear combination of basis functions with coefficients as decision…
Many computer vision and human-computer interaction applications developed in recent years need evaluating complex and continuous mathematical functions as an essential step toward proper operation. However, rigorous evaluation of this kind…
Maximizing a single submodular set function subject to a cardinality constraint is a well-studied and central topic in combinatorial optimization. However, finding a set that maximizes multiple functions at the same time is much less…
We describe a nonlinear generalization of dual dynamic programming theory and its application to value function estimation for deterministic control problems over continuous state and action spaces, in a discrete-time infinite horizon…
Many reinforcement learning algorithms rely on value estimation, however, the most widely used algorithms -- namely temporal difference algorithms -- can diverge under both off-policy sampling and nonlinear function approximation. Many…
We consider least squares approximation of a function of one variable by a continuous, piecewise-linear approximand that has a small number of breakpoints. This problem was notably considered by Bellman who proposed an approximate algorithm…
This works handles the inverse reinforcement learning problem in high-dimensional state spaces, which relies on an efficient solution of model-based high-dimensional reinforcement learning problems. To solve the computationally expensive…
In this paper, we propose a method for the approximation of the solution of high-dimensional weakly coercive problems formulated in tensor spaces using low-rank approximation formats. The method can be seen as a perturbation of a minimal…
Global optimization of black-box functions from noisy samples is a fundamental challenge in machine learning and scientific computing. Traditional methods such as Bayesian Optimization often converge to local minima on multi-modal…