Related papers: A Low-Complexity Multi-Survivor Dynamic Programmin…

Improved Memory-Bounded Dynamic Programming for Decentralized POMDPs

Memory-Bounded Dynamic Programming (MBDP) has proved extremely effective in solving decentralized POMDPs with large horizons. We generalize the algorithm and improve its scalability by reducing the complexity with respect to the number of…

Artificial Intelligence · Computer Science 2012-06-26 Sven Seuken, Shlomo Zilberstein

A Generalization of Bellman's Equation with Application to Path Planning, Obstacle Avoidance and Invariant Set Estimation

The standard Dynamic Programming (DP) formulation can be used to solve Multi-Stage Optimization Problems (MSOP's) with additively separable objective functions. In this paper we consider a larger class of MSOP's with monotonically backward…

Optimization and Control · Mathematics 2020-10-15 Morgan Jones, Matthew Peet

Constrained Resource Allocation Problems in Communications: An Information-assisted Approach

We consider a class of resource allocation problems given a set of unconditional constraints whose objective function satisfies Bellman's optimality principle. Such problems are ubiquitous in wireless communication, signal processing, and…

Signal Processing · Electrical Eng. & Systems 2021-12-08 I. Zakir Ahmed, Hamid Sadjadpour, Shahram Yousefi

Constrained Differential Dynamic Programming Revisited

Differential Dynamic Programming (DDP) has become a well-established method for unconstrained trajectory optimization. Despite its several applications in robotics and controls, however, a widely successful constrained version of the…

Optimization and Control · Mathematics 2020-05-05 Yuichiro Aoyama, George Boutselis, Akash Patel, Evangelos A. Theodorou

Approximate Constrained Discounted Dynamic Programming with Uniform Feasibility and Optimality

We consider a dynamic programming (DP) approach to approximately solving an infinite-horizon constrained Markov decision process (CMDP) problem with a fixed initial-state for the expected total discounted-reward criterion with a…

Optimization and Control · Mathematics 2023-08-08 Hyeong Soo Chang

Multi-Objective Policy Gradients with Topological Constraints

Multi-objective optimization models that encode ordered sequential constraints provide a solution to model various challenging problems including encoding preferences, modeling a curriculum, and enforcing measures of safety. A recently…

Artificial Intelligence · Computer Science 2022-09-16 Kyle Hollins Wray, Stas Tiomkin, Mykel J. Kochenderfer, Pieter Abbeel

On Integrality in Semidefinite Programming for Discrete Optimization

It is well-known that by adding integrality constraints to the semidefinite programming (SDP) relaxation of the max-cut problem, the resulting integer semidefinite program is an exact formulation of the problem. In this paper we show…

Optimization and Control · Mathematics 2023-11-09 Frank de Meijer, Renata Sotirov

Unrolling Dynamic Programming via Graph Filters

Dynamic programming (DP) is a fundamental tool used across many engineering fields. The main goal of DP is to solve Bellman's optimality equations for a given Markov decision process (MDP). Standard methods like policy iteration exploit the…

Artificial Intelligence · Computer Science 2025-07-30 Sergio Rozada, Samuel Rey, Gonzalo Mateos, Antonio G. Marques

Analysis of approximate linear programming solution to Markov decision problem with log barrier function

There are two primary approaches to solving Markov decision problems (MDPs): dynamic programming based on the Bellman equation and linear programming (LP). Dynamic programming methods are the most widely used and form the foundation of both…

Artificial Intelligence · Computer Science 2026-02-24 Donghwan Lee, Hyukjun Yang, Bum Geun Park

Dual Dynamic Programming with cut selection: convergence proof and numerical experiments

We consider convex optimization problems formulated using dynamic programming equations. Such problems can be solved using the Dual Dynamic Programming algorithm combined with the Level 1 cut selection strategy or the Territory algorithm to…

Optimization and Control · Mathematics 2017-05-26 Vincent Guigues

Semilinear Dynamic Programming: Analysis, Algorithms, and Certainty Equivalence Properties

We consider a broad class of dynamic programming (DP) problems that involve a partially linear structure and some positivity properties in their system equation and cost function. We address deterministic and stochastic problems, possibly…

Optimization and Control · Mathematics 2026-04-21 Yuchao Li, Dimitri Bertsekas

Modeling and Optimizing Resource Allocation Decisions through Multi-model Markov Decision Processes with Capacity Constraints

This paper proposes a new formulation for the dynamic resource allocation problem, which converts the traditional MDP model with known parameters and no capacity constraints to a new model with uncertain parameters and a resource capacity…

Optimization and Control · Mathematics 2020-11-10 Onur Demiray, Evrim Didem Güneş, Lerzan Örmeci

Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process

The problem of constrained Markov decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints on its utilities/costs. A new primal-dual approach is…

Optimization and Control · Mathematics 2021-10-22 Tianjiao Li, Ziwei Guan, Shaofeng Zou, Tengyu Xu, Yingbin Liang, Guanghui Lan

Efficient Low-Rank Semidefinite Programming with Robust Loss Functions

In real-world applications, it is important for machine learning algorithms to be robust against data outliers or corruptions. In this paper, we focus on improving the robustness of a large class of learning algorithms that are formulated…

Machine Learning · Computer Science 2021-06-04 Quanming Yao, Hangsi Yang, En-Liang Hu, James Kwok

Minimax Regret Optimisation for Robust Planning in Uncertain Markov Decision Processes

The parameters for a Markov Decision Process (MDP) often cannot be specified exactly. Uncertain MDPs (UMDPs) capture this model ambiguity by defining sets which the parameters belong to. Minimax regret has been proposed as an objective for…

Artificial Intelligence · Computer Science 2023-02-14 Marc Rigter, Bruno Lacerda, Nick Hawes

Large-Scale Markov Decision Problems via the Linear Programming Dual

We consider the problem of controlling a fully specified Markov decision process (MDP), also known as the planning problem, when the state space is very large and calculating the optimal policy is intractable. Instead, we pursue the more…

Optimization and Control · Mathematics 2019-01-09 Yasin Abbasi-Yadkori, Peter L. Bartlett, Xi Chen, Alan Malek

A safe exploration approach to constrained Markov decision processes

We consider discounted infinite-horizon constrained Markov decision processes (CMDPs), where the goal is to find an optimal policy that maximizes the expected cumulative reward while satisfying expected cumulative constraints. Motivated by…

Machine Learning · Computer Science 2025-03-04 Tingting Ni, Maryam Kamgarpour

A unified algorithm framework for mean-variance optimization in discounted Markov decision processes

This paper studies the risk-averse mean-variance optimization in infinite-horizon discounted Markov decision processes (MDPs). The involved variance metric concerns reward variability during the whole process, and future deviations are…

Optimization and Control · Mathematics 2022-01-19 Shuai Ma, Xiaoteng Ma, Li Xia

Scalable Semidefinite Programming

Semidefinite programming (SDP) is a powerful framework from convex optimization that has striking potential for data science applications. This paper develops a provably correct randomized algorithm for solving large, weakly constrained SDP…

Optimization and Control · Mathematics 2021-03-26 Alp Yurtsever, Joel A. Tropp, Olivier Fercoq, Madeleine Udell, Volkan Cevher

Anytime-Constrained Reinforcement Learning

We introduce and study constrained Markov Decision Processes (cMDPs) with anytime constraints. An anytime constraint requires the agent to never violate its budget at any point in time, almost surely. Although Markovian policies are no…

Machine Learning · Computer Science 2024-06-14 Jeremy McMahan, Xiaojin Zhu