Related papers: Time-Varying Parameters in Sequential Decision Mak…

Parameterized MDPs and Reinforcement Learning Problems -- A Maximum Entropy Principle Based Framework

We present a framework to address a class of sequential decision making problems. Our framework features learning the optimal control policy with robustness to noisy data, determining the unknown state and action parameters, and performing…

Machine Learning · Computer Science 2022-01-20 Amber Srivastava, Srinivasa M Salapaka

Towards Enabling Learning for Time-Varying finite horizon Sequential Decision-Making Problems

The Parameterized Sequential Decision Making (Para-SDM) framework models a wide array of network design applications spanning supply-chain, transportation, and sensor networks. These problems entail sequential multi-stage optimization…

Systems and Control · Electrical Eng. & Systems 2025-04-04 Dhananjay Tiwari, Salar Basiri, Srinivasa Salapaka

Risk-Sensitive Markov Decision Processes with Combined Metrics of Mean and Variance

This paper investigates the optimization problem of an infinite stage discrete time Markov decision process (MDP) with a long-run average metric considering both mean and variance of rewards together. Such performance metric is important…

Optimization and Control · Mathematics 2020-08-11 Li Xia

Parameter-Independent Strategies for pMDPs via POMDPs

Markov Decision Processes (MDPs) are a popular class of models suitable for solving control decision problems in probabilistic reactive systems. We consider parametric MDPs (pMDPs) that include parameters in some of the transition…

Logic in Computer Science · Computer Science 2018-06-14 Sebastian Arming, Ezio Bartocci, Krishnendu Chatterjee, Joost-Pieter Katoen, Ana Sokolova

Least Squares Temporal Difference Actor-Critic Methods with Applications to Robot Motion Control

We consider the problem of finding a control policy for a Markov Decision Process (MDP) to maximize the probability of reaching some states while avoiding some other states. This problem is motivated by applications in robotics, where such…

Robotics · Computer Science 2011-08-31 Reza Moazzez Estanjini, Xu Chu Ding, Morteza Lahijanian, Jing Wang, Calin A. Belta, Ioannis Ch. Paschalidis

Controlling the low-temperature Ising model using spatiotemporal Markov decision theory

We introduce the spatiotemporal Markov decision process (STMDP), a special type of Markov decision process that models sequential decision-making problems which are not only characterized by temporal, but also by spatial interaction…

Optimization and Control · Mathematics 2025-01-08 M. C. de Jongh, Richard J. Boucherie, M. N. M. van Lieshout

Dynamic consistency for Stochastic Optimal Control problems

For a sequence of dynamic optimization problems, we aim at discussing a notion of consistency over time. This notion can be informally introduced as follows. At the very first time step $t_0$, the decision maker formulates an optimization…

Optimization and Control · Mathematics 2010-05-21 Pierre Carpentier, Jean-Philippe Chancelier, Guy Cohen, Michel De Lara, Pierre Girardeau

Optimizing Performance of Continuous-Time Stochastic Systems using Timeout Synthesis

We consider a parametric version of fixed-delay continuous-time Markov chains (or equivalently deterministic and stochastic Petri nets, DSPN) where fixed-delay transitions are specified by parameters, rather than concrete values. Our goal is…

Performance · Computer Science 2016-04-18 Tomáš Brázdil, Ľuboš Korenčiak, Jan Krčál, Petr Novotný, Vojtěch Řehák

Stochastic Optimal Control With Dynamic, Time-Consistent Risk Constraints

In this paper we present a dynamic programming approach to stochastic optimal control problems with dynamic, time-consistent risk constraints. Constrained stochastic optimal control problems, which naturally arise when one has to consider…

Optimization and Control · Mathematics 2015-11-24 Yin-Lam Chow, Marco Pavone

Learning Stochastic Parametric Differentiable Predictive Control Policies

The problem of synthesizing stochastic explicit model predictive control policies is known to be quickly intractable even for systems of modest complexity when using classical control-theoretic methods. To address this challenge, we present…

Machine Learning · Computer Science 2022-05-24 Ján Drgoňa, Sayak Mukherjee, Aaron Tuor, Mahantesh Halappanavar, Draguna Vrabie

Online Optimization with Unknown Time-varying Parameters

In this paper, we study optimization problems where the cost function contains time-varying parameters that are unmeasurable and evolve according to linear, yet unknown, dynamics. We propose a solution that leverages control theoretic tools…

Optimization and Control · Mathematics 2025-03-20 Shivanshu Tripathi, Abed AlRahman Al Makdah, Fabio Pasqualetti

Probably Approximately Correct MDP Learning and Control With Temporal Logic Constraints

We consider synthesis of control policies that maximize the probability of satisfying given temporal logic specifications in unknown, stochastic environments. We model the interaction between the system and its environment as a Markov…

Systems and Control · Computer Science 2014-05-01 Jie Fu, Ufuk Topcu

Sequential Selection with Expirations

Motivated by applications where impatience is pervasive and evaluation times are uncertain, we study a selection model where options may expire at an unknown point in time and evaluation times are stochastic. Initially, the decision-maker…

Optimization and Control · Mathematics 2026-02-05 Yihua Xu, Rohan Ghuge, Sebastian Perez-Salazar

Stochastic Comparative Statics in Markov Decision Processes

In multi-period stochastic optimization problems, the future optimal decision is a random variable whose distribution depends on the parameters of the optimization problem. We analyze how the expected value of this random variable changes…

Optimization and Control · Mathematics 2020-01-28 Bar Light

An Incremental Sampling-based Algorithm for Stochastic Optimal Control

In this paper, we consider a class of continuous-time, continuous-space stochastic optimal control problems. Building upon recent advances in Markov chain approximation methods and sampling-based algorithms for deterministic path planning,…

Robotics · Computer Science 2012-02-27 Vu Anh Huynh, Sertac Karaman, Emilio Frazzoli

Linear Programming for Large-Scale Markov Decision Problems

We consider the problem of controlling a Markov decision process (MDP) with a large state space, so as to minimize average cost. Since it is intractable to compete with the optimal policy for large scale problems, we pursue the more modest…

Optimization and Control · Mathematics 2014-02-28 Yasin Abbasi-Yadkori, Peter L. Bartlett, Alan Malek

A Framework for Time-Consistent, Risk-Sensitive Model Predictive Control: Theory and Algorithms

In this paper we present a framework for risk-sensitive model predictive control (MPC) of linear systems affected by stochastic multiplicative uncertainty. Our key innovation is to consider a time-consistent, dynamic risk evaluation of the…

Optimization and Control · Mathematics 2018-04-26 Sumeet Singh, Yin-Lam Chow, Anirudha Majumdar, Marco Pavone

Stochastic Finite State Control of POMDPs with LTL Specifications

Partially observable Markov decision processes (POMDPs) provide a modeling framework for autonomous decision making under uncertainty and imperfect sensing, e.g. robot manipulation and self-driving cars. However, optimal control of POMDPs…

Artificial Intelligence · Computer Science 2020-01-22 Mohamadreza Ahmadi, Rangoli Sharan, Joel W. Burdick

Bayesian Learning of Optimal Policies in Markov Decision Processes with Countably Infinite State-Space

Models of many real-life applications, such as queuing models of communication networks or computing systems, have a countably infinite state-space. Algorithmic and learning procedures that have been developed to produce optimal policies…

Systems and Control · Electrical Eng. & Systems 2024-03-19 Saghar Adler, Vijay Subramanian

MDP Optimal Control under Temporal Logic Constraints

In this paper, we develop a method to automatically generate a control policy for a dynamical system modeled as a Markov Decision Process (MDP). The control specification is given as a Linear Temporal Logic (LTL) formula over a set of…

Robotics · Computer Science 2011-03-24 Xu Chu Ding, Stephen L. Smith, Calin Belta, Daniela Rus