Related papers: Sequential decision problems, dependent types and …
In control theory, to solve a finite-horizon sequential decision problem (SDP) commonly means to find a list of decision rules that result in an optimal expected total reward (or cost) when taking a given number of decision steps. SDPs are…
Graded monads refine traditional monads using effect annotations in order to describe quantitatively the computational effects that a program can generate. They have been successfully applied to a variety of formal systems for reasoning…
In this paper, we consider discrete-time infinite horizon problems of optimal control to a terminal set of states. These are the problems that are often taken as the starting point for adaptive dynamic programming. Under very general…
Adaptive optimal control of nonlinear dynamic systems with deterministic and known dynamics under a known undiscounted infinite-horizon cost function is investigated. Policy iteration scheme initiated using a stabilizing initial control is…
We introduce a unified probabilistic framework for solving sequential decision making problems ranging from Bayesian optimisation to contextual bandits and reinforcement learning. This is accomplished by a probabilistic model-based approach…
Contextual online decision-making problems with constraints appear in a wide range of real-world applications, such as adaptive experimental design under safety constraints, personalized recommendation with resource limits, and dynamic…
One can perform equational reasoning about computational effects with a purely functional programming language thanks to monads. Even though equational reasoning for effectful programs is desirable, it is not yet mainstream. This is partly…
We study finite-horizon continuous-time policy evaluation from discrete closed-loop trajectories under time-inhomogeneous dynamics. The target value surface solves a backward parabolic equation, but the Bellman baseline obtained from…
Many forms of dependence manifest themselves over time, with behavior of variables in dynamical systems as a paradigmatic example. This paper studies temporal dependence in dynamical systems from a logical perspective, by enriching a…
We propose finitely convergent methods for solving convex feasibility problems defined over a possibly infinite pool of constraints. Following other works in this area, we assume that the interior of the solution set is nonempty and that…
We describe a method for time-critical decision making involving sequential tasks and stochastic processes. The method employs several iterative refinement routines for solving different aspects of the decision making problem. This paper…
Computer models, aiming at simulating a complex real system, are often calibrated in the light of data to improve performance. Standard calibration methods assume that the optimal values of calibration parameters are invariant to the model…
Recently developed pretrained models can encode rich world knowledge expressed in multiple modalities, such as text and images. However, the outputs of these models cannot be integrated into algorithms to solve sequential decision-making…
We consider both discrete and continuous "uncertain horizon" deterministic control processes, for which the termination time is a random variable. We examine the dynamic programming equations for the value function of such processes,…
In this note, we consider infinite horizon optimal control problems with deterministic systems. Since exact solutions to these problems are often intractable, we propose a parallel model predictive control (MPC) method that provides an…
Automated synthesis of correct-by-construction controllers for autonomous systems is crucial for their deployment in safety-critical scenarios. Such autonomous systems are naturally modeled as stochastic dynamical models. The general…
We propose a machine learning algorithm for solving finite-horizon stochastic control problems based on a deep neural network representation of the optimal policy functions. The algorithm has three features: (1) It can solve…
We study methods for solving stochastic control problems of systems of forward-backward mean-field equations with delay, in finite or infinite horizon. Necessary and sufficient maximum principles under partial information are given. The…
We describe a nonlinear generalization of dual dynamic programming theory and its application to value function estimation for deterministic control problems over continuous state and action spaces, in a discrete-time infinite horizon…
In this paper we develop a unified approach for solving a wide class of sequential selection problems. This class includes, but is not limited to, selection problems with no-information, rank-dependent rewards, and considers both fixed as…