Related papers: Beyond dynamic programming

Statistical Inference of the Value Function for Reinforcement Learning in Infinite Horizon Settings

Reinforcement learning is a general technique that allows an agent to learn an optimal policy and interact with an environment in sequential decision making problems. The goodness of a policy is measured by its value function starting from…

Machine Learning · Statistics 2025-06-30 C. Shi, S. Zhang, W. Lu, R. Song

Linear programming for finite-horizon vector-valued Markov decision processes

We propose a vector linear programming formulation for a non-stationary, finite-horizon Markov decision process with vector-valued rewards. Pareto efficient policies are shown to correspond to efficient solutions of the linear program, and…

Optimization and Control · Mathematics 2025-06-02 Anas Mifrani, Dominikus Noll

Linear programming approach to optimal impulse control problems with functional constraints

This paper considers an optimal impulse control problem of dynamical systems generated by a flow. The performance criteria are total costs over the infinite time horizon. Apart from the main performance to be minimized, there are multiple…

Optimization and Control · Mathematics 2020-10-27 Alexey Piunovskiy, Yi Zhang

Pausing Policy Learning in Non-stationary Reinforcement Learning

Real-time inference is a challenge of real-world reinforcement learning due to temporal differences in time-varying environments: the system collects data from the past, updates the decision model in the present, and deploys it in the…

Machine Learning · Computer Science 2024-05-28 Hyunin Lee, Ming Jin, Javad Lavaei, Somayeh Sojoudi

Dynamic programming for infinite horizon boundary control problems of PDE's with age structure

We develop the dynamic programming approach for a family of infinite horizon boundary control problems with linear state equation and convex cost. We prove that the value function of the problem is the unique regular solution of the…

Optimization and Control · Mathematics 2008-06-27 Silvia Faggian, Fausto Gozzi

Reinforcement Learning Method for Zero-Sum Linear-Quadratic Stochastic Differential Games in Infinite Horizons

In this work, we propose, for the first time, a reinforcement learning framework specifically designed for zero-sum linear-quadratic stochastic differential games. This approach offers a generalized solution for scenarios in which accurate…

Optimization and Control · Mathematics 2026-02-10 Yiyuan Wang

Online Adaptive Optimal Control Algorithm Based on Synchronous Integral Reinforcement Learning With Explorations

In this paper, we present a novel algorithm named synchronous integral Q-learning, which is based on synchronous policy iteration, to solve the continuous-time infinite horizon optimal control problems of input-affine system dynamics. The…

Systems and Control · Electrical Eng. & Systems 2021-05-20 Lei Guo, Han Zhao

A policy gradient approach for Finite Horizon Constrained Markov Decision Processes

The infinite horizon setting is widely adopted for problems of reinforcement learning (RL). These invariably result in stationary policies that are optimal. In many situations, finite horizon control problems are of interest and for such…

Machine Learning · Computer Science 2025-03-21 Soumyajit Guin, Shalabh Bhatnagar

Score Function Gradient Estimation to Widen the Applicability of Decision-Focused Learning

Many real-world optimization problems contain parameters that are unknown before deployment time, either due to stochasticity or to lack of information (e.g., demand or travel times in delivery problems). A common strategy in such cases is…

Machine Learning · Computer Science 2024-06-18 Mattia Silvestri, Senne Berden, Jayanta Mandi, Ali İrfan Mahmutoğulları, Brandon Amos, Tias Guns, Michele Lombardi

Reinforcement Learning with Random Time Horizons

We extend the standard reinforcement learning framework to random time horizons. While the classical setting typically assumes finite and deterministic or infinite runtimes of trajectories, we argue that multiple real-world applications…

Machine Learning · Computer Science 2025-08-15 Enric Ribera Borrell, Lorenz Richter, Christof Schütte

Policy Learning for Individualized Treatment Regimes on Infinite Time Horizon

With the recent advancements of technology in facilitating real-time monitoring and data collection, "just-in-time" interventions can be delivered via mobile devices to achieve both real-time and long-term management and control.…

Methodology · Statistics 2023-09-26 Wenzhuo Zhou, Yuhan Li, Ruoqing Zhu

A dynamic programming approach to solving constrained linear-quadratic optimal control problems

The solution of a constrained linear-quadratic regulator problem is determined by the set of its optimal active sets. We propose an algorithm that constructs this set of active sets for a desired horizon N from that for horizon N-1. While…

Optimization and Control · Mathematics 2020-09-21 Ruth Mitze, Martin Mönnigmann

Deep Reinforcement Learning amidst Lifelong Non-Stationarity

As humans, our goals and our environment are persistently changing throughout our lifetime based on our experiences, actions, and internal and external drives. In contrast, typical reinforcement learning problem set-ups consider decision…

Machine Learning · Computer Science 2020-06-19 Annie Xie, James Harrison, Chelsea Finn

Deep neural networks algorithms for stochastic control problems on finite horizon: convergence analysis

This paper develops algorithms for high-dimensional stochastic control problems based on deep learning and dynamic programming. Unlike classical approximate dynamic programming approaches, we first approximate the optimal policy by means of…

Probability · Mathematics 2021-09-21 Côme Huré, Huyên Pham, Achref Bachouch, Nicolas Langrené

Generalized Dual Dynamic Programming for Infinite Horizon Problems in Continuous State and Action Spaces

We describe a nonlinear generalization of dual dynamic programming theory and its application to value function estimation for deterministic control problems over continuous state and action spaces, in a discrete-time infinite horizon…

Optimization and Control · Mathematics 2018-10-05 Joseph Warrington, Paul N. Beuchat, John Lygeros

Score-Based Methods for Discrete Optimization in Deep Learning

Discrete optimization problems often arise in deep learning tasks, despite the fact that neural networks typically operate on continuous data. One class of these problems involve objective functions which depend on neural networks, but…

Machine Learning · Computer Science 2023-10-17 Eric Lei, Arman Adibi, Hamed Hassani

Model Based Reinforcement Learning with Final Time Horizon Optimization

We present one of the first algorithms on model based reinforcement learning and trajectory optimization with free final time horizon. Grounded on the optimal control theory and Dynamic Programming, we derive a set of backward differential…

Systems and Control · Computer Science 2015-09-04 Wei Sun, Evangelos Theodorou, Panagiotis Tsiotras

Policy Optimization for Stochastic Shortest Path

Policy optimization is among the most popular and successful reinforcement learning algorithms, and there is increasing interest in understanding its theoretical guarantees. In this work, we initiate the study of policy optimization for the…

Machine Learning · Computer Science 2022-02-08 Liyu Chen, Haipeng Luo, Aviv Rosenberg

Lifelong Policy Gradient Learning of Factored Policies for Faster Training Without Forgetting

Policy gradient methods have shown success in learning control policies for high-dimensional dynamical systems. Their biggest downside is the amount of exploration they require before yielding high-performing policies. In a lifelong…

Machine Learning · Computer Science 2020-10-23 Jorge A. Mendez, Boyu Wang, Eric Eaton

Logic-Skill Programming: An Optimization-based Approach to Sequential Skill Planning

Recent advances in robot skill learning have unlocked the potential to construct task-agnostic skill libraries, facilitating the seamless sequencing of multiple simple manipulation primitives (aka. skills) to tackle significantly more…

Robotics · Computer Science 2024-07-18 Teng Xue, Amirreza Razmjoo, Suhan Shetty, Sylvain Calinon