Related papers: Revised Progressive-Hedging-Algorithm Based Two-la…

An Adaptive Sampling-based Progressive Hedging Algorithm for Stochastic Programming

The progressive hedging algorithm (PHA) is a cornerstone among algorithms for large-scale stochastic programming problems. However, its traditional implementation is hindered by some limitations, including the requirement to solve all…

Optimization and Control · Mathematics · 2025-03-13 · Di Zhang, Yihang Zhang, Suvrajeet Sen

A Hierarchical Two-tier Approach to Hyper-parameter Optimization in Reinforcement Learning

Optimization of hyper-parameters in reinforcement learning (RL) algorithms is a key task, because they determine how the agent will learn its policy by interacting with its environment, and thus what data is gathered. In this work, an…

Machine Learning · Computer Science · 2019-09-19 · Juan Cruz Barsce, Jorge A. Palombarini, Ernesto Martínez

Dual Control for Approximate Bayesian Reinforcement Learning

Control of non-episodic, finite-horizon dynamical systems with uncertain dynamics poses a tough and elementary case of the exploration-exploitation trade-off. Bayesian reinforcement learning, reasoning about the effect of actions and future…

Machine Learning · Statistics · 2016-08-12 · Edgar D. Klenske, Philipp Hennig

Robust Reinforcement Learning for Risk-Sensitive Linear Quadratic Gaussian Control

This paper proposes a novel robust reinforcement learning framework for discrete-time linear systems with model mismatch that may arise from the sim-to-real gap. A key strategy is to invoke advanced techniques from control theory. Using the…

Systems and Control · Electrical Eng. & Systems · 2023-12-07 · Leilei Cui, Tamer Başar, Zhong-Ping Jiang

Fast Two-Time-Scale Stochastic Gradient Method with Applications in Reinforcement Learning

Two-time-scale optimization is a framework introduced in Zeng et al. (2024) that abstracts a range of policy evaluation and policy optimization problems in reinforcement learning (RL). Akin to bi-level optimization under a particular type…

Optimization and Control · Mathematics · 2026-01-21 · Sihan Zeng, Thinh T. Doan

A Greedy Approximation of Bayesian Reinforcement Learning with Probably Optimistic Transition Model

Bayesian Reinforcement Learning (RL) is capable of not only incorporating domain knowledge, but also solving the exploration-exploitation dilemma in a natural way. As Bayesian RL is intractable except for special cases, previous work has…

Artificial Intelligence · Computer Science · 2013-06-14 · Kenji Kawaguchi, Mauricio Araya

A Two-Timescale Primal-Dual Framework for Reinforcement Learning via Online Dual Variable Guidance

We study reinforcement learning by combining recent advances in regularized linear programming formulations with the classical theory of stochastic approximation. Motivated by the challenge of designing algorithms that leverage off-policy…

Optimization and Control · Mathematics · 2026-04-15 · Axel Friedrich Wolter, Tobias Sutter

Randomized Progressive Hedging methods for Multi-stage Stochastic Programming

Progressive Hedging is a popular decomposition algorithm for solving multi-stage stochastic optimization problems. A computational bottleneck of this algorithm is that all scenario subproblems have to be solved at each iteration. In this…

Distributed, Parallel, and Cluster Computing · Computer Science · 2020-09-28 · Gilles Bareilles, Yassine Laguel, Dmitry Grishchenko, Franck Iutzeler, Jérôme Malick

Reinforcement Learning with Unbiased Policy Evaluation and Linear Function Approximation

We provide performance guarantees for a variant of simulation-based policy iteration for controlling Markov decision processes that involves the use of stochastic approximation algorithms along with state-of-the-art techniques that are…

Machine Learning · Computer Science · 2022-10-17 · Anna Winnicki, R. Srikant

Hybrid Reinforcement Learning Framework for Mixed-Variable Problems

Optimization problems characterized by both discrete and continuous variables are common across various disciplines, presenting unique challenges due to their complex solution landscapes and the difficulty of navigating mixed-variable…

Optimization and Control · Mathematics · 2024-06-03 · Haoyan Zhai, Qianli Hu, Jiangning Chen

Deep Reinforcement Learning via L-BFGS Optimization

Reinforcement Learning (RL) algorithms allow artificial agents to improve their action selections so as to increase rewarding experiences in their environments. Deep Reinforcement Learning algorithms require solving a nonconvex and…

Machine Learning · Computer Science · 2019-04-18 · Jacob Rafati, Roummel F. Marcia

Strong Convergence of Relaxed Inertial Inexact Progressive Hedging Algorithm for Multi-stage Stochastic Variational Inequality Problems

A Halpern-type relaxed inertial inexact progressive hedging algorithm (PHA) is proposed for solving multi-stage stochastic variational inequalities in general probability spaces. The subproblems in this algorithm are allowed to be…

Optimization and Control · Mathematics · 2024-12-10 · Jiaxin Chen, Zunjie Huang, Haisen Zhang

Inverse Reinforcement Learning with Gaussian Process

We present new algorithms for inverse reinforcement learning (IRL, or inverse optimal control) in convex optimization settings. We argue that finite-space IRL can be posed as a convex quadratic program under a Bayesian inference framework…

Machine Learning · Computer Science · 2013-01-22 · Qifeng Qiao, Peter A. Beling

Hysteresis-Based RL: Robustifying Reinforcement Learning-based Control Policies via Hybrid Control

Reinforcement learning (RL) is a promising approach for deriving control policies for complex systems. As we show in two control problems, the derived policies from using the Proximal Policy Optimization (PPO) and Deep Q-Network (DQN)…

Machine Learning · Computer Science · 2022-04-05 · Jan de Priester, Ricardo G. Sanfelice, Nathan van de Wouw

Model-free Reinforcement Learning for $H_{2}/H_{\infty}$ Control of Stochastic Discrete-time Systems

This paper proposes a reinforcement learning (RL) algorithm for infinite horizon $H_{2}/H_{\infty}$ problem in a class of stochastic discrete-time systems, rather than using a set of coupled generalized algebraic Riccati equations…

Optimization and Control · Mathematics · 2023-11-28 · Xiushan Jiang, Li Wang, Dongya Zhao, Ling Shi

Bayesian Reinforcement Learning via Deep, Sparse Sampling

We address the problem of Bayesian reinforcement learning using efficient model-based online planning. We propose an optimism-free Bayes-adaptive algorithm to induce deeper and sparser exploration with a theoretical bound on its performance…

Machine Learning · Computer Science · 2020-06-30 · Divya Grover, Debabrota Basu, Christos Dimitrakakis

Annealing Optimization for Progressive Learning with Stochastic Approximation

In this work, we introduce a learning model designed to meet the needs of applications in which computational resources are limited, and robustness and interpretability are prioritized. Learning problems can be formulated as constrained…

Systems and Control · Electrical Eng. & Systems · 2025-09-26 · Christos Mavridis, John Baras

Reinforcement Learning for Discrete-time LQG Mean Field Social Control Problems with Unknown Dynamics

This paper studies the discrete-time linear-quadratic-Gaussian mean field (MF) social control problem in an infinite horizon, where the dynamics of all agents are unknown. The objective is to design a reinforcement learning (RL) algorithm…

Optimization and Control · Mathematics · 2025-12-05 · Hanfang Zhang, Bing-Chang Wang, Shuo Chen

Stochastic Linear Quadratic Optimal Control Problem: A Reinforcement Learning Method

This paper applies a reinforcement learning (RL) method to solve infinite horizon continuous-time stochastic linear quadratic problems, where drift and diffusion terms in the dynamics may depend on both the state and control. Based on…

Optimization and Control · Mathematics · 2021-09-17 · Na Li, Xun Li, Jing Peng, Zuo Quan Xu

Neural-Progressive Hedging: Enforcing Constraints in Reinforcement Learning with Stochastic Programming

We propose a framework, called neural-progressive hedging (NP), that leverages stochastic programming during the online phase of executing a reinforcement learning (RL) policy. The goal is to ensure feasibility with respect to constraints…

Machine Learning · Computer Science · 2022-03-01 · Supriyo Ghosh, Laura Wynter, Shiau Hong Lim, Duc Thien Nguyen