Related papers: Bayesian Bellman Operators

Bayesian Exploration Networks

Bayesian reinforcement learning (RL) offers a principled and elegant approach for sequential decision making under uncertainty. Most notably, Bayesian agents do not face an exploration/exploitation dilemma, a major pathology of frequentist…

Machine Learning · Computer Science 2024-06-26 Mattie Fellows, Brandon Kaplowitz, Christian Schroeder de Witt, Shimon Whiteson

Bayesian Reinforcement Learning: A Survey

Bayesian methods for machine learning have been widely investigated, yielding principled methods for incorporating prior information into inference algorithms. In this survey, we provide an in-depth review of the role of Bayesian methods…

Artificial Intelligence · Computer Science 2016-09-16 Mohammad Ghavamzadeh, Shie Mannor, Joelle Pineau, Aviv Tamar

Inferential Induction: A Novel Framework for Bayesian Reinforcement Learning

Bayesian reinforcement learning (BRL) offers a decision-theoretic solution for reinforcement learning. While "model-based" BRL algorithms have focused either on maintaining a posterior distribution on models or value functions and combining…

Machine Learning · Computer Science 2020-07-03 Hannes Eriksson, Emilio Jorge, Christos Dimitrakakis, Debabrota Basu, Divya Grover

Spectral Bellman Method: Unifying Representation and Exploration in RL

Representation learning is critical to the empirical and theoretical success of reinforcement learning. However, many existing methods are induced from model-learning aspects, misaligning them with the RL task in hand. This work introduces…

Machine Learning · Computer Science 2026-02-03 Ofir Nabati, Bo Dai, Shie Mannor, Guy Tennenholtz

Efficient Exploration of Reward Functions in Inverse Reinforcement Learning via Bayesian Optimization

The problem of inverse reinforcement learning (IRL) is relevant to a variety of tasks including value alignment and robot learning from demonstration. Despite significant algorithmic contributions in recent years, IRL remains an ill-posed…

Machine Learning · Computer Science 2020-11-18 Sreejith Balakrishnan, Quoc Phong Nguyen, Bryan Kian Hsiang Low, Harold Soh

Robust Bayesian Dynamic Programming for On-policy Risk-sensitive Reinforcement Learning

We propose a novel framework for risk-sensitive reinforcement learning (RSRL) that incorporates robustness against transition uncertainty. We define two distinct yet coupled risk measures: an inner risk measure addressing state and cost…

Risk Management · Quantitative Finance 2026-01-01 Shanyu Han, Yangbo He, Yang Liu

BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL

Bayesian optimization (BO) offers an efficient pipeline for optimizing black-box functions with the help of a Gaussian process prior and an acquisition function (AF). Recently, in the context of single-objective BO, learning-based AFs…

Machine Learning · Computer Science 2025-05-30 Yu-Heng Hung, Kai-Jie Lin, Yu-Heng Lin, Chien-Yi Wang, Cheng Sun, Ping-Chun Hsieh

Fully Bayesian Recurrent Neural Networks for Safe Reinforcement Learning

Reinforcement Learning (RL) has demonstrated state-of-the-art results in a number of autonomous system applications; however, many of the underlying algorithms rely on black-box predictions. This results in poor explainability of the…

Machine Learning · Computer Science 2019-11-27 Matt Benatan, Edward O. Pyzer-Knapp

Gradual Transition from Bellman Optimality Operator to Bellman Operator in Online Reinforcement Learning

For continuous action spaces, actor-critic methods are widely used in online reinforcement learning (RL). However, unlike RL algorithms for discrete actions, which generally model the optimal value function using the Bellman optimality…

Machine Learning · Computer Science 2025-08-14 Motoki Omura, Kazuki Ota, Takayuki Osa, Yusuke Mukuta, Tatsuya Harada

MICRO: Model-Based Offline Reinforcement Learning with a Conservative Bellman Operator

Offline reinforcement learning (RL) faces a significant challenge of distribution shift. Model-free offline RL penalizes the Q value for out-of-distribution (OOD) data or constrains the policy to stay close to the behavior policy to tackle this…

Machine Learning · Computer Science 2024-04-18 Xiao-Yin Liu, Xiao-Hu Zhou, Guotao Li, Hao Li, Mei-Jiang Gui, Tian-Yu Xiang, De-Xing Huang, Zeng-Guang Hou

Generalised Linear Models in Deep Bayesian RL with Learnable Basis Functions

Bayesian Reinforcement Learning (BRL), a subclass of Meta-Reinforcement Learning (Meta-RL), provides a principled framework for generalisation by explicitly incorporating Bayesian task parameters into transition and reward models. However,…

Machine Learning · Computer Science 2026-05-11 Jingyang You, Hanna Kurniawati

Reinforcement Learning with Imperfect Transition Predictions: A Bellman-Jensen Approach

Traditional reinforcement learning (RL) assumes the agents make decisions based on Markov decision processes (MDPs) with one-step transition models. In many real-world applications, such as energy management and stock investment, agents can…

Machine Learning · Computer Science 2025-10-22 Chenbei Lu, Zaiwei Chen, Tongxin Li, Chenye Wu, Adam Wierman

Bayesian Inverse Reinforcement Learning for Non-Markovian Rewards

Inverse reinforcement learning (IRL) is the problem of inferring a reward function from expert behavior. There are several approaches to IRL, but most are designed to learn a Markovian reward. However, a reward function might be…

Machine Learning · Computer Science 2024-06-21 Noah Topper, Alvaro Velasquez, George Atia

End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes

Meta-Bayesian optimisation (meta-BO) aims to improve the sample efficiency of Bayesian optimisation by leveraging data from related tasks. While previous methods successfully meta-learn either a surrogate model or an acquisition function…

Machine Learning · Computer Science 2023-12-25 Alexandre Maraval, Matthieu Zimmer, Antoine Grosnit, Haitham Bou Ammar

Scalable Bayesian optimization with high-dimensional outputs using randomized prior networks

Several fundamental problems in science and engineering consist of global optimization tasks involving unknown high-dimensional (black-box) functions that map a set of controllable variables to the outcomes of an expensive experiment.…

Machine Learning · Computer Science 2023-09-15 Mohamed Aziz Bhouri, Michael Joly, Robert Yu, Soumalya Sarkar, Paris Perdikaris

Robust Reinforcement Learning for Continuous Control with Model Misspecification

We provide a framework for incorporating robustness -- to perturbations in the transition dynamics which we refer to as model misspecification -- into continuous control Reinforcement Learning (RL) algorithms. We specifically focus on…

Machine Learning · Computer Science 2020-02-12 Daniel J. Mankowitz, Nir Levine, Rae Jeong, Yuanyuan Shi, Jackie Kay, Abbas Abdolmaleki, Jost Tobias Springenberg, Timothy Mann, Todd Hester, Martin Riedmiller

To bootstrap or to rollout? An optimal and adaptive interpolation

Bootstrapping and rollout are two fundamental principles for value function estimation in reinforcement learning (RL). We introduce a novel class of Bellman operators, called subgraph Bellman operators, that interpolate between…

Machine Learning · Computer Science 2024-12-02 Wenlong Mou, Jian Qian

Bootstrapping Expectiles in Reinforcement Learning

Many classic Reinforcement Learning (RL) algorithms rely on a Bellman operator, which involves an expectation over the next states, leading to the concept of bootstrapping. To introduce a form of pessimism, we propose to replace this…

Machine Learning · Computer Science 2024-06-07 Pierre Clavier, Emmanuel Rachelson, Erwan Le Pennec, Matthieu Geist

A Bayesian Approach to Robust Inverse Reinforcement Learning

We consider a Bayesian approach to offline model-based inverse reinforcement learning (IRL). The proposed framework differs from existing offline model-based IRL approaches by performing simultaneous estimation of the expert's reward…

Machine Learning · Computer Science 2024-04-09 Ran Wei, Siliang Zeng, Chenliang Li, Alfredo Garcia, Anthony McDonald, Mingyi Hong

EARL-BO: Reinforcement Learning for Multi-Step Lookahead, High-Dimensional Bayesian Optimization

To avoid myopic behavior, multi-step lookahead Bayesian optimization (BO) algorithms consider the sequential nature of BO and have demonstrated promising results in recent years. However, owing to the curse of dimensionality, most of these…

Machine Learning · Computer Science 2026-04-24 Mujin Cheon, Jay H. Lee, Dong-Yeun Koh, Calvin Tsay