Related papers: Control-Oriented Model-Based Reinforcement Learnin…

Outcome-Driven Reinforcement Learning via Variational Inference

While reinforcement learning algorithms provide automated acquisition of optimal policies, practical application of such methods requires a number of design decisions, such as manually designing reward functions that not only define the…

Machine Learning · Computer Science · 2022-12-29 · Tim G. J. Rudner, Vitchyr H. Pong, Rowan McAllister, Yarin Gal, Sergey Levine

Maximum Reward Formulation In Reinforcement Learning

Reinforcement learning (RL) algorithms typically deal with maximizing the expected cumulative return (discounted or undiscounted, finite or infinite horizon). However, several crucial applications in the real world, such as drug discovery,…

Machine Learning · Computer Science · 2023-12-20 · Sai Krishna Gottipati, Yashaswi Pathak, Rohan Nuttall, Sahir, Raviteja Chunduru, Ahmed Touati, Sriram Ganapathi Subramanian, Matthew E. Taylor, Sarath Chandar

Quantifying the Sensitivity of Inverse Reinforcement Learning to Misspecification

Inverse reinforcement learning (IRL) aims to infer an agent's preferences (represented as a reward function $R$) from their behaviour (represented as a policy $\pi$). To do this, we need a behavioural model of how $\pi$ relates to $R$. In…

Machine Learning · Computer Science · 2024-03-12 · Joar Skalse, Alessandro Abate

Active Learning for Control-Oriented Identification of Nonlinear Systems

Model-based reinforcement learning is an effective approach for controlling an unknown system. It is based on a longstanding pipeline familiar to the control community in which one performs experiments on the environment to collect a…

Systems and Control · Electrical Eng. & Systems · 2024-08-14 · Bruce D. Lee, Ingvar Ziemann, George J. Pappas, Nikolai Matni

A Unified Bellman Optimality Principle Combining Reward Maximization and Empowerment

Empowerment is an information-theoretic method that can be used to intrinsically motivate learning agents. It attempts to maximize an agent's control over the environment by encouraging visiting states with a large number of reachable next…

Machine Learning · Computer Science · 2020-01-09 · Felix Leibfried, Sergio Pascual-Diaz, Jordi Grau-Moya

Direct Uncertainty Estimation in Reinforcement Learning

Optimal probabilistic approach in reinforcement learning is computationally infeasible. Its simplification consisting in neglecting difference between true environment and its model estimated using limited number of observations causes…

Artificial Intelligence · Computer Science · 2013-06-26 · Sergey Rodionov, Alexey Potapov, Yurii Vinogradov

Model predictive control-based value estimation for efficient reinforcement learning

Reinforcement learning suffers from limitations in real practices primarily due to the number of required interactions with virtual environments. It results in a challenging problem because we are implausible to obtain a local optimal…

Machine Learning · Computer Science · 2024-10-28 · Qizhen Wu, Kexin Liu, Lei Chen

Robust Reinforcement Learning for Continuous Control with Model Misspecification

We provide a framework for incorporating robustness -- to perturbations in the transition dynamics which we refer to as model misspecification -- into continuous control Reinforcement Learning (RL) algorithms. We specifically focus on…

Machine Learning · Computer Science · 2020-02-12 · Daniel J. Mankowitz, Nir Levine, Rae Jeong, Yuanyuan Shi, Jackie Kay, Abbas Abdolmaleki, Jost Tobias Springenberg, Timothy Mann, Todd Hester, Martin Riedmiller

Reinforcement Learning under Model Mismatch

We study reinforcement learning under model misspecification, where we do not have access to the true environment but only to a reasonably close approximation to it. We address this problem by extending the framework of robust MDPs to the…

Machine Learning · Computer Science · 2017-11-10 · Aurko Roy, Huan Xu, Sebastian Pokutta

Misspecification in Inverse Reinforcement Learning

The aim of Inverse Reinforcement Learning (IRL) is to infer a reward function $R$ from a policy $\pi$. To do this, we need a model of how $\pi$ relates to $R$. In the current literature, the most common models are optimality, Boltzmann…

Machine Learning · Computer Science · 2023-03-27 · Joar Skalse, Alessandro Abate

Model-Based Reinforcement Learning Under Confounding

We investigate model-based reinforcement learning in contextual Markov decision processes (C-MDPs) in which the context is unobserved and induces confounding in the offline dataset. In such settings, conventional model-learning methods are…

Machine Learning · Computer Science · 2025-12-09 · Nishanth Venkatesh, Andreas A. Malikopoulos

Inverse Reinforcement Learning via Matching of Optimality Profiles

The goal of inverse reinforcement learning (IRL) is to infer a reward function that explains the behavior of an agent performing a task. The assumption that most approaches make is that the demonstrated behavior is near-optimal. In many…

Machine Learning · Computer Science · 2020-11-20 · Luis Haug, Ivan Ovinnikov, Eugene Bykovets

Model-Based Uncertainty in Value Functions

We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning. In particular, we focus on characterizing the variance over values induced by a distribution over MDPs. Previous work…

Machine Learning · Computer Science · 2023-03-08 · Carlos E. Luis, Alessandro G. Bottero, Julia Vinogradska, Felix Berkenkamp, Jan Peters

Model-free Reinforcement Learning for Model-based Control: Towards Safe, Interpretable and Sample-efficient Agents

Training sophisticated agents for optimal decision-making under uncertainty has been key to the rapid development of modern autonomous systems across fields. Notably, model-free reinforcement learning (RL) has enabled decision-making agents…

Machine Learning · Computer Science · 2025-07-21 · Thomas Banker, Ali Mesbah

Confounding Robust Continuous Control via Automatic Reward Shaping

Reward shaping has been applied widely to accelerate Reinforcement Learning (RL) agents' training. However, a principled way of designing effective reward shaping functions, especially for complex continuous control problems, remains…

Machine Learning · Computer Science · 2026-02-12 · Mateo Juliani, Mingxuan Li, Elias Bareinboim

Implicit Constraint-Aware Off-Policy Correction for Offline Reinforcement Learning

Offline reinforcement learning promises policy improvement from logged interaction data alone, yet state-of-the-art algorithms remain vulnerable to value over-estimation and to violations of domain knowledge such as monotonicity or…

Systems and Control · Electrical Eng. & Systems · 2025-06-18 · Ali Baheri

Meta-Inverse Reinforcement Learning with Probabilistic Context Variables

Providing a suitable reward function to reinforcement learning can be difficult in many real world applications. While inverse reinforcement learning (IRL) holds promise for automatically learning reward functions from demonstrations,…

Machine Learning · Computer Science · 2019-10-29 · Lantao Yu, Tianhe Yu, Chelsea Finn, Stefano Ermon

Model-based Policy Optimization with Unsupervised Model Adaptation

Model-based reinforcement learning methods learn a dynamics model with real data sampled from the environment and leverage it to generate simulated data to derive an agent. However, due to the potential distribution mismatch between…

Machine Learning · Computer Science · 2020-10-29 · Jian Shen, Han Zhao, Weinan Zhang, Yong Yu

MBMF: Model-Based Priors for Model-Free Reinforcement Learning

Reinforcement Learning is divided in two main paradigms: model-free and model-based. Each of these two paradigms has strengths and limitations, and has been successfully applied to real world domains that are appropriate to its…

Machine Learning · Computer Science · 2017-10-19 · Somil Bansal, Roberto Calandra, Kurtland Chua, Sergey Levine, Claire Tomlin

A reinforced learning approach to optimal design under model uncertainty

Optimal designs are usually model-dependent and likely to be sub-optimal if the postulated model is not correctly specified. In practice, it is common that a researcher has a list of candidate models at hand and a design has to be found…

Statistics Theory · Mathematics · 2023-03-29 · Mingyao Ai, Holger Dette, Zhengfu Liu, Jun Yu