Related papers: Objective Mismatch in Model-based Reinforcement Le…

A Unified View on Solving Objective Mismatch in Model-Based Reinforcement Learning

Model-based Reinforcement Learning (MBRL) aims to make agents more sample-efficient, adaptive, and explainable by learning an explicit model of the environment. While the capabilities of MBRL agents have significantly improved in recent…

Machine Learning · Computer Science 2024-04-09 Ran Wei, Nathan Lambert, Anthony McDonald, Alfredo Garcia, Roberto Calandra

Mismatched No More: Joint Model-Policy Optimization for Model-Based RL

Many model-based reinforcement learning (RL) methods follow a similar template: fit a model to previously observed data, and then use data from that model for RL or planning. However, models that achieve better training performance (e.g.,…

Machine Learning · Computer Science 2023-02-21 Benjamin Eysenbach, Alexander Khazatsky, Sergey Levine, Ruslan Salakhutdinov

Model Imitation for Model-Based Reinforcement Learning

Model-based reinforcement learning (MBRL) aims to learn a dynamic model to reduce the number of interactions with real-world environments. However, due to estimation error, rollouts in the learned model, especially those of long horizons,…

Machine Learning · Computer Science 2020-03-17 Yueh-Hua Wu, Ting-Han Fan, Peter J. Ramadge, Hao Su

Should Models Be Accurate?

Model-based Reinforcement Learning (MBRL) holds promise for data-efficiency by planning with model-generated experience in addition to learning with experience from the environment. However, in complex or changing environments, models in…

Machine Learning · Computer Science 2022-05-24 Esra'a Saleh, John D. Martin, Anna Koop, Arash Pourzarabi, Michael Bowling

Discriminator Augmented Model-Based Reinforcement Learning

By planning through a learned dynamics model, model-based reinforcement learning (MBRL) offers the prospect of good performance with little environment interaction. However, it is common in practice for the learned model to be inaccurate,…

Machine Learning · Computer Science 2021-03-31 Behzad Haghgoo, Allan Zhou, Archit Sharma, Chelsea Finn

Multi-timestep models for Model-based Reinforcement Learning

In model-based reinforcement learning (MBRL), most algorithms rely on simulating trajectories from one-step dynamics models learned on data. A critical challenge of this approach is the compounding of one-step prediction errors as length of…

Machine Learning · Computer Science 2023-10-12 Abdelhakim Benechehab, Giuseppe Paolo, Albert Thomas, Maurizio Filippone, Balázs Kégl

Investigating Compounding Prediction Errors in Learned Dynamics Models

Accurately predicting the consequences of agents' actions is a key prerequisite for planning in robotic control. Model-based reinforcement learning (MBRL) is one paradigm which relies on the iterative learning and prediction of state-action…

Machine Learning · Computer Science 2022-03-21 Nathan Lambert, Kristofer Pister, Roberto Calandra

Selective Learning: Towards Robust Calibration with Dynamic Regularization

Miscalibration in deep learning refers to a discrepancy between the predicted confidence and performance. This problem usually arises due to the overfitting problem, which is characterized by learning everything presented in the…

Machine Learning · Computer Science 2024-07-16 Zongbo Han, Yifeng Yang, Changqing Zhang, Linjun Zhang, Joey Tianyi Zhou, Qinghua Hu

A Survey on Model-based Reinforcement Learning

Reinforcement learning (RL) solves sequential decision-making problems via a trial-and-error process interacting with the environment. While RL achieves outstanding success in playing complex video games that allow huge trial-and-error,…

Machine Learning · Computer Science 2022-06-22 Fan-Ming Luo, Tian Xu, Hang Lai, Xiong-Hui Chen, Weinan Zhang, Yang Yu

Planning with Exploration: Addressing Dynamics Bottleneck in Model-based Reinforcement Learning

Model-based reinforcement learning (MBRL) is believed to have higher sample efficiency compared with model-free reinforcement learning (MFRL). However, MBRL is plagued by the dynamics bottleneck dilemma. The dynamics bottleneck dilemma is the…

Machine Learning · Computer Science 2021-06-25 Xiyao Wang, Junge Zhang, Wenzhen Huang, Qiyue Yin

A Unified Framework for Alternating Offline Model Training and Policy Learning

In offline model-based reinforcement learning (offline MBRL), we learn a dynamic model from historically collected data, and subsequently utilize the learned model and fixed datasets for policy learning, without further interacting with the…

Machine Learning · Computer Science 2022-10-13 Shentao Yang, Shujian Zhang, Yihao Feng, Mingyuan Zhou

The Alignment Ceiling: Objective Mismatch in Reinforcement Learning from Human Feedback

Reinforcement learning from human feedback (RLHF) has emerged as a powerful technique to make large language models (LLMs) more capable in complex settings. RLHF proceeds as collecting human preference data, training a reward model on said…

Machine Learning · Computer Science 2024-02-05 Nathan Lambert, Roberto Calandra

Beyond Precision: Training-Inference Mismatch is an Optimization Problem and Simple LR Scheduling Fixes It

Reinforcement Learning (RL) for training Large Language Models is notoriously unstable. While recent studies attribute this to "training inference mismatch" stemming from inconsistent hybrid engines, standard remedies, such as Importance…

Machine Learning · Computer Science 2026-02-03 Yaxiang Zhang, Yingru Li, Jiacai Liu, Jiawei Xu, Ziniu Li, Qian Liu, Haoyuan Li

Don't miss the Mismatch: Investigating the Objective Function Mismatch for Unsupervised Representation Learning

Finding general evaluation metrics for unsupervised representation learning techniques is a challenging open research question, which recently has become more and more necessary due to the increasing interest in unsupervised methods. Even…

Computer Vision and Pattern Recognition · Computer Science 2022-03-02 Bonifaz Stuhr, Jürgen Brauer

On the Importance of Hyperparameter Optimization for Model-based Reinforcement Learning

Model-based Reinforcement Learning (MBRL) is a promising framework for learning control in a data-efficient manner. MBRL algorithms can be fairly complex due to the separate dynamics modeling and the subsequent planning algorithm, and as a…

Machine Learning · Computer Science 2021-03-01 Baohe Zhang, Raghu Rajan, Luis Pineda, Nathan Lambert, André Biedenkapp, Kurtland Chua, Frank Hutter, Roberto Calandra

Model-based Lookahead Reinforcement Learning

Model-based Reinforcement Learning (MBRL) allows data-efficient learning which is required in real world applications such as robotics. However, despite the impressive data-efficiency, MBRL does not achieve the final performance of…

Machine Learning · Computer Science 2019-08-19 Zhang-Wei Hong, Joni Pajarinen, Jan Peters

The Virtues of Laziness in Model-based RL: A Unified Objective and Algorithms

We propose a novel approach to addressing two fundamental challenges in Model-based Reinforcement Learning (MBRL): the computational expense of repeatedly finding a good policy in the learned model, and the objective mismatch between model…

Machine Learning · Computer Science 2023-03-02 Anirudh Vemula, Yuda Song, Aarti Singh, J. Andrew Bagnell, Sanjiban Choudhury

Train Hard, Fight Easy: Robust Meta Reinforcement Learning

A major challenge of reinforcement learning (RL) in real-world applications is the variation between environments, tasks or clients. Meta-RL (MRL) addresses this issue by learning a meta-policy that adapts to new tasks. Standard MRL methods…

Machine Learning · Computer Science 2023-10-03 Ido Greenberg, Shie Mannor, Gal Chechik, Eli Meirom

Self-Correcting Models for Model-Based Reinforcement Learning

When an agent cannot represent a perfectly accurate model of its environment's dynamics, model-based reinforcement learning (MBRL) can fail catastrophically. Planning involves composing the predictions of the model; when flawed predictions…

Machine Learning · Computer Science 2017-07-28 Erik Talvitie

Safe and Balanced: A Framework for Constrained Multi-Objective Reinforcement Learning

In numerous reinforcement learning (RL) problems involving safety-critical systems, a key challenge lies in balancing multiple objectives while simultaneously meeting all stringent safety constraints. To tackle this issue, we propose a…

Artificial Intelligence · Computer Science 2024-05-28 Shangding Gu, Bilgehan Sel, Yuhao Ding, Lu Wang, Qingwei Lin, Alois Knoll, Ming Jin