English
Related papers

Related papers: Meta-Model-Based Meta-Policy Optimization

200 papers

We introduce Massively Multi-Task Model-Based Policy Optimization (M3PO), a scalable model-based reinforcement learning (MBRL) framework designed to address sample inefficiency in single-task settings and poor generalization in multi-task…

Machine Learning · Computer Science 2025-06-30 Aditya Narendra , Dmitry Makarov , Aleksandr Panov

Model-based reinforcement learning approaches carry the promise of being data efficient. However, due to challenges in learning dynamics models that sufficiently match the real-world dynamics, they struggle to achieve the same asymptotic…

Machine Learning · Computer Science 2018-09-17 Ignasi Clavera , Jonas Rothfuss , John Schulman , Yasuhiro Fujita , Tamim Asfour , Pieter Abbeel

Existing offline reinforcement learning (RL) methods face a few major challenges, particularly the distributional shift between the learned policy and the behavior policy. Offline Meta-RL is emerging as a promising approach to address these…

Machine Learning · Computer Science 2022-07-14 Sen Lin , Jialin Wan , Tengyu Xu , Yingbin Liang , Junshan Zhang

Model-based reinforcement learning (RL) is considered to be a promising approach to reduce the sample complexity that hinders model-free RL. However, the theoretical understanding of such methods has been rather limited. This paper…

Machine Learning · Computer Science 2021-02-16 Yuping Luo , Huazhe Xu , Yuanzhi Li , Yuandong Tian , Trevor Darrell , Tengyu Ma

Model-based Reinforcement Learning (MBRL) is a promising framework for learning control in a data-efficient manner. MBRL algorithms can be fairly complex due to the separate dynamics modeling and the subsequent planning algorithm, and as a…

Machine Learning · Computer Science 2021-03-01 Baohe Zhang , Raghu Rajan , Luis Pineda , Nathan Lambert , André Biedenkapp , Kurtland Chua , Frank Hutter , Roberto Calandra

Meta-reinforcement learning (meta-RL) is a promising framework for tackling challenging domains requiring efficient exploration. Existing meta-RL algorithms are characterized by low sample efficiency, and mostly focus on low-dimensional…

Machine Learning · Computer Science 2024-03-18 Zohar Rimon , Tom Jurgenson , Orr Krupnik , Gilad Adler , Aviv Tamar

Offline reinforcement learning (RL) refers to the problem of learning policies entirely from a large batch of previously collected data. This problem setting offers the promise of utilizing such datasets to acquire policies without any…

Machine Learning · Computer Science 2020-11-24 Tianhe Yu , Garrett Thomas , Lantao Yu , Stefano Ermon , James Zou , Sergey Levine , Chelsea Finn , Tengyu Ma

Meta reinforcement learning (Meta-RL) is an approach wherein the experience gained from solving a variety of tasks is distilled into a meta-policy. The meta-policy, when adapted over only a small (or just a single) number of steps, is able…

Machine Learning · Computer Science 2022-09-28 Desik Rengarajan , Sapana Chaudhary , Jaewon Kim , Dileep Kalathil , Srinivas Shakkottai

Deep reinforcement learning (RL) excels in various control tasks, yet the absence of safety guarantees hampers its real-world applicability. In particular, explorations during learning usually results in safety violations, while the RL…

Robotics · Computer Science 2025-06-04 Yifan Sun , Feihan Li , Weiye Zhao , Rui Chen , Tianhao Wei , Changliu Liu

Meta-reinforcement learning (Meta-RL) has attracted attention due to its capability to enhance reinforcement learning (RL) algorithms, in terms of data efficiency and generalizability. In this paper, we develop a bilevel optimization…

Machine Learning · Computer Science 2024-10-15 Siyuan Xu , Minghui Zhu

Existing studies on constrained reinforcement learning (RL) may obtain a well-performing policy in the training environment. However, when deployed in a real environment, it may easily violate constraints that were originally satisfied…

Machine Learning · Computer Science 2024-05-06 Zhongchang Sun , Sihong He , Fei Miao , Shaofeng Zou

Meta reinforcement learning (RL) allows agents to leverage experience across a distribution of tasks on which the agent can train at will, enabling faster learning of optimal policies on new test tasks. Despite its success in improving…

Machine Learning · Computer Science 2026-05-27 Tingting Ni , Maryam Kamgarpour

Reward-based alignment methods for large language models (LLMs) face two key limitations: vulnerability to reward hacking, where models exploit flaws in the reward signal; and reliance on brittle, labor-intensive prompt engineering when…

Computation and Language · Computer Science 2025-05-20 Zae Myung Kim , Chanwoo Park , Vipul Raheja , Suin Kim , Dongyeop Kang

Hyperparameter optimization (HPO) plays a critical role in improving model performance. Transformer-based HPO methods have shown great potential; however, existing approaches rely heavily on large-scale historical optimization trajectories…

Machine Learning · Computer Science 2025-09-23 Haoxin Guo , Jiawen Pan , Weixin Zhai

Reinforcement Learning (RL) applications in real-world scenarios must prioritize safety and reliability, which impose strict constraints on agent behavior. Model-based RL leverages predictive world models for action planning and policy…

Artificial Intelligence · Computer Science 2025-06-06 Artem Latyshev , Gregory Gorbov , Aleksandr I. Panov

The goal of robust constrained reinforcement learning (RL) is to optimize an agent's performance under the worst-case model uncertainty while satisfying safety or resource constraints. In this paper, we demonstrate that strong duality does…

Machine Learning · Computer Science 2025-09-23 Shaocong Ma , Ziyi Chen , Yi Zhou , Heng Huang

Model-free reinforcement learning (RL) methods are succeeding in a growing number of tasks, aided by recent advances in deep learning. However, they tend to suffer from high sample complexity, which hinders their use in real-world domains.…

Machine Learning · Computer Science 2018-10-08 Thanard Kurutach , Ignasi Clavera , Yan Duan , Aviv Tamar , Pieter Abbeel

Designing and analyzing model-based RL (MBRL) algorithms with guaranteed monotonic improvement has been challenging, mainly due to the interdependence between policy optimization and model learning. Existing discrepancy bounds generally…

Machine Learning · Computer Science 2023-11-09 Tianying Ji , Yu Luo , Fuchun Sun , Mingxuan Jing , Fengxiang He , Wenbing Huang

Tremendous progress has been made in reinforcement learning (RL) over the past decade. Most of these advancements came through the continual development of new algorithms, which were designed using a combination of mathematical derivations,…

Machine Learning · Computer Science 2022-10-14 Chris Lu , Jakub Grudzien Kuba , Alistair Letcher , Luke Metz , Christian Schroeder de Witt , Jakob Foerster

Modern meta-reinforcement learning (Meta-RL) methods are mainly developed based on model-agnostic meta-learning, which performs policy gradient steps across tasks to maximize policy performance. However, the gradient conflict problem is…

Artificial Intelligence · Computer Science 2022-09-22 Haozhi Wang , Qing Wang , Yunfeng Shao , Dong Li , Jianye Hao , Yinchuan Li
‹ Prev 1 2 3 10 Next ›