Related papers: Offline Preference-Based Apprenticeship Learning

200 papers

Learning a reward function from human preferences is challenging as it typically requires having a high-fidelity simulator or using expensive and potentially unsafe actual physical rollouts in the environment. However, in many tasks the…

Machine Learning · Computer Science 2023-01-05 Daniel Shin, Anca D. Dragan, Daniel S. Brown

Applying reinforcement learning (RL) to real-world problems is often made challenging by the inability to interact with the environment and the difficulty of designing reward functions. Offline RL addresses the first challenge by…

Machine Learning · Computer Science 2025-03-03 Alizée Pace, Bernhard Schölkopf, Gunnar Rätsch, Giorgia Ramponi

In preference-based reinforcement learning (PbRL), a reward function is learned from a type of human feedback called preference. To expedite preference collection, recent works have leveraged offline preferences, which are…

Machine Learning · Computer Science 2024-03-18 Guoxi Zhang, Han Bao, Hisashi Kashima

Offline preference-based reinforcement learning (RL), which focuses on optimizing policies using human preferences between pairs of trajectory segments selected from an offline dataset, has emerged as a practical avenue for RL applications.…

Machine Learning · Computer Science 2024-07-08 Chen-Xiao Gao, Shengjun Fang, Chenjun Xiao, Yang Yu, Zongzhang Zhang

Offline Preference-based Reinforcement Learning (PbRL) learns rewards and policies aligned with human preferences without the need for extensive reward engineering and direct interaction with human annotators. However, ensuring safety…

Artificial Intelligence · Computer Science 2025-12-24 Ze Gong, Pradeep Varakantham, Akshat Kumar

This study focuses on the topic of offline preference-based reinforcement learning (PbRL), a variant of conventional reinforcement learning that dispenses with the need for online interaction or specification of reward functions. Instead,…

Machine Learning · Computer Science 2023-06-12 Yachen Kang, Diyuan Shi, Jinxin Liu, Li He, Donglin Wang

In this paper, we investigate the problem of offline Preference-based Reinforcement Learning (PbRL) with human feedback where feedback is available in the form of preference between trajectory pairs rather than explicit rewards. Our…

Machine Learning · Computer Science 2023-10-03 Wenhao Zhan, Masatoshi Uehara, Nathan Kallus, Jason D. Lee, Wen Sun

Conventional reinforcement learning (RL) needs an environment to collect fresh data, which is impractical when online interactions are costly. Offline RL provides an alternative solution by directly learning from the previously collected…

Machine Learning · Computer Science 2023-03-15 Han Zheng, Xufang Luo, Pengfei Wei, Xuan Song, Dongsheng Li, Jing Jiang

Training practical agents usually involves offline and online reinforcement learning (RL) to balance the policy's performance and interaction costs. In particular, online fine-tuning has become a commonly used method to correct the erroneous…

Machine Learning · Computer Science 2023-06-07 Qisen Yang, Shenzhi Wang, Matthieu Gaetan Lin, Shiji Song, Gao Huang

Offline Reinforcement Learning (RL) aims at learning an optimal control from a fixed dataset, without interactions with the system. An agent in this setting should avoid selecting actions whose consequences cannot be predicted from the…

In recent years, significant progress has been made in multi-objective reinforcement learning (RL) research, which aims to balance multiple objectives by incorporating preferences for each objective. In most existing studies, specific…

Machine Learning · Computer Science 2024-09-17 Qian Lin, Zongkai Liu, Danying Mo, Chao Yu

Preference-based reinforcement learning (PbRL) is an approach that enables RL agents to learn from preference, which is particularly useful when formulating a reward function is challenging. Existing PbRL methods generally involve a…

Machine Learning · Computer Science 2023-10-30 Gaon An, Junhyeok Lee, Xingdong Zuo, Norio Kosaka, Kyung-Min Kim, Hyun Oh Song

Offline preference-based reinforcement learning (PbRL) provides an effective way to overcome the challenges of designing reward and the high costs of online interaction. However, since labeling preference needs real-time human feedback,…

Machine Learning · Computer Science 2026-02-10 Xiao-Yin Liu, Guotao Li, Xiao-Hu Zhou, Zeng-Guang Hou

Two central paradigms have emerged in the reinforcement learning (RL) community: online RL and offline RL. In the online RL setting, the agent has no prior knowledge of the environment, and must interact with it in order to find an…

Machine Learning · Computer Science 2023-07-21 Andrew Wagenmaker, Aldo Pacchiano

The complexity of designing reward functions has been a major obstacle to the wide application of deep reinforcement learning (RL) techniques. Describing an agent's desired behaviors and properties can be difficult, even for experts. A new…

Machine Learning · Computer Science 2024-05-09 Wanqi Xue, Bo An, Shuicheng Yan, Zhongwen Xu

Offline reinforcement learning has become one of the most practical RL settings. However, most existing works on offline RL focus on the standard setting with scalar reward feedback. It remains unknown how to universally transfer the…

Machine Learning · Computer Science 2024-10-25 Yinglun Xu, David Zhu, Rohan Gumaste, Gagandeep Singh

Offline reinforcement learning (RL) aims to learn optimal policies from previously collected datasets. Recently, due to their powerful representational capabilities, diffusion models have shown significant potential as policy models for…

Machine Learning · Computer Science 2024-05-30 Tianle Zhang, Jiayi Guan, Lin Zhao, Yihang Li, Dongjiang Li, Zecui Zeng, Lei Sun, Yue Chen, Xuelong Wei, Lusong Li, Xiaodong He

Offline Reinforcement Learning (ORL) enables us to separately study the two interlinked processes of reinforcement learning: collecting informative experience and inferring optimal behaviour. The second step has been widely studied in the…

Preference-based reinforcement learning (PbRL) can help avoid sophisticated reward designs and align better with human intentions, showing great promise in various real-world applications. However, obtaining human feedback for preferences…

Machine Learning · Computer Science 2026-04-06 Yiqin Yang, Hao Hu, Yihuan Mao, Jin Zhang, Chengjie Wu, Yuhua Jiang, Xu Yang, Runpeng Xie, Yi Fan, Bo Liu, Yang Gao, Bo Xu, Chongjie Zhang

Offline Reinforcement Learning (RL) aims to turn large datasets into powerful decision-making engines without any online interactions with the environment. This great promise has motivated a large amount of research that hopes to replicate…

Machine Learning · Computer Science 2020-12-01 Louis Monier, Jakub Kmec, Alexandre Laterre, Thomas Pierrot, Valentin Courgeau, Olivier Sigaud, Karim Beguir