Related papers: APRIL: Active Preference-learning based Reinforcem…

Advances in Preference-based Reinforcement Learning: A Review

Reinforcement Learning (RL) algorithms suffer from the dependency on accurately engineered reward functions to properly guide the learning agents to do the required tasks. Preference-based reinforcement learning (PbRL) addresses that by…

Artificial Intelligence · Computer Science 2024-08-23 Youssef Abdelkareem, Shady Shehata, Fakhri Karray

Provable Reward-Agnostic Preference-Based Reinforcement Learning

Preference-based Reinforcement Learning (PbRL) is a paradigm in which an RL agent learns to optimize a task using pair-wise preference-based feedback over trajectories, rather than explicit reward signals. While PbRL has demonstrated…

Machine Learning · Computer Science 2024-04-18 Wenhao Zhan, Masatoshi Uehara, Wen Sun, Jason D. Lee

Direct Preference-based Policy Optimization without Reward Modeling

Preference-based reinforcement learning (PbRL) is an approach that enables RL agents to learn from preference, which is particularly useful when formulating a reward function is challenging. Existing PbRL methods generally involve a…

Machine Learning · Computer Science 2023-10-30 Gaon An, Junhyeok Lee, Xingdong Zuo, Norio Kosaka, Kyung-Min Kim, Hyun Oh Song

Reinforcement Learning

Reinforcement learning (RL) is a general framework for adaptive control, which has proven to be efficient in many domains, e.g., board games, video games or autonomous vehicles. In such problems, an agent faces a sequential decision-making…

Machine Learning · Computer Science 2020-06-16 Olivier Buffet, Olivier Pietquin, Paul Weng

Introduction to Reinforcement Learning

Reinforcement Learning (RL), a subfield of Artificial Intelligence (AI), focuses on training agents to make decisions by interacting with their environment to maximize cumulative rewards. This paper provides an overview of RL, covering its…

Artificial Intelligence · Computer Science 2024-12-04 Majid Ghasemi, Dariush Ebrahimi

Beyond Human Preferences: Exploring Reinforcement Learning Trajectory Evaluation and Improvement through LLMs

Reinforcement learning (RL) faces challenges in evaluating policy trajectories within intricate game tasks due to the difficulty in designing comprehensive and precise reward functions. This inherent difficulty curtails the broader…

Artificial Intelligence · Computer Science 2024-07-02 Zichao Shen, Tianchen Zhu, Qingyun Sun, Shiqi Gao, Jianxin Li

Reinforcement Learning from Diverse Human Preferences

The complexity of designing reward functions has been a major obstacle to the wide application of deep reinforcement learning (RL) techniques. Describing an agent's desired behaviors and properties can be difficult, even for experts. A new…

Machine Learning · Computer Science 2024-05-09 Wanqi Xue, Bo An, Shuicheng Yan, Zhongwen Xu

Interpretable Preference-based Reinforcement Learning with Tree-Structured Reward Functions

The potential of reinforcement learning (RL) to deliver aligned and performant agents is partially bottlenecked by the reward engineering problem. One alternative to heuristic trial-and-error is preference-based RL (PbRL), where a reward…

Machine Learning · Computer Science 2021-12-22 Tom Bewley, Freddy Lecue

Reinforcement Learning through Active Inference

The central tenet of reinforcement learning (RL) is that agents seek to maximize the sum of cumulative rewards. In contrast, active inference, an emerging framework within cognitive and computational neuroscience, proposes that agents act…

Machine Learning · Computer Science 2020-03-02 Alexander Tschantz, Beren Millidge, Anil K. Seth, Christopher L. Buckley

Inverse Preference Learning: Preference-based RL without a Reward Function

Reward functions are difficult to design and often hard to align with human intent. Preference-based Reinforcement Learning (RL) algorithms address these problems by learning reward functions from human feedback. However, the majority of…

Machine Learning · Computer Science 2023-11-28 Joey Hejna, Dorsa Sadigh

Prior Preference Learning from Experts: Designing a Reward with Active Inference

Active inference may be defined as Bayesian modeling of a brain with a biologically plausible model of the agent. Its primary idea relies on the free energy principle and the prior preference of the agent. An agent will choose an action…

Machine Learning · Computer Science 2021-12-14 Jin young Shin, Cheolhyeong Kim, Hyung Ju Hwang

Information Directed Reward Learning for Reinforcement Learning

For many reinforcement learning (RL) applications, specifying a reward is difficult. This paper considers an RL setting where the agent obtains information about the reward only by querying an expert that can, for example, evaluate…

Machine Learning · Computer Science 2022-02-01 David Lindner, Matteo Turchetta, Sebastian Tschiatschek, Kamil Ciosek, Andreas Krause

Residual Skill Policies: Learning an Adaptable Skill-based Action Space for Reinforcement Learning for Robotics

Skill-based reinforcement learning (RL) has emerged as a promising strategy to leverage prior knowledge for accelerated robot learning. Skills are typically extracted from expert demonstrations and are embedded into a latent space from…

Robotics · Computer Science 2022-11-07 Krishan Rana, Ming Xu, Brendan Tidd, Michael Milford, Niko Sünderhauf

Rethinking Reinforcement Learning for Recommendation: A Prompt Perspective

Modern recommender systems aim to improve user experience. As reinforcement learning (RL) naturally fits this objective -- maximizing a user's reward per session -- it has become an emerging topic in recommender systems. Developing…

Information Retrieval · Computer Science 2022-06-16 Xin Xin, Tiago Pimentel, Alexandros Karatzoglou, Pengjie Ren, Konstantina Christakopoulou, Zhaochun Ren

ELO-Rated Sequence Rewards: Advancing Reinforcement Learning Models

Reinforcement Learning (RL) heavily relies on the careful design of the reward function. However, accurately assigning rewards to each state-action pair in Long-Term Reinforcement Learning (LTRL) tasks remains a significant challenge. As a…

Machine Learning · Computer Science 2025-06-03 Qi Ju, Falin Hei, Zhemei Fang, Yunfeng Luo

Preferences Implicit in the State of the World

Reinforcement learning (RL) agents optimize only the features specified in a reward function and are indifferent to anything left out inadvertently. This means that we must not only specify what to do, but also the much larger space of what…

Machine Learning · Computer Science 2019-04-22 Rohin Shah, Dmitrii Krasheninnikov, Jordan Alexander, Pieter Abbeel, Anca Dragan

Continuous Action Reinforcement Learning from a Mixture of Interpretable Experts

Reinforcement learning (RL) has demonstrated its ability to solve high dimensional tasks by leveraging non-linear function approximators. However, these successes are mostly achieved by 'black-box' policies in simulated domains. When…

Machine Learning · Computer Science 2021-11-19 Riad Akrour, Davide Tateo, Jan Peters

A Short Survey On Memory Based Reinforcement Learning

Reinforcement learning (RL) is a branch of machine learning which is employed to solve various sequential decision making problems without proper supervision. Due to the recent advancement of deep learning, the newly proposed Deep-RL…

Artificial Intelligence · Computer Science 2019-04-17 Dhruv Ramani

Learning the Preferences of a Learning Agent

For AI systems to be useful to humans, they must understand and act in accordance with our values and preferences. Since specifying preferences is a hard task, inverse reinforcement learning (IRL) aims to develop methods that allow for…

Artificial Intelligence · Computer Science 2026-05-12 Karim Abdel Sadek, Mark Bedaywi, Rhys Gould, Stuart Russell

A State Augmentation based approach to Reinforcement Learning from Human Preferences

Reinforcement Learning has suffered from poor reward specification and from reward hacking, even in simple domains. Preference-Based Reinforcement Learning attempts to solve the issue by utilizing binary feedback on queried…

Artificial Intelligence · Computer Science 2023-02-20 Mudit Verma, Subbarao Kambhampati