Related papers: Active Preference-Based Gaussian Process Regressio…

Batch Active Preference-Based Learning of Reward Functions

Data generation and labeling are usually an expensive part of learning for robotics. While active learning methods are commonly used to tackle the former problem, preference-based learning is a concept that attempts to solve the latter by…

Machine Learning · Computer Science 2018-10-11 Erdem Bıyık, Dorsa Sadigh

Batch Active Learning of Reward Functions from Human Preferences

Data generation and labeling are often expensive in robot learning. Preference-based learning is a concept that enables reliable labeling by querying users with preference questions. Active querying methods are commonly employed in…

Machine Learning · Computer Science 2024-02-27 Erdem Bıyık, Nima Anari, Dorsa Sadigh

Learning Preferences for Interactive Autonomy

When robots enter everyday human environments, they need to understand their tasks and how they should perform those tasks. To encode these, reward functions, which specify the objective of a robot, are employed. However, designing reward…

Robotics · Computer Science 2022-10-21 Erdem Bıyık

A tutorial on learning from preferences and choices with Gaussian Processes

Preference modelling lies at the intersection of economics, decision theory, machine learning and statistics. By understanding individuals' preferences and how they make choices, we can build products that closely match their expectations,…

Machine Learning · Computer Science 2026-05-19 Alessio Benavoli, Dario Azzimonti

Learning Reward Functions by Integrating Human Demonstrations and Preferences

Our goal is to accurately and efficiently learn reward functions for autonomous robots. Current approaches to this problem include inverse reinforcement learning (IRL), which uses expert demonstrations, and preference-based learning, which…

Robotics · Computer Science 2019-06-24 Malayandi Palan, Nicholas C. Landolfi, Gleb Shevchuk, Dorsa Sadigh

A Generalized Acquisition Function for Preference-based Reward Learning

Preference-based reward learning is a popular technique for teaching robots and autonomous systems how a human user wants them to perform a task. Previous works have shown that actively synthesizing preference queries to maximize…

Robotics · Computer Science 2024-03-12 Evan Ellis, Gaurav R. Ghosal, Stuart J. Russell, Anca Dragan, Erdem Bıyık

Preference-based Learning of Reward Function Features

Preference-based learning of reward functions, where the reward function is learned using comparison data, has been well studied for complex robotic tasks such as autonomous driving. Existing algorithms have focused on learning reward…

Robotics · Computer Science 2021-03-05 Sydney M. Katz, Amir Maleki, Erdem Bıyık, Mykel J. Kochenderfer

Human-guided Robot Behavior Learning: A GAN-assisted Preference-based Reinforcement Learning Approach

Human demonstrations can provide trustful samples to train reinforcement learning algorithms for robots to learn complex behaviors in real-world environments. However, obtaining sufficient demonstrations may be impractical because many…

Robotics · Computer Science 2020-10-16 Huixin Zhan, Feng Tao, Yongcan Cao

Programmatic Reward Design by Example

Reward design is a fundamental problem in reinforcement learning (RL). A misspecified or poorly designed reward can result in low sample efficiency and undesired behaviors. In this paper, we propose the idea of programmatic reward design,…

Machine Learning · Computer Science 2022-01-10 Weichao Zhou, Wenchao Li

Pragmatic Feature Preferences: Learning Reward-Relevant Preferences from Human Input

Humans use social context to specify preferences over behaviors, i.e. their reward functions. Yet, algorithms for inferring reward models from preference data do not take this social learning view into account. Inspired by pragmatic human…

Machine Learning · Computer Science 2024-05-24 Andi Peng, Yuying Sun, Tianmin Shu, David Abel

Generative Adversarial Reward Learning for Generalized Behavior Tendency Inference

Recent advances in reinforcement learning have inspired increasing interest in learning user modeling adaptively through dynamic interactions, e.g., in reinforcement learning based recommender systems. Reward function is crucial for most of…

Machine Learning · Computer Science 2021-05-06 Xiaocong Chen, Lina Yao, Xianzhi Wang, Aixin Sun, Wenjie Zhang, Quan Z. Sheng

Task-Adaptive Robot Learning from Demonstration with Gaussian Process Models under Replication

Learning from Demonstration (LfD) is a paradigm that allows robots to learn complex manipulation tasks that can not be easily scripted, but can be demonstrated by a human teacher. One of the challenges of LfD is to enable robots to acquire…

Robotics · Computer Science 2021-02-08 Miguel Arduengo, Adrià Colomé, Júlia Borràs, Luis Sentis, Carme Torras

From Demonstrations to Rewards: Alignment Without Explicit Human Preferences

One of the challenges of aligning large models with human preferences lies in both the data requirements and the technical complexities of current approaches. Predominant methods, such as RLHF, involve multiple steps, each demanding…

Machine Learning · Computer Science 2025-03-19 Siliang Zeng, Yao Liu, Huzefa Rangwala, George Karypis, Mingyi Hong, Rasool Fakoor

Manifold Gaussian Processes for Regression

Off-the-shelf Gaussian Process (GP) covariance functions encode smoothness assumptions on the structure of the function to be modeled. To model complex and non-differentiable functions, these smoothness assumptions are often too…

Machine Learning · Statistics 2016-04-12 Roberto Calandra, Jan Peters, Carl Edward Rasmussen, Marc Peter Deisenroth

Inverse Preference Learning: Preference-based RL without a Reward Function

Reward functions are difficult to design and often hard to align with human intent. Preference-based Reinforcement Learning (RL) algorithms address these problems by learning reward functions from human feedback. However, the majority of…

Machine Learning · Computer Science 2023-11-28 Joey Hejna, Dorsa Sadigh

Gaussian-Process-based Robot Learning from Demonstration

Endowed with higher levels of autonomy, robots are required to perform increasingly complex manipulation tasks. Learning from demonstration is arising as a promising paradigm for transferring skills to robots. It allows to implicitly learn…

Robotics · Computer Science 2023-02-24 Miguel Arduengo, Adrià Colomé, Joan Lobo-Prat, Luis Sentis, Carme Torras

Capturing Individual Human Preferences with Reward Features

Reinforcement learning from human feedback usually models preferences using a reward function that does not distinguish between people. We argue that this is unlikely to be a good design choice in contexts with high potential for…

Artificial Intelligence · Computer Science 2026-02-20 André Barreto, Vincent Dumoulin, Yiran Mao, Mark Rowland, Nicolas Perez-Nieves, Bobak Shahriari, Yann Dauphin, Doina Precup, Hugo Larochelle

Learning Reward Functions from Diverse Sources of Human Feedback: Optimally Integrating Demonstrations and Preferences

Reward functions are a common way to specify the objective of a robot. As designing reward functions can be extremely challenging, a more promising approach is to directly learn reward functions from human teachers. Importantly, data from…

Robotics · Computer Science 2021-08-05 Erdem Bıyık, Dylan P. Losey, Malayandi Palan, Nicholas C. Landolfi, Gleb Shevchuk, Dorsa Sadigh

Human Preference Modeling Using Visual Motion Prediction Improves Robot Skill Learning from Egocentric Human Video

We present an approach to robot learning from egocentric human videos by modeling human preferences in a reward function and optimizing robot behavior to maximize this reward. Prior work on reward learning from human videos attempts to…

Robotics · Computer Science 2026-02-13 Mrinal Verghese, Christopher G. Atkeson

Active Learning for Manifold Gaussian Process Regression

This paper introduces an active learning framework for manifold Gaussian Process (GP) regression, combining manifold learning with strategic data selection to improve accuracy in high-dimensional spaces. Our method jointly optimizes a…

Machine Learning · Statistics 2026-05-12 Yuanxing Cheng, Lulu Kang, Yiwei Wang, Chun Liu