Related papers: Online Learning with Preference Feedback

200 papers

Robot policies need to adapt to human preferences and/or new environments. Human experts may have the domain knowledge required to help robots achieve this adaptation. However, existing works often require costly offline re-training on…

Machine Learning · Computer Science 2023-02-28 Vivek Myers, Erdem Bıyık, Dorsa Sadigh

We consider the problem of learning preferences over trajectories for mobile manipulators such as personal robots and assembly line robots. The preferences we learn are more intricate than simple geometric constraints on trajectories; they…

Robotics · Computer Science 2016-01-06 Ashesh Jain, Shikhar Sharma, Thorsten Joachims, Ashutosh Saxena

We consider interactive tools that help users search for their most preferred item in a large collection of options. In particular, we examine example-critiquing, a technique for enabling users to incrementally construct preference models…

Artificial Intelligence · Computer Science 2011-10-04 B. Faltings, P. Pu, P. Viappiani

In this paper, we propose a novel ranking framework for collaborative filtering with the overall aim of learning user preferences over items by minimizing a pairwise ranking loss. We show the minimization problem involves dependent random…

We consider the problem of learning good trajectories for manipulation tasks. This is challenging because the criterion defining a good trajectory varies with users, tasks and environments. In this paper, we propose a co-active online…

Robotics · Computer Science 2015-01-30 Ashesh Jain, Brian Wojcik, Thorsten Joachims, Ashutosh Saxena

A reciprocal recommendation problem is one where the goal of learning is not just to predict a user's preference towards a passive item (e.g., a book), but to recommend the targeted user on one side another user from the other side such…

Machine Learning · Computer Science 2018-06-05 Fabio Vitale, Nikos Parotsidis, Claudio Gentile

Negative user preference is an important context that is not sufficiently utilized by many existing recommender systems. This context is especially useful in scenarios where the cost of negative items is high for the users. In this work, we…

Information Retrieval · Computer Science 2021-02-19 Bibek Paudel, Sandro Luck, Abraham Bernstein

Recent preference learning frameworks for large language models (LLMs) simplify human preferences with binary pairwise comparisons and scalar rewards. This simplification could make LLMs' responses biased to mostly preferred features, and…

Machine Learning · Computer Science 2025-06-16 Dongyoung Kim, Jinsung Yoon, Jinwoo Shin, Jaehyung Kim

Learning a reward function from human preferences is challenging as it typically requires having a high-fidelity simulator or using expensive and potentially unsafe actual physical rollouts in the environment. However, in many tasks the…

Machine Learning · Computer Science 2023-01-05 Daniel Shin, Anca D. Dragan, Daniel S. Brown

Learning a reward function from human preferences is challenging as it typically requires having a high-fidelity simulator or using expensive and potentially unsafe actual physical rollouts in the environment. However, in many tasks the…

Machine Learning · Computer Science 2022-02-18 Daniel Shin, Daniel S. Brown, Anca D. Dragan

When faced with complex choices, users refine their own preference criteria as they explore the catalogue of options. In this paper we propose an approach to preference elicitation suited for this scenario. We extend Coactive Learning,…

Artificial Intelligence · Computer Science 2016-12-07 Stefano Teso, Paolo Dragone, Andrea Passerini

Today's robots are increasingly interacting with people and need to efficiently learn inexperienced users' preferences. A common framework is to iteratively query the user about which of two presented robot trajectories they prefer. While…

Robotics · Computer Science 2021-10-04 Nils Wilde, Erdem Bıyık, Dorsa Sadigh, Stephen L. Smith

Learning of preference models from human feedback has been central to recent advances in artificial intelligence. Motivated by the cost of obtaining high-quality human annotations, we study efficient human preference elicitation for…

Machine Learning · Computer Science 2026-02-17 Subhojyoti Mukherjee, Anusha Lalitha, Kousha Kalantari, Aniket Deshmukh, Ge Liu, Yifei Ma, Branislav Kveton

For summarization, human preference is critical to tame outputs of the summarizer in favor of human interests, as ground-truth summaries are scarce and ambiguous. Practical settings require dynamic exchanges between human and AI agent…

Artificial Intelligence · Computer Science 2022-05-13 Duy-Hung Nguyen, Nguyen Viet Dung Nghiem, Bao-Sinh Nguyen, Dung Tien Le, Shahab Sabahi, Minh-Tien Nguyen, Hung Le

In preference-based reinforcement learning (PbRL), a reward function is learned from a type of human feedback called preference. To expedite preference collection, recent works have leveraged offline preferences, which are…

Machine Learning · Computer Science 2024-03-18 Guoxi Zhang, Han Bao, Hisashi Kashima

We consider two settings of online learning to rank where feedback is restricted to top-ranked items. The problem is cast as an online game between a learner and a sequence of users, over $T$ rounds. In both settings, the learner's objective…

Machine Learning · Computer Science 2016-08-24 Sougata Chaudhuri, Ambuj Tewari

We introduce a new model for online ranking in which the click probability factors into an examination and attractiveness function and the attractiveness function is a linear function of a feature vector and an unknown parameter. Only…

Machine Learning · Statistics 2019-05-28 Shuai Li, Tor Lattimore, Csaba Szepesvári

Preference elicitation explicitly asks users what kind of recommendations they would like to receive. It is a popular technique for conversational recommender systems to deal with cold-starts. Previous work has studied selection bias in…

Information Retrieval · Computer Science 2024-05-02 Shashank Gupta, Harrie Oosterhuis, Maarten de Rijke

We study reinforcement learning from human feedback in general Markov decision processes, where agents learn from trajectory-level preference comparisons. A central challenge in this setting is to design algorithms that select informative…

Machine Learning · Computer Science 2025-12-05 Andreas Schlaginhaufen, Reda Ouhamma, Maryam Kamgarpour

We consider the problem of learning from revealed preferences in an online setting. In our framework, each period a consumer buys an optimal bundle of goods from a merchant according to her (linear) utility function and current prices,…

Data Structures and Algorithms · Computer Science 2014-12-02 Kareem Amin, Rachel Cummings, Lili Dworkin, Michael Kearns, Aaron Roth