Related papers: Optimizing Algorithms From Pairwise User Preferenc…

POLAR: Preference Optimization and Learning Algorithms for Robotics

Parameter tuning for robotic systems is a time-consuming and challenging task that often relies on domain expertise of the human operator. Moreover, existing learning methods are not well suited for parameter tuning for many reasons…

Robotics · Computer Science 2022-08-10 Maegan Tucker, Kejun Li, Yisong Yue, Aaron D. Ames

Improving User Experience in Preference-Based Optimization of Reward Functions for Assistive Robots

Assistive robots interact with humans and must adapt to different users' preferences to be effective. An easy and effective technique to learn non-expert users' preferences is through rankings of robot behaviors, for example, robot movement…

Robotics · Computer Science 2024-11-19 Nathaniel Dennler, Zhonghao Shi, Stefanos Nikolaidis, Maja Matarić

Social Robot Navigation through Constrained Optimization: a Comparative Study of Uncertainty-based Objectives and Constraints

This work is dedicated to the study of how uncertainty estimation of human motion prediction can be embedded into constrained optimization techniques, such as Model Predictive Control (MPC), for social robot navigation. We propose…

Robotics · Computer Science 2023-07-19 Timur Akhtyamov, Aleksandr Kashirin, Aleksey Postnikov, Gonzalo Ferrer

Learning Submodular Objectives for Team Environmental Monitoring

In this paper, we study the well-known team orienteering problem where a fleet of robots collects rewards by visiting locations. Usually, the rewards are assumed to be known to the robots; however, in applications such as environmental…

Robotics · Computer Science 2021-12-16 Nils Wilde, Armin Sadeghi, Stephen L. Smith

Improving through Interaction: Searching Behavioral Representation Spaces with CMA-ES-IG

Robots that interact with humans must adapt to individual users' preferences to operate effectively in human-centered environments. An intuitive and effective technique to learn non-expert users' preferences is through rankings of robot…

Robotics · Computer Science 2026-03-11 Nathaniel Dennler, Zhonghao Shi, Yiran Tao, Andreea Bobu, Stefanos Nikolaidis, Maja Matarić

Simulation-Aided Policy Tuning for Black-Box Robot Learning

How can robots learn and adapt to new tasks and situations with little data? Systematic exploration and simulation are crucial tools for efficient robot learning. We present a novel black-box policy search algorithm focused on…

Robotics · Computer Science 2025-02-11 Shiming He, Alexander von Rohr, Dominik Baumann, Ji Xiang, Sebastian Trimpe

Preference Completion: Large-scale Collaborative Ranking from Pairwise Comparisons

In this paper we consider the collaborative ranking setting: a pool of users each provides a small number of pairwise preferences between $d$ possible items; from these we need to predict preferences of the users for items they have not yet…

Machine Learning · Statistics 2015-07-17 Dohyung Park, Joe Neeman, Jin Zhang, Sujay Sanghavi, Inderjit S. Dhillon

Incorporating Human Path Preferences in Robot Navigation with Minimal Interventions

Robots that can effectively understand human intentions from actions are crucial for successful human-robot collaboration. In this work, we address the challenge of a robot navigating towards an unknown goal while also accounting for a…

Robotics · Computer Science 2023-03-17 Oriana Peltzer, Dylan M. Asmar, Mac Schwager, Mykel J. Kochenderfer

Pragmatic Feature Preferences: Learning Reward-Relevant Preferences from Human Input

Humans use social context to specify preferences over behaviors, i.e. their reward functions. Yet, algorithms for inferring reward models from preference data do not take this social learning view into account. Inspired by pragmatic human…

Machine Learning · Computer Science 2024-05-24 Andi Peng, Yuying Sun, Tianmin Shu, David Abel

Pareto-Optimal Learning from Preferences with Hidden Context

Ensuring AI models align with human values is essential for their safety and functionality. Reinforcement learning from human feedback (RLHF) leverages human preferences to achieve this alignment. However, when preferences are sourced from…

Machine Learning · Computer Science 2025-02-10 Ryan Bahlous-Boldi, Li Ding, Lee Spector, Scott Niekum

Ordered Preference Elicitation Strategies for Supporting Multi-Objective Decision Making

In multi-objective decision planning and learning, much attention is paid to producing optimal solution sets that contain an optimal policy for every possible user preference profile. We argue that the step that follows, i.e., determining…

Machine Learning · Computer Science 2018-02-22 Luisa M Zintgraf, Diederik M Roijers, Sjoerd Linders, Catholijn M Jonker, Ann Nowé

Comparing Covariate Prioritization via Matching to Machine Learning Methods for Causal Inference using Five Empirical Applications

When investigators seek to estimate causal effects, they often assume that selection into treatment is based only on observed covariates. Under this identification strategy, analysts must adjust for observed confounders. While basic…

Applications · Statistics 2019-01-09 Luke Keele, Dylan Small

Dimensionality Reduction and Prioritized Exploration for Policy Search

Black-box policy optimization is a class of reinforcement learning algorithms that explores and updates the policies at the parameter level. This class of algorithms is widely applied in robotics with movement primitives or…

Machine Learning · Computer Science 2022-03-22 Marius Memmel, Puze Liu, Davide Tateo, Jan Peters

Direct Preference Optimization with Rating Information: Practical Algorithms and Provable Gains

The class of direct preference optimization (DPO) algorithms has emerged as a promising approach for solving the alignment problem in foundation models. These algorithms work with very limited feedback in the form of pairwise preferences…

Machine Learning · Computer Science 2026-02-03 Luca Viano, Ruida Zhou, Yifan Sun, Mahdi Namazifar, Volkan Cevher, Shoham Sabach, Mohammad Ghavamzadeh

CatCMA: Stochastic Optimization for Mixed-Category Problems

Black-box optimization problems often require simultaneously optimizing different types of variables, such as continuous, integer, and categorical variables. Unlike integer variables, categorical variables do not necessarily have a…

Neural and Evolutionary Computing · Computer Science 2025-05-22 Ryoki Hamano, Shota Saito, Masahiro Nomura, Kento Uchida, Shinichi Shirakawa

Robot Design With Neural Networks, MILP Solvers and Active Learning

Central to the design of many robot systems and their controllers is solving a constrained blackbox optimization problem. This paper presents CNMA, a new method of solving this problem that is conservative in the number of potentially…

Artificial Intelligence · Computer Science 2021-02-09 Sanjai Narain, Emily Mak, Dana Chee, Todd Huster, Jeremy Cohen, Kishore Pochiraju, Brendan Englot, Niraj K. Jha, Karthik Narayan

Unified Learning from Demonstrations, Corrections, and Preferences during Physical Human-Robot Interaction

Humans can leverage physical interaction to teach robot arms. This physical interaction takes multiple forms depending on the task, the user, and what the robot has learned so far. State-of-the-art approaches focus on learning from a single…

Robotics · Computer Science 2024-01-11 Shaunak A. Mehta, Dylan P. Losey

Collaboratively Learning Preferences from Ordinal Data

In applications such as recommendation systems and revenue management, it is important to predict preferences on items that have not been seen by a user or predict outcomes of comparisons among those that have never been compared. A popular…

Machine Learning · Computer Science 2015-06-29 Sewoong Oh, Kiran K. Thekumparampil, Jiaming Xu

Projective Preferential Bayesian Optimization

Bayesian optimization is an effective method for finding extrema of a black-box function. We propose a new type of Bayesian optimization for learning user preferences in high-dimensional spaces. The central assumption is that the underlying…

Machine Learning · Statistics 2020-08-17 Petrus Mikkola, Milica Todorović, Jari Järvi, Patrick Rinke, Samuel Kaski

Regret-Aware Black-Box Optimization with Natural Gradients, Trust-Regions and Entropy Control

Most successful stochastic black-box optimizers, such as CMA-ES, use rankings of the individual samples to obtain a new search distribution. Yet, the use of rankings also introduces several issues such as the underlying optimization…

Machine Learning · Statistics 2022-06-14 Maximilian Hüttenrauch, Gerhard Neumann