Related papers: GLISp-r: A preference-based optimization algorithm…

200 papers

Preference-based global optimization algorithms minimize an unknown objective function only based on whether the function is better, worse, or similar for given pairs of candidate optimization vectors. Such optimization problems arise in…

Optimization and Control · Mathematics 2021-12-21 Mengjia Zhu , Dario Piga , Alberto Bemporad

Human-in-the-loop calibration is often addressed via preference-based optimization, where algorithms learn from pairwise comparisons rather than explicit cost evaluations. While effective, methods such as Preferential Bayesian Optimization…

Machine Learning · Computer Science 2025-11-10 Matteo Cercola , Michele Lomuscio , Dario Piga , Simone Formentin

Automating the calibration of the parameters of a control policy by means of global optimization requires quantifying a closed-loop performance function. As this can be impractical in many situations, in this paper we suggest a…

Optimization and Control · Mathematics 2021-05-27 Mengjia Zhu , Alberto Bemporad , Dario Piga

Black-box and preference-based optimization algorithms are global optimization procedures that aim to find the global solutions of an optimization problem using, respectively, the least amount of function evaluations or sample comparisons…

Optimization and Control · Mathematics 2022-02-04 Davide Previtali , Mirko Mazzoleni , Antonio Ferramosca , Fabio Previdi

This paper proposes a method for solving optimization problems in which the decision-maker cannot evaluate the objective function, but rather can only express a preference such as "this is better than that" between two candidate decision…

Machine Learning · Computer Science 2019-10-01 Alberto Bemporad , Dario Piga

Black-box optimization refers to the optimization problem whose objective function and/or constraint sets are either unknown, inaccessible, or non-existent. In many applications, especially with the involvement of humans, the only way to…

In interactive systems, feedback is often provided in the form of preference between queried options rather than precise scores, which motivates optimization methods to learn from such comparisons. In this work, we propose a…

Optimization and Control · Mathematics 2025-12-23 Siyi Wang , Zifan Wang , Karl Henrik Johansson

In this paper, we construct and compare algorithmic approaches to solve the Preference Consistency Problem for preference statements based on hierarchical models. Instances of this problem contain a set of preference statements that are…

Logic in Computer Science · Computer Science 2024-11-01 Anne-Marie George , Nic Wilson , Barry O'Sullivan

We consider learning problems of an intuitive and concise preference model, called lexicographic preference lists (LP-lists). Given a set of examples that are pairwise ordinal preferences over a universe of objects built of attributes of…

Artificial Intelligence · Computer Science 2019-09-20 Ahmed Moussa , Xudong Liu

Preference Inference involves inferring additional user preferences from elicited or observed preferences, based on assumptions regarding the form of the user's preference relation. In this paper we consider a situation in which…

Logic in Computer Science · Computer Science 2024-09-18 Nic Wilson , Anne-Marie George , Barry O'Sullivan

In multi-objective decision planning and learning, much attention is paid to producing optimal solution sets that contain an optimal policy for every possible user preference profile. We argue that the step that follows, i.e., determining…

Machine Learning · Computer Science 2018-02-22 Luisa M Zintgraf , Diederik M Roijers , Sjoerd Linders , Catholijn M Jonker , Ann Nowé

Preference alignment methods are increasingly critical for steering large language models (LLMs) to generate outputs consistent with human values. While recent approaches often rely on synthetic data generated by LLMs for scalability and…

Computation and Language · Computer Science 2025-10-21 Mingye Zhu , Yi Liu , Zheren Fu , Yongdong Zhang , Zhendong Mao

Preference modelling lies at the intersection of economics, decision theory, machine learning and statistics. By understanding individuals' preferences and how they make choices, we can build products that closely match their expectations,…

Machine Learning · Computer Science 2026-05-19 Alessio Benavoli , Dario Azzimonti

Offline preference optimization is a key method for enhancing and controlling the quality of Large Language Model (LLM) outputs. Typically, preference optimization is approached as an offline supervised learning task using manually-crafted…

Machine Learning · Computer Science 2024-11-05 Chris Lu , Samuel Holt , Claudio Fanconi , Alex J. Chan , Jakob Foerster , Mihaela van der Schaar , Robert Tjarko Lange

The class of direct preference optimization (DPO) algorithms has emerged as a promising approach for solving the alignment problem in foundation models. These algorithms work with very limited feedback in the form of pairwise preferences…

Machine Learning · Computer Science 2026-02-03 Luca Viano , Ruida Zhou , Yifan Sun , Mahdi Namazifar , Volkan Cevher , Shoham Sabach , Mohammad Ghavamzadeh

Optimization problems involving mixed variables (i.e., variables of numerical and categorical nature) can be challenging to solve, especially in the presence of mixed-variable constraints. Moreover, when the objective function is the result…

Optimization and Control · Mathematics 2024-12-12 Mengjia Zhu , Alberto Bemporad

Reinforcement Learning (RL) has emerged as a powerful tool for neural combinatorial optimization, enabling models to learn heuristics that solve complex problems without requiring expert knowledge. Despite significant progress, existing RL…

Machine Learning · Computer Science 2025-05-14 Mingjun Pan , Guanquan Lin , You-Wei Luo , Bin Zhu , Zhien Dai , Lijun Sun , Chun Yuan

It is challenging to quantify numerical preferences for different objectives in a multi-objective decision-making problem. However, the demonstrations of a user are often accessible. We propose an algorithm to infer linear preference…

Artificial Intelligence · Computer Science 2023-04-28 Junlin Lu

Preference alignment is pivotal for empowering large language models (LLMs) to generate helpful and harmless responses. However, the performance of preference alignment is highly sensitive to the prevalent noise in the preference data.…

Machine Learning · Computer Science 2024-05-29 Xize Liang , Chao Chen , Shuang Qiu , Jie Wang , Yue Wu , Zhihang Fu , Zhihao Shi , Feng Wu , Jieping Ye

In this note, we examine the aggregation of preferences achieved by the Group Policy Optimisation (GRPO) algorithm, a reinforcement learning method used to train advanced artificial intelligence models such as DeepSeek-R1-Zero and…

Machine Learning · Computer Science 2025-03-14 Milan Vojnovic , Se-Young Yun