Related papers: Multi-Objective Preference Optimization: Improving…

Robust Multi-Objective Preference Alignment with Online DPO

Multi-objective preference alignment of large language models (LLMs) is critical for developing AI systems that are more configurable, personalizable, helpful, and safe. However, optimizing model outputs to satisfy diverse objectives with…

Computation and Language · Computer Science 2025-03-04 Raghav Gupta, Ryan Sullivan, Yunxuan Li, Samrat Phatale, Abhinav Rastogi

Self-Improvement Towards Pareto Optimality: Mitigating Preference Conflicts in Multi-Objective Alignment

Multi-Objective Alignment (MOA) aims to align LLMs' responses with multiple human preference objectives, with Direct Preference Optimization (DPO) emerging as a prominent approach. However, we find that DPO-based MOA approaches suffer from…

Machine Learning · Computer Science 2025-12-09 Moxin Li, Yuantao Zhang, Wenjie Wang, Wentao Shi, Zhuo Liu, Fuli Feng, Tat-Seng Chua

Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization

A single language model, even when aligned with labelers through reinforcement learning from human feedback (RLHF), may not suit all human preferences. Recent approaches therefore prefer customization, gathering multi-dimensional feedback,…

Machine Learning · Computer Science 2024-08-20 Zhanhui Zhou, Jie Liu, Jing Shao, Xiangyu Yue, Chao Yang, Wanli Ouyang, Yu Qiao

MPO: An Efficient Post-Processing Framework for Mixing Diverse Preference Alignment

Reinforcement Learning from Human Feedback (RLHF) has shown promise in aligning large language models (LLMs). Yet its reliance on a singular reward model often overlooks the diversity of human preferences. Recent approaches address this…

Computation and Language · Computer Science 2025-07-23 Tianze Wang, Dongnan Gui, Yifan Hu, Shuhang Lin, Linjun Zhang

Interactive Hyperparameter Optimization in Multi-Objective Problems via Preference Learning

Hyperparameter optimization (HPO) is important to leverage the full potential of machine learning (ML). In practice, users are often interested in multi-objective (MO) problems, i.e., optimizing potentially conflicting objectives, like…

Machine Learning · Computer Science 2024-01-12 Joseph Giovanelli, Alexander Tornede, Tanja Tornede, Marius Lindauer

Gradient-Adaptive Policy Optimization: Towards Multi-Objective Alignment of Large Language Models

Reinforcement Learning from Human Feedback (RLHF) has emerged as a powerful technique for aligning large language models (LLMs) with human preferences. However, effectively aligning LLMs with diverse human preferences remains a significant…

Computation and Language · Computer Science 2025-07-03 Chengao Li, Hanyu Zhang, Yunkun Xu, Hongyan Xue, Xiang Ao, Qing He

PEO: Improving Bi-Factorial Preference Alignment with Post-Training Policy Extrapolation

The alignment of large language models with human values presents a critical challenge, particularly when balancing conflicting objectives like helpfulness and harmlessness. Existing approaches, such as Reinforcement Learning from Human…

Computation and Language · Computer Science 2025-03-04 Yuxuan Liu

MPPO: Multi Pair-wise Preference Optimization for LLMs with Arbitrary Negative Samples

Aligning Large Language Models (LLMs) with human feedback is crucial for their development. Existing preference optimization methods such as DPO and KTO, while improved based on Reinforcement Learning from Human Feedback (RLHF), are…

Computation and Language · Computer Science 2024-12-23 Shuo Xie, Fangzhi Zhu, Jiahui Wang, Lulu Wen, Wei Dai, Xiaowei Chen, Junxiong Zhu, Kai Zhou, Bo Zheng

One Model for All: Multi-Objective Controllable Language Models

Aligning large language models (LLMs) with human preferences is critical for enhancing LLMs' safety, helpfulness, humor, faithfulness, etc. Current reinforcement learning from human feedback (RLHF) mainly focuses on a fixed reward learned…

Machine Learning · Computer Science 2026-04-07 Qiang He, Yucheng Yang, Tianyi Zhou, Meng Fang, Mykola Pechenizkiy, Setareh Maghsudi

Mixed Preference Optimization: Reinforcement Learning with Data Selection and Better Reference Model

Large Language Models (LLMs) have become increasingly popular due to their ability to process and generate natural language. However, as they are trained on massive datasets of text, LLMs can inherit harmful biases and produce outputs that…

Computation and Language · Computer Science 2025-01-23 Qi Gou, Cam-Tu Nguyen

MaPPO: Maximum a Posteriori Preference Optimization with Prior Knowledge

As the era of large language models (LLMs) unfolds, Preference Optimization (PO) methods have become a central approach to aligning LLMs with human preferences and improving performance. We propose Maximum a Posteriori Preference…

Machine Learning · Computer Science 2026-05-11 Guangchen Lan, Sipeng Zhang, Tianle Wang, Yuwei Zhang, Daoan Zhang, Xinpeng Wei, Xiaoman Pan, Hongming Zhang, Dong-Jun Han, Christopher G. Brinton

Unified Preference Optimization: Language Model Alignment Beyond the Preference Frontier

For aligning large language models (LLMs), prior work has leveraged reinforcement learning via human feedback (RLHF) or variations of direct preference optimization (DPO). While DPO offers a simpler framework based on maximum likelihood…

Artificial Intelligence · Computer Science 2025-05-27 Anirudhan Badrinath, Prabhat Agarwal, Jiajing Xu

Multi-Reference Preference Optimization for Large Language Models

How can Large Language Models (LLMs) be aligned with human intentions and values? A typical solution is to gather human preference on model outputs and finetune the LLMs accordingly while ensuring that updates do not deviate too far from a…

Computation and Language · Computer Science 2024-05-28 Hung Le, Quan Tran, Dung Nguyen, Kien Do, Saloni Mittal, Kelechi Ogueji, Svetha Venkatesh

Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment

Alignment in artificial intelligence pursues the consistency between model responses and human preferences as well as values. In practice, the multifaceted nature of human preferences inadvertently introduces what is known as the "alignment…

Computation and Language · Computer Science 2024-10-14 Yiju Guo, Ganqu Cui, Lifan Yuan, Ning Ding, Zexu Sun, Bowen Sun, Huimin Chen, Ruobing Xie, Jie Zhou, Yankai Lin, Zhiyuan Liu, Maosong Sun

M3PO: Multimodal-Model-Guided Preference Optimization for Visual Instruction Following

Large Vision-Language Models (LVLMs) hold immense potential for complex multimodal instruction following, yet their development is often hindered by the high cost and inconsistency of human annotation required for effective fine-tuning and…

Computation and Language · Computer Science 2025-08-19 Ruirui Gao, Emily Johnson, Bowen Tan, Yanfei Qian

Multi-Objective Alignment of Large Language Models Through Hypervolume Maximization

Multi-objective alignment from human feedback (MOAHF) in large language models (LLMs) is a challenging problem as human preferences are complex, multifaceted, and often conflicting. Recent works on MOAHF considered a-priori multi-objective…

Machine Learning · Computer Science 2024-12-10 Subhojyoti Mukherjee, Anusha Lalitha, Sailik Sengupta, Aniket Deshmukh, Branislav Kveton

Reward-aware Preference Optimization: A Unified Mathematical Framework for Model Alignment

The rapid development of large language model (LLM) alignment algorithms has resulted in a complex and fragmented landscape, with limited clarity on the effectiveness of different methods and their inter-connections. This paper introduces…

Machine Learning · Computer Science 2025-02-11 Shengyang Sun, Yian Zhang, Alexander Bukharin, David Mosallanezhad, Jiaqi Zeng, Soumye Singhal, Gerald Shen, Adithya Renduchintala, Tugrul Konuk, Yi Dong, Zhilin Wang, Dmitry Chichkov, Olivier Delalleau, Oleksii Kuchaiev

Accelerated Preference Optimization for Large Language Model Alignment

Reinforcement Learning from Human Feedback (RLHF) has emerged as a pivotal tool for aligning large language models (LLMs) with human preferences. Direct Preference Optimization (DPO), one of the most popular approaches, formulates RLHF as a…

Machine Learning · Computer Science 2024-10-10 Jiafan He, Huizhuo Yuan, Quanquan Gu

Preference-Guided Multi-Objective UI Adaptation

3D Mixed Reality interfaces have nearly unlimited space for layout placement, making automatic UI adaptation crucial for enhancing the user experience. Such adaptation is often formulated as a multi-objective optimization (MOO) problem,…

Human-Computer Interaction · Computer Science 2025-09-24 Yao Song, Christoph Gebhardt, Yi-Chi Liao, Christian Holz

Pareto Multi-Objective Alignment for Language Models

Large language models (LLMs) are increasingly deployed in real-world applications that require careful balancing of multiple, often conflicting, objectives, such as informativeness versus conciseness, or helpfulness versus creativity.…

Machine Learning · Computer Science 2025-08-12 Qiang He, Setareh Maghsudi