English
Related papers

Related papers: Multi-Objective Preference Optimization: Improving…

200 papers

Multi-objective preference alignment of large language models (LLMs) is critical for developing AI systems that are more configurable, personalizable, helpful, and safe. However, optimizing model outputs to satisfy diverse objectives with…

Computation and Language · Computer Science 2025-03-04 Raghav Gupta , Ryan Sullivan , Yunxuan Li , Samrat Phatale , Abhinav Rastogi

Multi-Objective Alignment (MOA) aims to align LLMs' responses with multiple human preference objectives, with Direct Preference Optimization (DPO) emerging as a prominent approach. However, we find that DPO-based MOA approaches suffer from…

Machine Learning · Computer Science 2025-12-09 Moxin Li , Yuantao Zhang , Wenjie Wang , Wentao Shi , Zhuo Liu , Fuli Feng , Tat-Seng Chua

A single language model, even when aligned with labelers through reinforcement learning from human feedback (RLHF), may not suit all human preferences. Recent approaches therefore prefer customization, gathering multi-dimensional feedback,…

Machine Learning · Computer Science 2024-08-20 Zhanhui Zhou , Jie Liu , Jing Shao , Xiangyu Yue , Chao Yang , Wanli Ouyang , Yu Qiao

Reinforcement Learning from Human Feedback (RLHF) has shown promise in aligning large language models (LLMs). Yet its reliance on a singular reward model often overlooks the diversity of human preferences. Recent approaches address this…

Computation and Language · Computer Science 2025-07-23 Tianze Wang , Dongnan Gui , Yifan Hu , Shuhang Lin , Linjun Zhang

Hyperparameter optimization (HPO) is important to leverage the full potential of machine learning (ML). In practice, users are often interested in multi-objective (MO) problems, i.e., optimizing potentially conflicting objectives, like…

Machine Learning · Computer Science 2024-01-12 Joseph Giovanelli , Alexander Tornede , Tanja Tornede , Marius Lindauer

Reinforcement Learning from Human Feedback (RLHF) has emerged as a powerful technique for aligning large language models (LLMs) with human preferences. However, effectively aligning LLMs with diverse human preferences remains a significant…

Computation and Language · Computer Science 2025-07-03 Chengao Li , Hanyu Zhang , Yunkun Xu , Hongyan Xue , Xiang Ao , Qing He

The alignment of large language models with human values presents a critical challenge, particularly when balancing conflicting objectives like helpfulness and harmlessness. Existing approaches, such as Reinforcement Learning from Human…

Computation and Language · Computer Science 2025-03-04 Yuxuan Liu

Aligning Large Language Models (LLMs) with human feedback is crucial for their development. Existing preference optimization methods such as DPO and KTO, while improved based on Reinforcement Learning from Human Feedback (RLHF), are…

Computation and Language · Computer Science 2024-12-23 Shuo Xie , Fangzhi Zhu , Jiahui Wang , Lulu Wen , Wei Dai , Xiaowei Chen , Junxiong Zhu , Kai Zhou , Bo Zheng

Aligning large language models (LLMs) with human preferences is critical for enhancing LLMs' safety, helpfulness, humor, faithfulness, etc. Current reinforcement learning from human feedback (RLHF) mainly focuses on a fixed reward learned…

Machine Learning · Computer Science 2026-04-07 Qiang He , Yucheng Yang , Tianyi Zhou , Meng Fang , Mykola Pechenizkiy , Setareh Maghsudi

Large Language Models (LLMs) have become increasingly popular due to their ability to process and generate natural language. However, as they are trained on massive datasets of text, LLMs can inherit harmful biases and produce outputs that…

Computation and Language · Computer Science 2025-01-23 Qi Gou , Cam-Tu Nguyen

As the era of large language models (LLMs) unfolds, Preference Optimization (PO) methods have become a central approach to aligning LLMs with human preferences and improving performance. We propose Maximum a Posteriori Preference…

For aligning large language models (LLMs), prior work has leveraged reinforcement learning via human feedback (RLHF) or variations of direct preference optimization (DPO). While DPO offers a simpler framework based on maximum likelihood…

Artificial Intelligence · Computer Science 2025-05-27 Anirudhan Badrinath , Prabhat Agarwal , Jiajing Xu

How can Large Language Models (LLMs) be aligned with human intentions and values? A typical solution is to gather human preference on model outputs and finetune the LLMs accordingly while ensuring that updates do not deviate too far from a…

Computation and Language · Computer Science 2024-05-28 Hung Le , Quan Tran , Dung Nguyen , Kien Do , Saloni Mittal , Kelechi Ogueji , Svetha Venkatesh

Alignment in artificial intelligence pursues the consistency between model responses and human preferences as well as values. In practice, the multifaceted nature of human preferences inadvertently introduces what is known as the "alignment…

Computation and Language · Computer Science 2024-10-14 Yiju Guo , Ganqu Cui , Lifan Yuan , Ning Ding , Zexu Sun , Bowen Sun , Huimin Chen , Ruobing Xie , Jie Zhou , Yankai Lin , Zhiyuan Liu , Maosong Sun

Large Vision-Language Models (LVLMs) hold immense potential for complex multimodal instruction following, yet their development is often hindered by the high cost and inconsistency of human annotation required for effective fine-tuning and…

Computation and Language · Computer Science 2025-08-19 Ruirui Gao , Emily Johnson , Bowen Tan , Yanfei Qian

Multi-objective alignment from human feedback (MOAHF) in large language models (LLMs) is a challenging problem as human preferences are complex, multifaceted, and often conflicting. Recent works on MOAHF considered a-priori multi-objective…

Machine Learning · Computer Science 2024-12-10 Subhojyoti Mukherjee , Anusha Lalitha , Sailik Sengupta , Aniket Deshmukh , Branislav Kveton

The rapid development of large language model (LLM) alignment algorithms has resulted in a complex and fragmented landscape, with limited clarity on the effectiveness of different methods and their inter-connections. This paper introduces…

Reinforcement Learning from Human Feedback (RLHF) has emerged as a pivotal tool for aligning large language models (LLMs) with human preferences. Direct Preference Optimization (DPO), one of the most popular approaches, formulates RLHF as a…

Machine Learning · Computer Science 2024-10-10 Jiafan He , Huizhuo Yuan , Quanquan Gu

3D Mixed Reality interfaces have nearly unlimited space for layout placement, making automatic UI adaptation crucial for enhancing the user experience. Such adaptation is often formulated as a multi-objective optimization (MOO) problem,…

Human-Computer Interaction · Computer Science 2025-09-24 Yao Song , Christoph Gebhardt , Yi-Chi Liao , Christian Holz

Large language models (LLMs) are increasingly deployed in real-world applications that require careful balancing of multiple, often conflicting, objectives, such as informativeness versus conciseness, or helpfulness versus creativity.…

Machine Learning · Computer Science 2025-08-12 Qiang He , Setareh Maghsudi
‹ Prev 1 2 3 10 Next ›