Related papers: ROPO: Robust Preference Optimization for Large Lan…

200 papers

Instruction following (IF) is a critical capability for large language models (LLMs). However, handling complex instructions with multiple constraints remains challenging. Previous methods typically select preference pairs based on the…

Computation and Language · Computer Science · 2025-05-29 · Xiang Huang, Ting-En Lin, Feiteng Fang, Yuchuan Wu, Hangyu Li, Yuzhong Qu, Fei Huang, Yongbin Li

This study addresses the challenge of noise in training datasets for Direct Preference Optimization (DPO), a method for aligning Large Language Models (LLMs) with human preferences. We categorize noise into pointwise noise, which includes…

Machine Learning · Computer Science · 2025-04-21 · Junkang Wu, Yuexiang Xie, Zhengyi Yang, Jiancan Wu, Jiawei Chen, Jinyang Gao, Bolin Ding, Xiang Wang, Xiangnan He

Direct Preference Optimization (DPO) has become a popular method for fine-tuning large language models (LLMs) due to its stability and simplicity. However, it is also known to be sensitive to noise in the data and prone to overfitting.…

Machine Learning · Computer Science · 2025-10-28 · Cheol Woo Kim, Shresth Verma, Mauricio Tec, Milind Tambe

Standard human preference-based alignment methods, such as Reinforcement Learning from Human Feedback (RLHF), are a cornerstone for aligning large language models (LLMs) with human values. However, these methods typically assume that…

Artificial Intelligence · Computer Science · 2026-03-02 · Xiaoyang Cao, Zelai Xu, Mo Guang, Kaiwen Long, Michiel A. Bakker, Yu Wang, Chao Yu

The rapid development of large language model (LLM) alignment algorithms has resulted in a complex and fragmented landscape, with limited clarity on the effectiveness of different methods and their inter-connections. This paper introduces…

Large Language Models (LLMs) have made significant strides in generating human-like responses, largely due to preference alignment techniques. However, these methods often assume unbiased human feedback, which is rarely the case in…

Machine Learning · Computer Science · 2025-09-16 · Amirabbas Afzali, Amirhossein Afsharrad, Seyed Shahabeddin Mousavi, Sanjay Lall

In the field of large language models (LLMs), aligning models with the diverse preferences of users is a critical challenge. Direct Preference Optimization (DPO) has played a key role in this area. It works by using pairs of preferences…

Computation and Language · Computer Science · 2024-05-29 · Yueqin Yin, Zhendong Wang, Yi Gu, Hai Huang, Weizhu Chen, Mingyuan Zhou

Large language models (LLMs) have shown great potential in natural language processing tasks, but their application to machine translation (MT) remains challenging due to pretraining on English-centric data and the complexity of…

Computation and Language · Computer Science · 2025-01-24 · Guofeng Cui, Pichao Wang, Yang Liu, Zemian Ke, Zhu Liu, Vimal Bhat

Optimizing policies based on human preferences is key to aligning language models with human intent. This work focuses on reward modeling, a core component in reinforcement learning from human feedback (RLHF), and offline preference…

Machine Learning · Computer Science · 2025-06-02 · Soichiro Nishimori, Yu-Jie Zhang, Thanawat Lodkaew, Masashi Sugiyama

Direct alignment methods are increasingly used for aligning large language models (LLMs) with human preferences. However, these methods suffer from the issues of verbosity and likelihood displacement, which can be driven by the noisy…

Computation and Language · Computer Science · 2025-10-28 · Peter Chen, Xi Chen, Wotao Yin, Tianyi Lin

While Retrieval-Augmented Generation (RAG) has exhibited promise in utilizing external knowledge, its generation process heavily depends on the quality and accuracy of the retrieved context. Large language models (LLMs) struggle to evaluate…

Computation and Language · Computer Science · 2025-10-13 · Shi-Qi Yan, Quan Liu, Zhen-Hua Ling

Despite the efficacy of Direct Preference Optimization (DPO) in aligning Large Language Models (LLMs), reward hacking remains a pivotal challenge. This issue emerges when LLMs excessively reduce the probability of rejected completions to…

Computation and Language · Computer Science · 2025-08-26 · Chenxu Yang, Ruipeng Jia, Mingyu Zheng, Naibin Gu, Zheng Lin, Siyuan Chen, Weichong Yin, Hua Wu, Weiping Wang

Direct Preference Optimisation (DPO) has emerged as a powerful method for aligning Large Language Models (LLMs) with human preferences, offering a stable and efficient alternative to approaches that use Reinforcement learning via Human…

Artificial Intelligence · Computer Science · 2025-05-06 · Sarvesh Shashidhar, Ritik, Nachiketa Patil, Suraj Racha, Ganesh Ramakrishnan

Aligning large language models (LLMs) with human preferences is critical for real-world deployment, yet existing methods like RLHF face computational and stability challenges. While DPO establishes an offline paradigm with single…

Machine Learning · Computer Science · 2025-10-28 · Junkang Wu, Kexin Huang, Xue Wang, Jinyang Gao, Bolin Ding, Jiancan Wu, Xiangnan He, Xiang Wang

Learning from preference-based feedback has recently gained traction as a promising approach to align language models with human interests. While these aligned generative models have demonstrated impressive capabilities across various…

Machine Learning · Computer Science · 2024-04-15 · Sayak Ray Chowdhury, Anush Kini, Nagarajan Natarajan

Iterative preference optimization has recently become one of the de-facto training paradigms for large language models (LLMs), but the performance is still underwhelming due to too much noisy preference data yielded in the loop. To combat…

Computation and Language · Computer Science · 2024-09-18 · Jianing Wang, Yang Zhou, Xiaocheng Zhang, Mengjiao Bao, Peng Yan

Direct Preference Optimization (DPO) has emerged as a promising approach for aligning large language models with human preferences. While prior work mainly extends DPO from the aspect of the objective function, we instead improve DPO from…

Machine Learning · Computer Science · 2026-02-17 · Xun Deng, Han Zhong, Rui Ai, Fuli Feng, Zheng Wang, Xiangnan He

As large language models (LLMs) advance their capabilities, aligning these models with human preferences has become crucial. Preference optimization, which trains models to distinguish between preferred and non-preferred responses based on…

Machine Learning · Computer Science · 2026-02-02 · Shawn Im, Sharon Li

Large Language Models (LLMs) have become increasingly popular due to their ability to process and generate natural language. However, as they are trained on massive datasets of text, LLMs can inherit harmful biases and produce outputs that…

Computation and Language · Computer Science · 2025-01-23 · Qi Gou, Cam-Tu Nguyen

Preference optimization for diffusion models aims to align them with human preferences for images. Previous methods typically use Vision-Language Models (VLMs) as pixel-level reward models to approximate human preferences. However, when…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-03 · Tao Zhang, Cheng Da, Kun Ding, Huan Yang, Kun Jin, Yan Li, Tingting Gao, Di Zhang, Shiming Xiang, Chunhong Pan