Related papers: PHOENIX: Open-Source Language Adaption for Direct …

BiasDPO: Mitigating Bias in Language Models through Direct Preference Optimization

Large Language Models (LLMs) have become pivotal in advancing natural language processing, yet their potential to perpetuate biases poses significant concerns. This paper introduces a new framework employing Direct Preference Optimization…

Computation and Language · Computer Science 2024-07-22 Ahmed Allam

A Comprehensive Survey of Direct Preference Optimization: Datasets, Theories, Variants, and Applications

With the rapid advancement of large language models (LLMs), aligning policy models with human preferences has become increasingly critical. Direct Preference Optimization (DPO) has emerged as a promising approach for alignment, acting as an…

Artificial Intelligence · Computer Science 2025-07-15 Wenyi Xiao , Zechuan Wang , Leilei Gan , Shuai Zhao , Zongrui Li , Ruirui Lei , Wanggui He , Luu Anh Tuan , Long Chen , Hao Jiang , Zhou Zhao , Fei Wu

A Survey of Direct Preference Optimization

Large Language Models (LLMs) have demonstrated unprecedented generative capabilities, yet their alignment with human values remains critical for ensuring helpful and harmless deployments. While Reinforcement Learning from Human Feedback…

Machine Learning · Computer Science 2025-03-18 Shunyu Liu , Wenkai Fang , Zetian Hu , Junjie Zhang , Yang Zhou , Kongcheng Zhang , Rongcheng Tu , Ting-En Lin , Fei Huang , Mingli Song , Yongbin Li , Dacheng Tao

SGDPO: Self-Guided Direct Preference Optimization for Language Model Alignment

Direct Preference Optimization (DPO) is broadly utilized for aligning Large Language Models (LLMs) with human values because of its flexibility. Despite its effectiveness, it has been observed that the capability of DPO to generate…

Machine Learning · Computer Science 2025-05-20 Wenqiao Zhu , Ji Liu , Lulu Wang , Jun Wu , Yulun Zhang

Empathy by Design: Aligning Large Language Models for Healthcare Dialogue

General-purpose large language models (LLMs) have demonstrated remarkable generative and reasoning capabilities but remain limited in healthcare and caregiving applications due to two key deficiencies: factual unreliability and a lack of…

Computation and Language · Computer Science 2025-12-09 Emre Umucu , Guillermina Solis , Leon Garza , Emilia Rivas , Beatrice Lee , Anantaa Kotal , Aritran Piplai

Diffusion Model Alignment Using Direct Preference Optimization

Large language models (LLMs) are fine-tuned using human comparison data with Reinforcement Learning from Human Feedback (RLHF) methods to make them better aligned with users' preferences. In contrast to LLMs, human preference learning has…

Computer Vision and Pattern Recognition · Computer Science 2023-11-23 Bram Wallace , Meihua Dang , Rafael Rafailov , Linqi Zhou , Aaron Lou , Senthil Purushwalkam , Stefano Ermon , Caiming Xiong , Shafiq Joty , Nikhil Naik

$\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$

Direct Preference Optimization (DPO) has emerged as a compelling approach for training Large Language Models (LLMs) to adhere to human preferences. However, the performance of DPO is sensitive to the fine-tuning of its trade-off parameter…

Artificial Intelligence · Computer Science 2024-10-15 Junkang Wu , Yuexiang Xie , Zhengyi Yang , Jiancan Wu , Jinyang Gao , Bolin Ding , Xiang Wang , Xiangnan He

Multi-Reference Preference Optimization for Large Language Models

How can Large Language Models (LLMs) be aligned with human intentions and values? A typical solution is to gather human preference on model outputs and finetune the LLMs accordingly while ensuring that updates do not deviate too far from a…

Computation and Language · Computer Science 2024-05-28 Hung Le , Quan Tran , Dung Nguyen , Kien Do , Saloni Mittal , Kelechi Ogueji , Svetha Venkatesh

Unified Preference Optimization: Language Model Alignment Beyond the Preference Frontier

For aligning large language models (LLMs), prior work has leveraged reinforcement learning via human feedback (RLHF) or variations of direct preference optimization (DPO). While DPO offers a simpler framework based on maximum likelihood…

Artificial Intelligence · Computer Science 2025-05-27 Anirudhan Badrinath , Prabhat Agarwal , Jiajing Xu

Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward

Preference modeling techniques, such as direct preference optimization (DPO), has shown effective in enhancing the generalization abilities of large language model (LLM). However, in tasks involving video instruction-following, providing…

Computer Vision and Pattern Recognition · Computer Science 2024-04-03 Ruohong Zhang , Liangke Gui , Zhiqing Sun , Yihao Feng , Keyang Xu , Yuanhan Zhang , Di Fu , Chunyuan Li , Alexander Hauptmann , Yonatan Bisk , Yiming Yang

Optimizing LLMs with Direct Preferences: A Data Efficiency Perspective

Aligning the output of Large Language Models (LLMs) with human preferences (e.g., by means of reinforcement learning with human feedback, or RLHF) is essential for ensuring their effectiveness in real-world scenarios. Despite significant…

Artificial Intelligence · Computer Science 2024-10-23 Pietro Bernardelle , Gianluca Demartini

FedPDPO: Federated Personalized Direct Preference Optimization for Large Language Model Alignment

Aligning large language models (LLMs) with human preferences in federated learning (FL) is challenging due to decentralized, privacy-sensitive, and highly non-IID preference data. Direct Preference Optimization (DPO) offers an efficient…

Machine Learning · Computer Science 2026-03-23 Kewen Zhu , Liping Yi , Zhiming Zhao , Zhuang Qi , Han Yu , Qinghua Hu

DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm for Direct Preference Optimization

The rapid rise of large language models (LLMs) has unlocked many applications but also underscores the challenge of aligning them with diverse values and preferences. Direct Preference Optimization (DPO) is central to alignment but…

Machine Learning · Computer Science 2025-01-22 Amitava Das , Suranjana Trivedy , Danush Khanna , Rajarshi Roy , Gurpreet Singh , Basab Ghosh , Yaswanth Narsupalli , Vinija Jain , Vasu Sharma , Aishwarya Naresh Reganti , Aman Chadha

Benchmarking Direct Preference Optimization for Medical Large Vision-Language Models

Large Vision-Language Models (LVLMs) hold significant promise for medical applications, yet their deployment is often constrained by insufficient alignment and reliability. While Direct Preference Optimization (DPO) has emerged as a potent…

Computer Vision and Pattern Recognition · Computer Science 2026-01-27 Dain Kim , Jiwoo Lee , Jaehoon Yun , Yong Hoe Koo , Qingyu Chen , Hyunjae Kim , Jaewoo Kang

Diffusion-RPO: Aligning Diffusion Models through Relative Preference Optimization

Aligning large language models with human preferences has emerged as a critical focus in language modeling research. Yet, integrating preference learning into Text-to-Image (T2I) generative models is still relatively uncharted territory.…

Computer Vision and Pattern Recognition · Computer Science 2024-06-11 Yi Gu , Zhendong Wang , Yueqin Yin , Yujia Xie , Mingyuan Zhou

Implicit Cross-Lingual Rewarding for Efficient Multilingual Preference Alignment

Direct Preference Optimization (DPO) has become a prominent method for aligning Large Language Models (LLMs) with human preferences. While DPO has enabled significant progress in aligning English LLMs, multilingual preference alignment is…

Computation and Language · Computer Science 2025-06-06 Wen Yang , Junhong Wu , Chen Wang , Chengqing Zong , Jiajun Zhang

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

While large-scale unsupervised language models (LMs) learn broad world knowledge and some reasoning skills, achieving precise control of their behavior is difficult due to the completely unsupervised nature of their training. Existing…

Machine Learning · Computer Science 2024-07-31 Rafael Rafailov , Archit Sharma , Eric Mitchell , Stefano Ermon , Christopher D. Manning , Chelsea Finn

Active Preference Learning for Large Language Models

As large language models (LLMs) become more capable, fine-tuning techniques for aligning with human intent are increasingly important. A key consideration for aligning these models is how to most effectively use human resources, or model…

Machine Learning · Computer Science 2024-07-01 William Muldrew , Peter Hayes , Mingtian Zhang , David Barber

Understanding Reference Policies in Direct Preference Optimization

Direct Preference Optimization (DPO) has become a widely used training method for the instruction fine-tuning of large language models (LLMs). In this work, we explore an under-investigated aspect of DPO - its dependency on the reference…

Computation and Language · Computer Science 2024-08-23 Yixin Liu , Pengfei Liu , Arman Cohan

$f$-PO: Generalizing Preference Optimization with $f$-divergence Minimization

Preference optimization has made significant progress recently, with numerous methods developed to align language models with human preferences. This paper introduces $f$-divergence Preference Optimization ($f$-PO), a novel framework that…

Computation and Language · Computer Science 2025-02-18 Jiaqi Han , Mingjian Jiang , Yuxuan Song , Stefano Ermon , Minkai Xu