Related papers: Direct Judgement Preference Optimization

Beyond Scalar Reward Model: Learning Generative Judge from Preference Data

Learning from preference feedback is a common practice for aligning large language models (LLMs) with human value. Conventionally, preference data is learned and encoded into a scalar reward model that connects a value head with an LLM to…

Computation and Language · Computer Science 2025-09-03 Ziyi Ye, Xiangsheng Li, Qiuchi Li, Qingyao Ai, Yujia Zhou, Wei Shen, Dong Yan, Yiqun Liu

Think-J: Learning to Think for Generative LLM-as-a-Judge

LLM-as-a-Judge refers to the automatic modeling of preferences for responses generated by Large Language Models (LLMs), which is of significant importance for both LLM evaluation and reward modeling. Although generative LLMs have made…

Computation and Language · Computer Science 2026-01-13 Hui Huang, Yancheng He, Hongli Zhou, Rui Zhang, Wei Liu, Weixun Wang, Jiaheng Liu, Wenbo Su

Improve LLM-as-a-Judge Ability as a General Ability

LLM-as-a-Judge leverages the generative and reasoning capabilities of large language models (LLMs) to evaluate LLM responses across diverse scenarios, providing accurate preference signals. This approach plays a vital role in aligning LLMs…

Computation and Language · Computer Science 2025-09-09 Jiachen Yu, Shaoning Sun, Xiaohui Hu, Jiaxu Yan, Kaidong Yu, Xuelong Li

Leveraging LLMs as Meta-Judges: A Multi-Agent Framework for Evaluating LLM Judgments

Large language models (LLMs) are being widely applied across various fields, but as tasks become more complex, evaluating their responses is increasingly challenging. Compared to human evaluators, the use of LLMs to support performance…

Artificial Intelligence · Computer Science 2025-04-25 Yuran Li, Jama Hussein Mohamud, Chongren Sun, Di Wu, Benoit Boulet

Toward Robust LLM-Based Judges: Taxonomic Bias Evaluation and Debiasing Optimization

Large language model (LLM)-based judges are widely adopted for automated evaluation and reward modeling, yet their judgments are often affected by judgment biases. Accurately evaluating these biases is essential for ensuring the reliability…

Computation and Language · Computer Science 2026-03-10 Hongli Zhou, Hui Huang, Rui Zhang, Kehai Chen, Bing Xu, Conghui Zhu, Tiejun Zhao, Muyun Yang

Mitigating Judgment Preference Bias in Large Language Models through Group-Based Polling

Large Language Models (LLMs) as automatic evaluators, commonly referred to as LLM-as-a-Judge, have also attracted growing attention. This approach plays a vital role in aligning LLMs with human judgments, providing accurate and reliable…

Computation and Language · Computer Science 2026-04-22 Shuliang Liu, Zhipeng Xu, Zhenghao Liu, Yukun Yan, Minghe Yu, Yu Gu, Chong Chen, Huiyuan Xie, Ge Yu

Play Favorites: A Statistical Method to Measure Self-Bias in LLM-as-a-Judge

Large language models (LLMs) can serve as judges that offer rapid and reliable assessments of other LLM outputs. However, models may systematically assign overly favorable ratings to their own outputs, a phenomenon known as self-bias, which…

Computation and Language · Computer Science 2025-08-12 Evangelia Spiliopoulou, Riccardo Fogliato, Hanna Burnsky, Tamer Soliman, Jie Ma, Graham Horwood, Miguel Ballesteros

Refine-n-Judge: Curating High-Quality Preference Chains for LLM-Fine-Tuning

Large Language Models (LLMs) have demonstrated remarkable progress through preference-based fine-tuning, which critically depends on the quality of the underlying training data. While human feedback is essential for improving data quality,…

Artificial Intelligence · Computer Science 2025-10-31 Derin Cayir, Renjie Tao, Rashi Rungta, Kai Sun, Sean Chen, Haidar Khan, Minseok Kim, Julia Reinspach, Yue Liu

Do LLM Evaluators Prefer Themselves for a Reason?

Large language models (LLMs) are increasingly used as automatic evaluators in applications such as benchmarking, reward modeling, and self-refinement. Prior work highlights a potential self-preference bias where LLMs favor their own…

Computation and Language · Computer Science 2025-12-16 Wei-Lin Chen, Zhepei Wei, Xinyu Zhu, Shi Feng, Yu Meng

Multi-Response Preference Optimization with Augmented Ranking Dataset

Recent advancements in Large Language Models (LLMs) have been remarkable, with new models consistently surpassing their predecessors. These advancements are underpinned by extensive research on various training mechanisms. Among these,…

Computation and Language · Computer Science 2024-12-12 Hansle Gwon, Imjin Ahn, Young-Hak Kim, Sanghyun Park, Tae Joon Jun

Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning

Effective training of language models (LMs) for mathematical reasoning tasks demands high-quality supervised fine-tuning data. Besides obtaining annotations from human experts, a common alternative is sampling from larger and more powerful…

Computation and Language · Computer Science 2024-07-26 Tianduo Wang, Shichen Li, Wei Lu

Reverse Engineering Human Preferences with Reinforcement Learning

The capabilities of Large Language Models (LLMs) are routinely evaluated by other LLMs trained to predict human preferences. This framework, known as LLM-as-a-judge, is highly scalable and relatively low cost. However, it is also vulnerable…

Computation and Language · Computer Science 2026-02-03 Lisa Alazraki, Tan Yi-Chern, Jon Ander Campos, Maximilian Mozes, Marek Rei, Max Bartolo

Generative Judge for Evaluating Alignment

The rapid development of Large Language Models (LLMs) has substantially expanded the range of tasks they can address. In the field of Natural Language Processing (NLP), researchers have shifted their focus from conventional NLP tasks (e.g.,…

Computation and Language · Computer Science 2023-12-08 Junlong Li, Shichao Sun, Weizhe Yuan, Run-Ze Fan, Hai Zhao, Pengfei Liu

Improving Preference Extraction In LLMs By Identifying Latent Knowledge Through Classifying Probes

Large Language Models (LLMs) are often used as automated judges to evaluate text, but their effectiveness can be hindered by various unintentional biases. We propose using linear classifying probes, trained by leveraging differences between…

Computation and Language · Computer Science 2025-03-25 Sharan Maiya, Yinhong Liu, Ramit Debnath, Anna Korhonen

Building Math Agents with Multi-Turn Iterative Preference Learning

Recent studies have shown that large language models' (LLMs) mathematical problem-solving capabilities can be enhanced by integrating external tools, such as code interpreters, and employing multi-turn Chain-of-Thought (CoT) reasoning.…

Machine Learning · Computer Science 2025-03-03 Wei Xiong, Chengshuai Shi, Jiaming Shen, Aviv Rosenberg, Zhen Qin, Daniele Calandriello, Misha Khalman, Rishabh Joshi, Bilal Piot, Mohammad Saleh, Chi Jin, Tong Zhang, Tianqi Liu

JudgeLM: Fine-tuned Large Language Models are Scalable Judges

Evaluating Large Language Models (LLMs) in open-ended scenarios is challenging because existing benchmarks and metrics cannot measure them comprehensively. To address this problem, we propose to fine-tune LLMs as scalable judges (JudgeLM)…

Computation and Language · Computer Science 2025-03-04 Lianghui Zhu, Xinggang Wang, Xinlong Wang

Self-Preference Bias in LLM-as-a-Judge

Automated evaluation leveraging large language models (LLMs), commonly referred to as LLM evaluators or LLM-as-a-judge, has been widely used in measuring the performance of dialogue systems. However, the self-preference bias in LLMs has…

Computation and Language · Computer Science 2025-06-24 Koki Wataoka, Tsubasa Takahashi, Ryokan Ri

Reward-Augmented Data Enhances Direct Preference Alignment of LLMs

Preference alignment in Large Language Models (LLMs) has significantly improved their ability to adhere to human instructions and intentions. However, existing direct alignment algorithms primarily focus on relative preferences and often…

Machine Learning · Computer Science 2025-05-13 Shenao Zhang, Zhihan Liu, Boyi Liu, Yufeng Zhang, Yingxiang Yang, Yongfei Liu, Liyu Chen, Tao Sun, Zhaoran Wang

Optimizing Alignment with Less: Leveraging Data Augmentation for Personalized Evaluation

Automatic evaluation by large language models (LLMs) is a prominent topic today; however, judgment and evaluation tasks are often subjective and influenced by various factors, making adaptation challenging. While many studies demonstrate…

Computation and Language · Computer Science 2024-12-11 Javad Seraj, Mohammad Mahdi Mohajeri, Mohammad Javad Dousti, Majid Nili Ahmadabadi

Do Before You Judge: Self-Reference as a Pathway to Better LLM Evaluation

LLM-as-Judge frameworks are increasingly popular for AI evaluation, yet research findings on the relationship between models' generation and judgment abilities remain inconsistent. We investigate this relationship through systematic…

Computation and Language · Computer Science 2025-09-25 Wei-Hsiang Lin, Sheng-Lun Wei, Hen-Hsen Huang, Hsin-Hsi Chen