Related papers: Unsupervised Human Preference Learning

Investigating on RLHF methodology

In this article, we investigate the alignment of Large Language Models according to human preferences. We discuss the features of training a Preference Model, which simulates human preferences, and the methods and details we found essential…

Machine Learning · Computer Science 2024-10-03 Alexey Kutalev, Sergei Markoff

Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization

Researchers have been studying approaches to steer the behavior of Large Language Models (LLMs) and build personalized LLMs tailored for various applications. While fine-tuning seems to be a direct solution, it requires substantial…

Computation and Language · Computer Science 2024-07-31 Yuanpu Cao, Tianrong Zhang, Bochuan Cao, Ziyi Yin, Lu Lin, Fenglong Ma, Jinghui Chen

A Survey on Human Preference Learning for Large Language Models

The recent surge of versatile large language models (LLMs) largely depends on aligning increasingly capable foundation models with human intentions by preference learning, enhancing LLMs with excellent applicability and effectiveness in a…

Computation and Language · Computer Science 2024-06-19 Ruili Jiang, Kehai Chen, Xuefeng Bai, Zhixuan He, Juntao Li, Muyun Yang, Tiejun Zhao, Liqiang Nie, Min Zhang

LoRe: Personalizing LLMs via Low-Rank Reward Modeling

Personalizing large language models (LLMs) to accommodate diverse user preferences is essential for enhancing alignment and user satisfaction. Traditional reinforcement learning from human feedback (RLHF) approaches often rely on monolithic…

Machine Learning · Computer Science 2025-04-22 Avinandan Bose, Zhihan Xiong, Yuejie Chi, Simon Shaolei Du, Lin Xiao, Maryam Fazel

Reinforcement Learning without Human Feedback for Last Mile Fine-Tuning of Large Language Models

Reinforcement learning is used to align language models with human preference signals after first pre-training the model to predict the next token of text within a large corpus using likelihood maximization. Before being deployed in a…

Computation and Language · Computer Science 2024-08-30 Alec Solway

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

While large-scale unsupervised language models (LMs) learn broad world knowledge and some reasoning skills, achieving precise control of their behavior is difficult due to the completely unsupervised nature of their training. Existing…

Machine Learning · Computer Science 2024-07-31 Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn

Persona-judge: Personalized Alignment of Large Language Models via Token-level Self-judgment

Aligning language models with human preferences presents significant challenges, particularly in achieving personalization without incurring excessive computational costs. Existing methods rely on reward signals and additional annotated…

Computation and Language · Computer Science 2025-06-12 Xiaotian Zhang, Ruizhe Chen, Yang Feng, Zuozhu Liu

Personalized Language Modeling from Personalized Human Feedback

Personalized large language models (LLMs) are designed to tailor responses to individual user preferences. While Reinforcement Learning from Human Feedback (RLHF) is a commonly used framework for aligning LLMs with human preferences,…

Computation and Language · Computer Science 2024-12-10 Xinyu Li, Ruiyang Zhou, Zachary C. Lipton, Liu Leqi

A Survey on Personalized and Pluralistic Preference Alignment in Large Language Models

Personalized preference alignment for large language models (LLMs), the process of tailoring LLMs to individual users' preferences, is an emerging research direction spanning the area of NLP and personalization. In this survey, we present…

Computation and Language · Computer Science 2025-04-10 Zhouhang Xie, Junda Wu, Yiran Shen, Yu Xia, Xintong Li, Aaron Chang, Ryan Rossi, Sachin Kumar, Bodhisattwa Prasad Majumder, Jingbo Shang, Prithviraj Ammanabrolu, Julian McAuley

HyPerAlign: Interpretable Personalized LLM Alignment via Hypothesis Generation

Alignment algorithms are widely used to align large language models (LLMs) to human users based on preference annotations. Typically these (often divergent) preferences are aggregated over a diverse set of users, resulting in fine-tuned…

Computation and Language · Computer Science 2025-05-21 Cristina Garbacea, Chenhao Tan

User-Specific Dialogue Generation with User Profile-Aware Pre-Training Model and Parameter-Efficient Fine-Tuning

This paper addresses user-specific dialogs. In contrast to previous research on personalized dialogue focused on achieving virtual user dialogue as defined by persona descriptions, user-specific dialogue aims to reproduce real-user dialogue…

Computation and Language · Computer Science 2024-09-04 Atsushi Otsuka, Kazuya Matsuo, Ryo Ishii, Narichika Nomoto, Hiroaki Sugiyama

A Survey of Personalized Large Language Models: Progress and Future Directions

Large Language Models (LLMs) excel in handling general knowledge tasks, yet they struggle with user-specific personalization, such as understanding individual emotions, writing styles, and preferences. Personalized Large Language Models…

Artificial Intelligence · Computer Science 2025-09-23 Jiahong Liu, Zexuan Qiu, Zhongyang Li, Quanyu Dai, Wenhao Yu, Jieming Zhu, Minda Hu, Menglin Yang, Tat-Seng Chua, Irwin King

Preference Heads in Large Language Models: A Mechanistic Framework for Interpretable Personalization

Large Language Models (LLMs) exhibit strong implicit personalization ability, yet most existing approaches treat this behavior as a black box, relying on prompt engineering or fine tuning on user data. In this work, we adopt a mechanistic…

Computation and Language · Computer Science 2026-04-27 Weixu Zhang, Ye Yuan, Changjiang Han, Yuxing Tian, Zipeng Sun, Linfeng Du, Jikun Kang, Hong Kang, Xue Liu, Haolun Wu

Language Model Personalization via Reward Factorization

Modern large language models (LLMs) are optimized for human-aligned responses using Reinforcement Learning from Human Feedback (RLHF). However, existing RLHF approaches assume a universal preference model and fail to account for individual…

Machine Learning · Computer Science 2025-03-11 Idan Shenfeld, Felix Faltings, Pulkit Agrawal, Aldo Pacchiano

Personas within Parameters: Fine-Tuning Small Language Models with Low-Rank Adapters to Mimic User Behaviors

A long-standing challenge in developing accurate recommendation models is simulating user behavior, mainly due to the complex and stochastic nature of user interactions. Towards this, one promising line of work has been the use of Large…

Information Retrieval · Computer Science 2025-09-15 Himanshu Thakur, Eshani Agrawal, Smruthi Mukund

Personalized Large Language Models

Large language models (LLMs) have significantly advanced Natural Language Processing (NLP) tasks in recent years. However, their universal nature poses limitations in scenarios requiring personalized responses, such as recommendation…

Computation and Language · Computer Science 2024-11-08 Stanisław Woźniak, Bartłomiej Koptyra, Arkadiusz Janz, Przemysław Kazienko, Jan Kocoń

Toward Preference-aligned Large Language Models via Residual-based Model Steering

Preference alignment is a critical step in making Large Language Models (LLMs) useful and aligned with (human) preferences. Existing approaches such as Reinforcement Learning from Human Feedback or Direct Preference Optimization typically…

Computation and Language · Computer Science 2025-09-30 Lucio La Cava, Andrea Tagarelli

PALR: Personalization Aware LLMs for Recommendation

Large language models (LLMs) have recently received significant attention for their exceptional capabilities. Despite extensive efforts in developing general-purpose LLMs that can be utilized in various natural language processing (NLP)…

Information Retrieval · Computer Science 2023-06-08 Fan Yang, Zheng Chen, Ziyan Jiang, Eunah Cho, Xiaojiang Huang, Yanbin Lu

A Survey on Personalized Alignment -- The Missing Piece for Large Language Models in Real-World Applications

Large Language Models (LLMs) have demonstrated remarkable capabilities, yet their transition to real-world applications reveals a critical limitation: the inability to adapt to individual preferences while maintaining alignment with…

Computation and Language · Computer Science 2025-05-06 Jian Guan, Junfei Wu, Jia-Nan Li, Chuanqi Cheng, Wei Wu

Pretraining Language Models with Human Preferences

Language models (LMs) are pretrained to imitate internet text, including content that would violate human preferences if generated by an LM: falsehoods, offensive comments, personally identifiable information, low-quality or buggy code, and…

Computation and Language · Computer Science 2023-06-16 Tomasz Korbak, Kejian Shi, Angelica Chen, Rasika Bhalerao, Christopher L. Buckley, Jason Phang, Samuel R. Bowman, Ethan Perez