Related papers: Preference-aware Influence-function-based Data Sel…

200 papers

Alignment, endowing a pre-trained Large language model (LLM) with the ability to follow instructions, is crucial for its real-world applications. Conventional supervised fine-tuning (SFT) methods formalize it as causal language modeling…

Computation and Language · Computer Science 2024-12-18 Yuchen Fan, Yuzhong Hong, Qiushi Wang, Junwei Bao, Hongfei Jiang, Yang Song

Leveraging Large Language Models (LLMs) for recommendation has recently garnered considerable attention, where fine-tuning plays a key role in LLMs' adaptation. However, the cost of fine-tuning LLMs on rapidly expanding recommendation data…

Information Retrieval · Computer Science 2024-06-05 Xinyu Lin, Wenjie Wang, Yongqi Li, Shuo Yang, Fuli Feng, Yinwei Wei, Tat-Seng Chua

Visual instruction tuning adapts pre-trained Multimodal Large Language Models (MLLMs) to follow human instructions for real-world applications. However, the rapid growth of these datasets introduces significant redundancy, leading to…

Computer Vision and Pattern Recognition · Computer Science 2026-01-14 Jinhe Bi, Aniri, Yifan Wang, Danqi Yan, Wenke Huang, Zengjie Jin, Xiaowen Ma, Sikuan Yan, Artur Hecker, Mang Ye, Xun Xiao, Hinrich Schuetze, Volker Tresp, Yunpu Ma

Effective data selection is critical for efficient training of modern Large Language Models (LLMs). This paper introduces Influence Distillation, a novel, mathematically-justified framework for data selection that employs second-order…

Computation and Language · Computer Science 2025-05-27 Mahdi Nikdan, Vincent Cohen-Addad, Dan Alistarh, Vahab Mirrokni

Preference learning provides a promising solution to address the limitations of supervised fine-tuning (SFT) for code language models, where the model is not explicitly trained to differentiate between correct and incorrect code. Recent…

Computation and Language · Computer Science 2024-10-15 Dylan Zhang, Shizhe Diao, Xueyan Zou, Hao Peng

Large language model (LLM) alignment aims to ensure that the behavior of LLMs meets human preferences. While collecting data from multiple fine-grained, aspect-specific preferences becomes more and more feasible, existing alignment…

Machine Learning · Computer Science 2026-03-03 Jia Zhang, Yao Liu, Chen-Xi Zhang, Yi Liu, Yi-Xuan Jin, Lan-Zhe Guo, Yu-Feng Li

This work focuses on leveraging and selecting from vast, unlabeled, open data to pre-fine-tune a pre-trained language model. The goal is to minimize the need for costly domain-specific data for subsequent fine-tuning while achieving desired…

Machine Learning · Computer Science 2024-05-07 Feiyang Kang, Hoang Anh Just, Yifan Sun, Himanshu Jahagirdar, Yuanzhi Zhang, Rongxing Du, Anit Kumar Sahu, Ruoxi Jia

Test-time alignment methods offer a promising alternative to fine-tuning by steering the outputs of large language models (LLMs) at inference time with lightweight interventions on their internal representations. Recently, a prominent and…

Computation and Language · Computer Science 2026-04-28 Imranul Ashrafi, Inigo Jauregi Unanue, Massimo Piccardi

With ever-increasing dataset sizes, subset selection techniques are becoming increasingly important for a plethora of tasks. It is often necessary to guide the subset selection to achieve certain desiderata, which includes focusing or…

Computer Vision and Pattern Recognition · Computer Science 2022-03-10 Suraj Kothawade, Vishal Kaushal, Ganesh Ramakrishnan, Jeff Bilmes, Rishabh Iyer

Preference-based reinforcement learning (RL) offers a promising approach for aligning policies with human intent but is often constrained by the high cost of human feedback. In this work, we introduce PrefVLM, a framework that integrates…

Machine Learning · Computer Science 2025-02-04 Udita Ghosh, Dripta S. Raychaudhuri, Jiachen Li, Konstantinos Karydis, Amit Roy-Chowdhury

While Hybrid Supervised Fine-Tuning (SFT) followed by Reinforcement Learning (RL) has become the standard paradigm for training LLM agents, effective mechanisms for data allocation between these stages remain largely underexplored. Current…

Artificial Intelligence · Computer Science 2026-04-14 Yang Zhao, Yangou Ouyang, Xiao Ding, Hepeng Wang, Bibo Cai, Kai Xiong, Jinglong Gao, Zhouhao Sun, Li Du, Bing Qin, Ting Liu

We present Preference Flow Matching (PFM), a new framework for preference-based reinforcement learning (PbRL) that streamlines the integration of preferences into an arbitrary class of pre-trained models. Existing PbRL methods require…

Machine Learning · Computer Science 2024-10-29 Minu Kim, Yongsik Lee, Sehyeok Kang, Jihwan Oh, Song Chong, Se-Young Yun

We propose PRISM, a novel framework designed to overcome the limitations of 2D-based Preference-Based Reinforcement Learning (PBRL) by unifying 3D point cloud modeling and future-aware preference refinement. At its core, PRISM adopts a 3D…

Computation and Language · Computer Science 2025-03-20 Yirong Sun, Yanjun Chen

Learning from preference labels plays a crucial role in fine-tuning large language models. There are several distinct approaches for preference fine-tuning, including supervised learning, on-policy reinforcement learning (RL), and…

Language models are commonly fine-tuned via reinforcement learning to alter their behavior or elicit new capabilities. Datasets used for these purposes, and particularly human preference datasets, are often noisy. The relatively small size…

Machine Learning · Computer Science 2025-07-22 Daniel Fein, Gabriela Aranguiz-Dias

Data selection for finetuning Large Language Models (LLMs) can be framed as a budget-constrained optimization problem: maximizing a model's downstream performance under a strict training data budget. Solving this problem is generally…

Machine Learning · Computer Science 2025-10-01 Animesh Jha, Harshit Gupta, Ananjan Nandi

While large-scale training data is fundamental for developing capable large language models (LLMs), strategically selecting high-quality data has emerged as a critical approach to enhance training efficiency and reduce computational costs.…

Machine Learning · Computer Science 2025-07-23 Yang Yu, Kai Han, Hang Zhou, Yehui Tang, Kaiqi Huang, Yunhe Wang, Dacheng Tao

Large language model (LLM) alignment is typically achieved through learning from human preference comparisons, making the quality of preference data critical to its success. Existing studies often pre-process raw training datasets to…

Machine Learning · Computer Science 2026-03-17 Zizhuo Zhang, Qizhou Wang, Shanshan Ye, Jianing Zhu, Jiangchao Yao, Bo Han, Masashi Sugiyama

Selecting appropriate training data is crucial for effective instruction fine-tuning of large language models (LLMs), which aims to (1) elicit strong capabilities, and (2) achieve balanced performance across a diverse range of tasks.…

Computation and Language · Computer Science 2025-01-22 Qirun Dai, Dylan Zhang, Jiaqi W. Ma, Hao Peng

Effective data selection is essential for pretraining large language models (LLMs), enhancing efficiency and improving generalization to downstream tasks. However, existing approaches often require leveraging external pretrained models,…

Machine Learning · Computer Science 2026-02-04 Jie Hao, Rui Yu, Wei Zhang, Huixia Wang, Jie Xu, Mingrui Liu