Related papers: Logits-Based Finetuning

Massive Supervised Fine-tuning Experiments Reveal How Data, Layer, and Training Factors Shape LLM Alignment Quality

Supervised fine-tuning (SFT) is a critical step in aligning large language models (LLMs) with human instructions and values, yet many aspects of SFT remain poorly understood. We trained a wide range of base models on a variety of datasets…

Computation and Language · Computer Science 2025-10-31 Yuto Harada, Yusuke Yamauchi, Yusuke Oda, Yohei Oseki, Yusuke Miyao, Yu Takagi

FisherSFT: Data-Efficient Supervised Fine-Tuning of Language Models Using Information Gain

Supervised fine-tuning (SFT) is a standard approach to adapting large language models (LLMs) to new domains. In this work, we improve the statistical efficiency of SFT by selecting an informative subset of training examples. Specifically,…

Machine Learning · Computer Science 2025-05-22 Rohan Deb, Kiran Thekumparampil, Kousha Kalantari, Gaurush Hiranandani, Shoham Sabach, Branislav Kveton

Injecting New Knowledge into Large Language Models via Supervised Fine-Tuning

In recent years, Large Language Models (LLMs) have shown remarkable performance in generating human-like text, proving to be a valuable asset across various applications. However, adapting these models to incorporate new, out-of-domain…

Computation and Language · Computer Science 2024-04-04 Nick Mecklenburg, Yiyou Lin, Xiaoxiao Li, Daniel Holstein, Leonardo Nunes, Sara Malvar, Bruno Silva, Ranveer Chandra, Vijay Aski, Pavan Kumar Reddy Yannam, Tolga Aktas, Todd Hendry

Improving Task Diversity in Label Efficient Supervised Finetuning of LLMs

Large Language Models (LLMs) have demonstrated remarkable capabilities across diverse domains, but developing high-performing models for specialized applications often requires substantial human annotation -- a process that is…

Computation and Language · Computer Science 2025-07-30 Abhinav Arabelly, Jagrut Nemade, Robert D Nowak, Jifan Zhang

Enhancing Large Language Model Reasoning via Selective Critical Token Fine-Tuning

Large language models (LLMs) primarily rely on supervised fine-tuning (SFT) as a key method to adapt pre-trained models to domain-specific tasks such as mathematical reasoning. However, standard SFT uniformly penalizes all tokens,…

Computation and Language · Computer Science 2025-10-14 Zhiwen Ruan, Yixia Li, He Zhu, Yun Chen, Peng Li, Yang Liu, Guanhua Chen

Improving Multilingual Instruction Finetuning via Linguistically Natural and Diverse Datasets

Advancements in Large Language Models (LLMs) have significantly enhanced instruction-following capabilities. However, most Instruction Fine-Tuning (IFT) datasets are predominantly in English, limiting model performance in other languages.…

Computation and Language · Computer Science 2024-07-03 Sathish Reddy Indurthi, Wenxuan Zhou, Shamil Chollampatt, Ravi Agrawal, Kaiqiang Song, Lingxiao Zhao, Chenguang Zhu

SLearnLLM: A Self-Learning Framework for Efficient Domain-Specific Adaptation of Large Language Models

When using supervised fine-tuning (SFT) to adapt large language models (LLMs) to specific domains, a significant challenge arises: should we use the entire SFT dataset for fine-tuning? Common practice often involves fine-tuning directly on…

Computation and Language · Computer Science 2025-05-26 Xiang Liu, Zhaoxiang Liu, Peng Wang, Kohou Wang, Huan Hu, Kai Wang, Shiguo Lian

Complexity-aware fine-tuning

General-purpose Large Language Models (LLMs) are frequently fine-tuned through supervised fine-tuning (SFT) to enhance performance in specific domains. Better results can be achieved by distilling the chain-of-thought of a larger model at…

Machine Learning · Computer Science 2026-03-24 Andrey Goncharov, Daniil Vyazhev, Petr Sychev, Edvard Khalafyan, Alexey Zaytsev

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

In this work, we present a simple yet theoretically motivated improvement to Supervised Fine-Tuning (SFT) for the Large Language Model (LLM), addressing its limited generalization compared to reinforcement learning (RL). Through…

Machine Learning · Computer Science 2026-03-02 Yongliang Wu, Yizhou Zhou, Zhou Ziheng, Yingzhe Peng, Xinyu Ye, Xinting Hu, Wenbo Zhu, Lu Qi, Ming-Hsuan Yang, Xu Yang

Natural Language Fine-Tuning

Large language model fine-tuning techniques typically depend on extensive labeled data, external guidance, and feedback, such as human alignment, scalar rewards, and demonstration. However, in practical application, the scarcity of specific…

Computation and Language · Computer Science 2024-12-31 Jia Liu, Yue Wang, Zhiqi Lin, Min Chen, Yixue Hao, Long Hu

Preference-Oriented Supervised Fine-Tuning: Favoring Target Model Over Aligned Large Language Models

Alignment, endowing a pre-trained large language model (LLM) with the ability to follow instructions, is crucial for its real-world applications. Conventional supervised fine-tuning (SFT) methods formalize it as causal language modeling…

Computation and Language · Computer Science 2024-12-18 Yuchen Fan, Yuzhong Hong, Qiushi Wang, Junwei Bao, Hongfei Jiang, Yang Song

UltraLink: An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset

Open-source large language models (LLMs) have gained significant strength across diverse fields. Nevertheless, the majority of studies primarily concentrate on English, with only limited exploration into the realm of multilingual abilities.…

Computation and Language · Computer Science 2024-02-20 Haoyu Wang, Shuo Wang, Yukun Yan, Xujia Wang, Zhiyu Yang, Yuzhuang Xu, Zhenghao Liu, Liner Yang, Ning Ding, Xu Han, Zhiyuan Liu, Maosong Sun

Supervised Fine-Tuning versus Reinforcement Learning: A Study of Post-Training Methods for Large Language Models

Pre-trained Large Language Models (LLMs) exhibit broad capabilities, yet for specific tasks or domains, attaining higher accuracy and more reliable reasoning generally depends on post-training through Supervised Fine-Tuning (SFT)…

Artificial Intelligence · Computer Science 2026-03-17 Haitao Jiang, Wenbo Zhang, Jiarui Yao, Hengrui Cai, Sheng Wang, Rui Song

Narrowing the Gap: Supervised Fine-Tuning of Open-Source LLMs as a Viable Alternative to Proprietary Models for Pedagogical Tools

Frontier large language models (LLMs) like ChatGPT and Gemini can decipher cryptic compiler errors for novice programmers, but their computational scale, cost, and tendency to over-assist make them problematic for widespread pedagogical…

Computers and Society · Computer Science 2025-07-09 Lorenzo Lee Solano, Charles Koutcheme, Juho Leinonen, Alexandra Vassar, Jake Renzella

Enhancing the Reasoning Capabilities of Small Language Models via Solution Guidance Fine-Tuning

Large language models (LLMs) have demonstrated remarkable performance across a wide range of tasks. Advances in prompt engineering and fine-tuning techniques have further enhanced their ability to address complex reasoning challenges.…

Computation and Language · Computer Science 2024-12-16 Jing Bi, Yuting Wu, Weiwei Xing, Zhenjie Wei

Instruction Tuning for Large Language Models: A Survey

This paper surveys research works in the quickly advancing field of instruction tuning (IT), which can also be referred to as supervised fine-tuning (SFT)…

Computation and Language · Computer Science 2025-10-07 Shengyu Zhang, Linfeng Dong, Xiaoya Li, Sen Zhang, Xiaofei Sun, Shuhe Wang, Jiwei Li, Runyi Hu, Tianwei Zhang, Fei Wu, Guoyin Wang

Selective Self-to-Supervised Fine-Tuning for Generalization in Large Language Models

Fine-tuning Large Language Models (LLMs) on specific datasets is a common practice to improve performance on target tasks. However, this performance gain often leads to overfitting, where the model becomes too specialized in either the task…

Computation and Language · Computer Science 2025-02-21 Sonam Gupta, Yatin Nandwani, Asaf Yehudai, Dinesh Khandelwal, Dinesh Raghu, Sachindra Joshi

SWIFT: A Scalable lightWeight Infrastructure for Fine-Tuning

Recent developments in Large Language Models (LLMs) and Multi-modal Large Language Models (MLLMs) have leveraged Attention-based Transformer architectures and achieved superior performance and generalization capabilities. They have since…

Computation and Language · Computer Science 2025-05-20 Yuze Zhao, Jintao Huang, Jinghan Hu, Xingjun Wang, Yunlin Mao, Daoze Zhang, Hong Zhang, Zeyinzi Jiang, Zhikai Wu, Baole Ai, Ang Wang, Wenmeng Zhou, Yingda Chen

SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe

To acquire instruction-following capabilities, large language models (LLMs) undergo instruction tuning, where they are trained on instruction-response pairs using next-token prediction (NTP). Efforts to improve instruction tuning often…

Computation and Language · Computer Science 2026-04-21 Yuxin Xiao, Shujian Zhang, Wenxuan Zhou, Marzyeh Ghassemi, Sanqiang Zhao

LaFFi: Leveraging Hybrid Natural Language Feedback for Fine-tuning Language Models

Fine-tuning Large Language Models (LLMs) adapts a trained model to specific downstream tasks, significantly improving task-specific performance. Supervised Fine-Tuning (SFT) is a common approach, where an LLM is trained to produce desired…

Machine Learning · Computer Science 2024-01-03 Qianxi Li, Yingyue Cao, Jikun Kang, Tianpei Yang, Xi Chen, Jun Jin, Matthew E. Taylor