Related papers: Beyond Single-Task: Robust Multi-Task Length Gener…

Supervised Fine-Tuning versus Reinforcement Learning: A Study of Post-Training Methods for Large Language Models

Pre-trained Large Language Models (LLMs) exhibit broad capabilities, yet for specific tasks or domains their attainment of higher accuracy and more reliable reasoning generally depends on post-training through Supervised Fine-Tuning (SFT)…

Artificial Intelligence · Computer Science 2026-03-17 Haitao Jiang, Wenbo Zhang, Jiarui Yao, Hengrui Cai, Sheng Wang, Rui Song

Selective Self-to-Supervised Fine-Tuning for Generalization in Large Language Models

Fine-tuning Large Language Models (LLMs) on specific datasets is a common practice to improve performance on target tasks. However, this performance gain often leads to overfitting, where the model becomes too specialized in either the task…

Computation and Language · Computer Science 2025-02-21 Sonam Gupta, Yatin Nandwani, Asaf Yehudai, Dinesh Khandelwal, Dinesh Raghu, Sachindra Joshi

HFT: Half Fine-Tuning for Large Language Models

Large language models (LLMs) with one or more fine-tuning phases have become a necessary step to unlock various capabilities, enabling LLMs to follow natural language instructions or align with human preferences. However, it carries the…

Computation and Language · Computer Science 2024-04-30 Tingfeng Hui, Zhenyu Zhang, Shuohuan Wang, Weiran Xu, Yu Sun, Hua Wu

UFT: Unifying Supervised and Reinforcement Fine-Tuning

Post-training has demonstrated its importance in enhancing the reasoning capabilities of large language models (LLMs). The primary post-training methods can be categorized into supervised fine-tuning (SFT) and reinforcement fine-tuning…

Machine Learning · Computer Science 2025-10-21 Mingyang Liu, Gabriele Farina, Asuman Ozdaglar

How and Why LLMs Generalize: A Fine-Grained Analysis of LLM Reasoning from Cognitive Behaviors to Low-Level Patterns

Large Language Models (LLMs) display strikingly different generalization behaviors: supervised fine-tuning (SFT) often narrows capability, whereas reinforcement-learning (RL) tuning tends to preserve it. The reasons behind this divergence…

Machine Learning · Computer Science 2026-01-01 Haoyue Bai, Yiyou Sun, Wenjie Hu, Shi Qiu, Maggie Ziyu Huan, Peiyang Song, Robert Nowak, Dawn Song

ReFT: Reasoning with Reinforced Fine-Tuning

One way to enhance the reasoning capability of Large Language Models (LLMs) is to conduct Supervised Fine-Tuning (SFT) using Chain-of-Thought (CoT) annotations. This approach does not show sufficiently strong generalization ability,…

Computation and Language · Computer Science 2024-12-16 Trung Quoc Luong, Xinbo Zhang, Zhanming Jie, Peng Sun, Xiaoran Jin, Hang Li

Boosting Large Language Models with Mask Fine-Tuning

The large language model (LLM) is typically integrated into the mainstream optimization protocol. No work has questioned whether maintaining the model integrity is indispensable for promising performance. In this work, we introduce…

Computation and Language · Computer Science 2026-03-17 Mingyuan Zhang, Yue Bai, Huan Wang, Yizhou Wang, Qihua Dong, Yitian Zhang, Yun Fu

Scaling Relationship on Learning Mathematical Reasoning with Large Language Models

Mathematical reasoning is a challenging task for large language models (LLMs), while the scaling relationship of it with respect to LLM capacity is under-explored. In this paper, we investigate how the pre-training loss, supervised data…

Computation and Language · Computer Science 2023-09-14 Zheng Yuan, Hongyi Yuan, Chengpeng Li, Guanting Dong, Keming Lu, Chuanqi Tan, Chang Zhou, Jingren Zhou

Beyond English-Centric Training: How Reinforcement Learning Improves Cross-Lingual Reasoning in LLMs

Enhancing the complex reasoning capabilities of Large Language Models (LLMs) attracts widespread attention. While reinforcement learning (RL) has shown superior performance for improving complex reasoning, its impact on cross-lingual…

Computation and Language · Computer Science 2025-09-30 Shulin Huang, Yiran Ding, Junshu Pan, Yue Zhang

One Model, Many Skills: Parameter-Efficient Fine-Tuning for Multitask Code Analysis

Large language models have recently surpassed specialized systems on code generation, yet their effectiveness on other code-analysis tasks remains less clear. At the same time, multi-task learning offers a way to unify diverse objectives…

Software Engineering · Computer Science 2026-03-12 Amal Akli, Maxime Cordy, Mike Papadakis, Yves Le Traon

Exploring Continual Fine-Tuning for Enhancing Language Ability in Large Language Model

A common challenge towards the adaptability of Large Language Models (LLMs) is their ability to learn new languages over time without hampering the model's performance on languages in which the model is already proficient (usually English).…

Computation and Language · Computer Science 2026-04-24 Divyanshu Aggarwal, Sankarshan Damle, Navin Goyal, Satya Lokam, Sunayana Sitaram

Towards Robust LLM Post-Training: Automatic Failure Management for Reinforcement Fine-Tuning

Reinforcement fine-tuning (RFT) has become a core paradigm for post-training large language models, yet its training process remains highly fragile. Existing efforts mainly improve reliability at the system level or address specific issues…

Software Engineering · Computer Science 2026-05-07 Lingzhe Zhang, Tong Jia, Yunpeng Zhai, Liancheng Fang, Kening Zheng, Hongyi Liu, Xiaosong Huang, Philip S. Yu, Ying Li

A Novel Paradigm Boosting Translation Capabilities of Large Language Models

This paper presents a study on strategies to enhance the translation capabilities of large language models (LLMs) in the context of machine translation (MT) tasks. The paper proposes a novel paradigm consisting of three stages: Secondary…

Computation and Language · Computer Science 2024-04-16 Jiaxin Guo, Hao Yang, Zongyao Li, Daimeng Wei, Hengchao Shang, Xiaoyu Chen

60 Data Points are Sufficient to Fine-Tune LLMs for Question-Answering

Large language models (LLMs) encode extensive world knowledge through pre-training on massive datasets, which can then be fine-tuned for the question-answering (QA) task. However, effective strategies for fine-tuning LLMs for the QA task…

Computation and Language · Computer Science 2025-01-22 Junjie Ye, Yuming Yang, Qi Zhang, Tao Gui, Xuanjing Huang, Peng Wang, Zhongchao Shi, Jianping Fan

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

In this work, we present a simple yet theoretically motivated improvement to Supervised Fine-Tuning (SFT) for the Large Language Model (LLM), addressing its limited generalization compared to reinforcement learning (RL). Through…

Machine Learning · Computer Science 2026-03-02 Yongliang Wu, Yizhou Zhou, Zhou Ziheng, Yingzhe Peng, Xinyu Ye, Xinting Hu, Wenbo Zhu, Lu Qi, Ming-Hsuan Yang, Xu Yang

Why Does Reinforcement Learning Generalize? A Feature-Level Mechanistic Study of Post-Training in Large Language Models

Reinforcement learning (RL)-based post-training often improves the reasoning performance of large language models (LLMs) beyond the training domain, while supervised fine-tuning (SFT) frequently leads to general capabilities forgetting.…

Computation and Language · Computer Science 2026-04-29 Dan Shi, Zhuowen Han, Simon Ostermann, Renren Jin, Josef van Genabith, Deyi Xiong

Unlock the Correlation between Supervised Fine-Tuning and Reinforcement Learning in Training Code Large Language Models

Automatic code generation has been a longstanding research topic. With the advancement of general-purpose large language models (LLMs), the ability to code stands out as one important measure to the model's reasoning performance. Usually, a…

Software Engineering · Computer Science 2024-12-18 Jie Chen, Xintian Han, Yu Ma, Xun Zhou, Liang Xiang

TL-Training: A Task-Feature-Based Framework for Training Large Language Models in Tool Use

Large language models (LLMs) achieve remarkable advancements by leveraging tools to interact with environments, a critical step toward generalized AI. However, the standard supervised fine-tuning (SFT) approach, which relies on large-scale…

Computation and Language · Computer Science 2025-08-27 Junjie Ye, Yilong Wu, Sixian Li, Yuming Yang, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang, Peng Wang, Zhongchao Shi, Jianping Fan, Zhengyin Du

Selective Self-Rehearsal: A Fine-Tuning Approach to Improve Generalization in Large Language Models

Fine-tuning Large Language Models (LLMs) on specific datasets is a common practice to improve performance on target tasks. However, this performance gain often leads to overfitting, where the model becomes too specialized in either the task…

Computation and Language · Computer Science 2024-09-10 Sonam Gupta, Yatin Nandwani, Asaf Yehudai, Mayank Mishra, Gaurav Pandey, Dinesh Raghu, Sachindra Joshi

On the Suitability of Reinforcement Fine-Tuning to Visual Tasks

Reinforcement Fine-Tuning (RFT) is proved to be greatly valuable for enhancing the reasoning ability of LLMs. Researchers have been starting to apply RFT to MLLMs, hoping it will also enhance the capabilities of visual understanding.…

Computer Vision and Pattern Recognition · Computer Science 2025-09-23 Xiaxu Chen, Wei Li, Chunxu Liu, Chi Xie, Xiaoyan Hu, Chengqian Ma, Feng Zhu, Rui Zhao