Related papers: Enhancing Large Language Model Performance with Gr…

Boosting Large Language Models with Mask Fine-Tuning

The large language model (LLM) is typically integrated into the mainstream optimization protocol. No work has questioned whether maintaining the model integrity is indispensable for promising performance. In this work, we introduce…

Computation and Language · Computer Science 2026-03-17 Mingyuan Zhang, Yue Bai, Huan Wang, Yizhou Wang, Qihua Dong, Yitian Zhang, Yun Fu

Enhancing Large Language Model Performance To Answer Questions and Extract Information More Accurately

Large Language Models (LLMs) generate responses to questions; however, their effectiveness is often hindered by sub-optimal quality of answers and occasional failures to provide accurate responses to questions. To address these challenges,…

Computation and Language · Computer Science 2024-02-06 Liang Zhang, Katherine Jijo, Spurthi Setty, Eden Chung, Fatima Javid, Natan Vidra, Tommy Clifford

Enhancing the Reasoning Capabilities of Small Language Models via Solution Guidance Fine-Tuning

Large language models (LLMs) have demonstrated remarkable performance across a wide range of tasks. Advances in prompt engineering and fine-tuning techniques have further enhanced their ability to address complex reasoning challenges.…

Computation and Language · Computer Science 2024-12-16 Jing Bi, Yuting Wu, Weiwei Xing, Zhenjie Wei

When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method

While large language models (LLMs) often adopt finetuning to unlock their capabilities for downstream applications, our understanding of the inductive biases (especially the scaling properties) of different finetuning methods is still…

Computation and Language · Computer Science 2024-02-28 Biao Zhang, Zhongtao Liu, Colin Cherry, Orhan Firat

Learn More, Forget Less: A Gradient-Aware Data Selection Approach for LLM

Although large language models (LLMs) have achieved impressive results across numerous tasks, supervised fine-tuning (SFT) remains essential for adapting these models to specialized domains. However, SFT for domain specialization can be…

Computation and Language · Computer Science 2025-11-13 Yibai Liu, Shihang Wang, Zeming Liu, Zheming Song, Junzhe Wang, Jingjing Liu, Qingjie Liu, Yunhong Wang

Can Gradient Descent Simulate Prompting?

There are two primary ways of incorporating new information into a language model (LM): changing its prompt or changing its parameters, e.g. via fine-tuning. Parameter updates incur no long-term storage cost for model changes. However, for…

Computation and Language · Computer Science 2025-06-27 Eric Zhang, Leshem Choshen, Jacob Andreas

FGIT: Fault-Guided Fine-Tuning for Code Generation

Modern instruction-tuned large language models (LLMs) have made remarkable progress in code generation. However, these LLMs fine-tuned with standard supervised fine-tuning (SFT) sometimes generate plausible-looking but functionally…

Software Engineering · Computer Science 2026-01-14 Lishui Fan, Zhongxin Liu, Haoye Wang, Lingfeng Bao, Xin Xia, Shanping Li

G-DIG: Towards Gradient-based Diverse and High-quality Instruction Data Selection for Machine Translation

Large Language Models (LLMs) have demonstrated remarkable abilities in general scenarios. Instruction finetuning empowers them to align with humans in various tasks. Nevertheless, the Diversity and Quality of the instruction data remain two…

Computation and Language · Computer Science 2024-07-09 Xingyuan Pan, Luyang Huang, Liyan Kang, Zhicheng Liu, Yu Lu, Shanbo Cheng

Fine-tuning Large Language Models for Domain-specific Machine Translation

Large language models (LLMs) have shown great potential in domain-specific machine translation (MT). However, one major issue is that LLMs pre-trained on general domain corpus might not generalize well to specific domains due to the lack of…

Computation and Language · Computer Science 2024-12-18 Jiawei Zheng, Hanghai Hong, Feiyan Liu, Xiaoli Wang, Jingsong Su, Yonggui Liang, Shikai Wu

GradPruner: Gradient-Guided Layer Pruning Enabling Efficient Fine-Tuning and Inference for LLMs

Fine-tuning Large Language Models (LLMs) with downstream data is often considered time-consuming and expensive. Structured pruning methods are primarily employed to improve the inference efficiency of pre-trained models. Meanwhile, they…

Computation and Language · Computer Science 2026-01-28 Wei Huang, Anda Cheng, Yinggui Wang

Enhancing Parameter Efficiency and Generalization in Large-Scale Models: A Regularized and Masked Low-Rank Adaptation Approach

Large pre-trained models, such as large language models (LLMs), present significant resource challenges for fine-tuning due to their extensive parameter sizes, especially for applications in mobile systems. To address this, Low-Rank…

Machine Learning · Computer Science 2024-07-18 Yuzhu Mao, Siqi Ping, Zihao Zhao, Yang Liu, Wenbo Ding

Gradient-based Fine-Tuning through Pre-trained Model Regularization

Large pre-trained models have demonstrated extensive applications across various fields. However, fine-tuning these models for specific downstream tasks demands significant computational resources and storage. One fine-tuning method,…

Machine Learning · Computer Science 2025-07-02 Xuanbo Liu, Liu Liu, Fuxiang Wu, Fusheng Hao, Xianglong Liu

Unleashing the Potential of Large Language Models as Prompt Optimizers: Analogical Analysis with Gradient-based Model Optimizers

Automatic prompt optimization is an important approach to improving the performance of large language models (LLMs). Recent research demonstrates the potential of using LLMs as prompt optimizers, which can generate improved task prompts via…

Computation and Language · Computer Science 2025-01-28 Xinyu Tang, Xiaolei Wang, Wayne Xin Zhao, Siyuan Lu, Yaliang Li, Ji-Rong Wen

Step-by-Step Unmasking for Parameter-Efficient Fine-tuning of Large Language Models

Fine-tuning large language models (LLMs) on downstream tasks requires substantial computational resources. Selective PEFT, a class of parameter-efficient fine-tuning (PEFT) methodologies, aims to mitigate these computational challenges by…

Computation and Language · Computer Science 2025-06-24 Aradhye Agarwal, Suhas K Ramesh, Ayan Sengupta, Tanmoy Chakraborty

FineGates: LLMs Finetuning with Compression using Stochastic Gates

Large Language Models (LLMs), with billions of parameters, present significant challenges for full finetuning due to the high computational demands, memory requirements, and impracticality of many real-world applications. When faced with…

Machine Learning · Computer Science 2024-12-18 Jonathan Svirsky, Yehonathan Refael, Ofir Lindenbaum

Taming LLMs by Scaling Learning Rates with Gradient Grouping

Training large language models (LLMs) poses challenges due to their massive scale and heterogeneous architectures. While adaptive optimizers like AdamW help address gradient variations, they still struggle with efficient and effective…

Machine Learning · Computer Science 2025-06-03 Siyuan Li, Juanxi Tian, Zedong Wang, Xin Jin, Zicheng Liu, Wentao Zhang, Dan Xu

Improving Large Language Model Fine-tuning for Solving Math Problems

Despite their success in many natural language tasks, solving math problems remains a significant challenge for large language models (LLMs). A large gap exists between LLMs' pass-at-one and pass-at-N performance in solving math problems,…

Computation and Language · Computer Science 2023-10-17 Yixin Liu, Avi Singh, C. Daniel Freeman, John D. Co-Reyes, Peter J. Liu

Does LLM Focus on the Right Words? Mitigating Context Bias in LLM-based Recommenders

Large language models (LLMs), owing to their extensive open-domain knowledge and semantic reasoning capabilities, have been increasingly integrated into recommender systems (RS). However, a substantial gap remains between the pre-training…

Information Retrieval · Computer Science 2026-01-27 Bohao Wang, Jiawei Chen, Feng Liu, Changwang Zhang, Jun Wang, Canghong Jin, Chun Chen, Can Wang

Fine-Tuning or Fine-Failing? Debunking Performance Myths in Large Language Models

Large Language Models (LLMs) have the unique capability to understand and generate human-like text from input queries. When fine-tuned, these models show enhanced performance on domain-specific queries. OpenAI highlights the process of…

Computation and Language · Computer Science 2024-07-02 Scott Barnett, Zac Brannelly, Stefanus Kurniawan, Sheng Wong

Efficient Task Adaptation in Large Language Models via Selective Parameter Optimization

Large Language Models (LLMs) have demonstrated excellent performance in general language understanding, generation and other tasks. However, when fine-tuning for specific domain tasks, the general knowledge accumulated in the pre-training…

Computation and Language · Computer Science 2026-04-21 Weijie Wan, Jiangjiang Zhao