Related papers: MFTCoder: Boosting Code LLMs with Multitask Fine-T…

One Model, Many Skills: Parameter-Efficient Fine-Tuning for Multitask Code Analysis

Large language models have recently surpassed specialized systems on code generation, yet their effectiveness on other code-analysis tasks remains less clear. At the same time, multi-task learning offers a way to unify diverse objectives…

Software Engineering · Computer Science 2026-03-12 Amal Akli, Maxime Cordy, Mike Papadakis, Yves Le Traon

SynthCoder: A Synthetical Strategy to Tune LLMs for Code Completion

Code completion is a prominent application of Large Language Models (LLMs) in software engineering. Due to the near real-time response requirements of this task, base models with small to medium-sized parameters are typically employed,…

Software Engineering · Computer Science 2025-09-18 Dongjun Yu, Xiao Yan, Zhenrui Li, Jipeng Xiao, Haochuan He, Yongda Yu, Hao Zhang, Guoping Rong, Xiaobo Huang

EffiCoder: Enhancing Code Generation in Large Language Models through Efficiency-Aware Fine-tuning

As large language models (LLMs) play an increasingly important role in code generation, enhancing both correctness and efficiency has become crucial. Current methods primarily focus on correctness, often overlooking efficiency. To address…

Computation and Language · Computer Science 2025-06-17 Dong Huang, Guangtao Zeng, Jianbo Dai, Meng Luo, Han Weng, Yuhao Qing, Heming Cui, Zhijiang Guo, Jie M. Zhang

WaveCoder: Widespread And Versatile Enhancement For Code Large Language Models By Instruction Tuning

Recent work demonstrates that, after instruction tuning, Code Large Language Models (Code LLMs) can obtain impressive capabilities to address a wide range of code-related tasks. However, current instruction tuning methods for Code LLMs…

Computation and Language · Computer Science 2024-06-10 Zhaojian Yu, Xin Zhang, Ning Shang, Yangyu Huang, Can Xu, Yishujie Zhao, Wenxiang Hu, Qiufeng Yin

CodeT5+: Open Code Large Language Models for Code Understanding and Generation

Large language models (LLMs) pretrained on vast source code have achieved prominent progress in code intelligence. However, existing code LLMs have two main limitations in terms of architecture and pretraining tasks. First, they often adopt…

Computation and Language · Computer Science 2023-05-23 Yue Wang, Hung Le, Akhilesh Deepak Gotmare, Nghi D. Q. Bui, Junnan Li, Steven C. H. Hoi

Finetuning Large Language Models for Vulnerability Detection

This paper presents the results of finetuning large language models (LLMs) for the task of detecting vulnerabilities in source code. We leverage WizardCoder, a recent improvement of the state-of-the-art LLM StarCoder, and adapt it for…

Cryptography and Security · Computer Science 2024-07-30 Alexey Shestov, Rodion Levichev, Ravil Mussabayev, Evgeny Maslov, Anton Cheshkov, Pavel Zadorozhny

AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data

Open-source Large Language Models (LLMs) and their specialized variants, particularly Code LLMs, have recently delivered impressive performance. However, previous Code LLMs are typically fine-tuned on single-source data with limited quality…

Computation and Language · Computer Science 2025-02-04 Zifan Song, Yudong Wang, Wenwei Zhang, Kuikun Liu, Chengqi Lyu, Demin Song, Qipeng Guo, Hang Yan, Dahua Lin, Kai Chen, Cairong Zhao

MORepair: Teaching LLMs to Repair Code via Multi-Objective Fine-tuning

Within the realm of software engineering, specialized tasks on code, such as program repair, present unique challenges, necessitating fine-tuning Large language models (LLMs) to unlock state-of-the-art performance. Fine-tuning approaches…

Software Engineering · Computer Science 2025-09-23 Boyang Yang, Haoye Tian, Jiadong Ren, Hongyu Zhang, Jacques Klein, Tegawendé F. Bissyandé, Claire Le Goues, Shunfu Jin

CoBa: Convergence Balancer for Multitask Finetuning of Large Language Models

Multi-task learning (MTL) benefits the fine-tuning of large language models (LLMs) by providing a single model with improved performance and generalization ability across tasks, presenting a resource-efficient alternative to developing…

Computation and Language · Computer Science 2024-10-29 Zi Gong, Hang Yu, Cong Liao, Bingchang Liu, Chaoyu Chen, Jianguo Li

Code Review Automation Via Multi-task Federated LLM -- An Empirical Study

Code review is a crucial process before deploying code to production, as it validates the code, provides suggestions for improvements, and identifies errors such as missed edge cases. In projects with regular production releases, the effort…

Software Engineering · Computer Science 2024-12-23 Jahnavi Kumar, Sridhar Chimalakonda

Multi-task Code LLMs: Data Mix or Model Merge?

Recent research advocates deploying smaller, specialized code LLMs in agentic frameworks alongside frontier models, sparking interest in efficient strategies for multi-task learning that balance performance, constraints, and costs. We…

Computation and Language · Computer Science 2026-01-30 Mingzhi Zhu, Boris Sobolev, Rahul Krishna, Raju Pavuluri, Stacy Patterson, Michele Merler

MoTCoder: Elevating Large Language Models with Modular of Thought for Challenging Programming Tasks

Large Language Models (LLMs) have showcased impressive capabilities in handling straightforward programming tasks. However, their performance tends to falter when confronted with more challenging programming problems. We observe that…

Machine Learning · Computer Science 2025-04-01 Jingyao Li, Pengguang Chen, Bin Xia, Hong Xu, Jiaya Jia

Mixing It Up: The Cocktail Effect of Multi-Task Fine-Tuning on LLM Performance -- A Case Study in Finance

The application of large language models (LLMs) in domain-specific contexts, including finance, has expanded rapidly. Domain-specific LLMs are typically evaluated based on their performance in various downstream tasks relevant to the…

Artificial Intelligence · Computer Science 2024-12-06 Meni Brief, Oded Ovadia, Gil Shenderovitz, Noga Ben Yoash, Rachel Lemberg, Eitam Sheetrit

Parameter-Efficient Multi-Task Fine-Tuning in Code-Related Tasks

Large Language Models (LLMs) have proven highly effective in automating software engineering tasks, bridging natural language and code semantics to achieve notable results in code generation and summarization. However, their scale incurs…

Software Engineering · Computer Science 2026-01-22 Md Zahidul Haque, Saima Afrin, Antonio Mastropaolo

Improving the Robustness of Large Language Models for Code Tasks via Fine-tuning with Perturbed Data

Context: In the fast-paced evolution of software development, Large Language Models (LLMs) have become indispensable tools for tasks such as code generation, completion, analysis, and bug fixing. Ensuring the robustness of these models…

Software Engineering · Computer Science 2026-02-13 Yang Liu, Armstrong Foundjem, Xingfang Wu, Heng Li, Foutse Khomh

ConceptCoder: Improve Code Reasoning via Concept Learning

Large language models (LLMs) have shown promising results for software engineering applications, but still struggle with code reasoning tasks such as vulnerability detection (VD). We introduce ConceptCoder, a fine-tuning method that…

Software Engineering · Computer Science 2026-03-25 Md Mahbubur Rahman, Hengbo Tong, Wei Le

MCCoder: Streamlining Motion Control with LLM-Assisted Code Generation and Rigorous Verification

Large Language Models (LLMs) have demonstrated significant potential in code generation. However, in the factory automation sector, particularly motion control, manual programming, alongside inefficient and unsafe debugging practices,…

Artificial Intelligence · Computer Science 2025-07-03 Yin Li, Liangwei Wang, Shiyuan Piao, Boo-Ho Yang, Ziyue Li, Wei Zeng, Fugee Tsung

Beyond Single-Task: Robust Multi-Task Length Generalization for LLMs

Length generalization, the ability to solve problems longer than those seen during training, remains a critical challenge for large language models (LLMs). Previous work modifies positional encodings (PEs) and data formats to improve length…

Computation and Language · Computer Science 2025-05-20 Yi Hu, Shijia Kang, Haotong Yang, Haotian Xu, Muhan Zhang

DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning

Code Large Language Models (Code LLMs) have demonstrated outstanding performance in code-related tasks. Several instruction tuning approaches have been proposed to boost the code generation performance of pre-trained Code LLMs. In this…

Computation and Language · Computer Science 2024-02-15 Yejie Wang, Keqing He, Guanting Dong, Pei Wang, Weihao Zeng, Muxi Diao, Yutao Mou, Mengdi Zhang, Jingang Wang, Xunliang Cai, Weiran Xu

InterTrans: Leveraging Transitive Intermediate Translations to Enhance LLM-based Code Translation

Code translation aims to convert a program from one programming language (PL) to another. This long-standing software engineering task is crucial for modernizing legacy systems, ensuring cross-platform compatibility, enhancing performance,…

Software Engineering · Computer Science 2024-11-06 Marcos Macedo, Yuan Tian, Pengyu Nie, Filipe R. Cogo, Bram Adams