Related papers: CodeLutra: Boosting LLM Code Generation via Prefer…

Enhancing Code Generation for Low-Resource Languages: No Silver Bullet

The advent of Large Language Models (LLMs) has significantly advanced the field of automated code generation. LLMs rely on large and diverse datasets to learn syntax, semantics, and usage patterns of programming languages. For low-resource…

Software Engineering · Computer Science 2025-02-03 Alessandro Giagnorio , Alberto Martin-Lopez , Gabriele Bavota

Prompting and Fine-tuning Large Language Models for Automated Code Review Comment Generation

Generating accurate code review comments remains a significant challenge due to the inherently diverse and non-unique nature of the task output. Large language models pretrained on both programming and natural language data tend to perform…

Software Engineering · Computer Science 2024-11-18 Md. Asif Haider , Ayesha Binte Mostofa , Sk. Sabit Bin Mosaddek , Anindya Iqbal , Toufique Ahmed

LLM-Assisted Code Cleaning For Training Accurate Code Generators

Natural language to code generation is an important application area of LLMs and has received wide attention from the community. The majority of relevant studies have exclusively concentrated on increasing the quantity and functional…

Machine Learning · Computer Science 2023-11-28 Naman Jain , Tianjun Zhang , Wei-Lin Chiang , Joseph E. Gonzalez , Koushik Sen , Ion Stoica

CodePMP: Scalable Preference Model Pretraining for Large Language Model Reasoning

Large language models (LLMs) have made significant progress in natural language understanding and generation, driven by scalable pretraining and advanced finetuning. However, enhancing reasoning abilities in LLMs, particularly via…

Artificial Intelligence · Computer Science 2025-05-30 Huimu Yu , Xing Wu , Haotian Xu , Debing Zhang , Songlin Hu

Data-efficient LLM Fine-tuning for Code Generation

Large language models (LLMs) have demonstrated significant potential in code generation tasks. However, there remains a performance gap between open-source and closed-source models. To address this gap, existing approaches typically…

Computation and Language · Computer Science 2025-04-18 Weijie Lv , Xuan Xia , Sheng-Jun Huang

Enhancing High-Quality Code Generation in Large Language Models with Comparative Prefix-Tuning

Large Language Models (LLMs) have been widely adopted in commercial code completion engines, significantly enhancing coding efficiency and productivity. However, LLMs may generate code with quality issues that violate coding standards and…

Software Engineering · Computer Science 2025-03-20 Yuan Jiang , Yujian Zhang , Liang Lu , Christoph Treude , Xiaohong Su , Shan Huang , Tiantian Wang

Does Few-Shot Learning Help LLM Performance in Code Synthesis?

Large language models (LLMs) have made significant strides at code generation through improved model design, training, and chain-of-thought. However, prompt-level optimizations remain an important yet under-explored aspect of LLMs for…

Software Engineering · Computer Science 2024-12-05 Derek Xu , Tong Xie , Botao Xia , Haoyu Li , Yunsheng Bai , Yizhou Sun , Wei Wang

LLM4EFFI: Leveraging Large Language Models to Enhance Code Efficiency and Correctness

Large Language Models (LLMs), particularly Code LLMs, have demonstrated impressive performance in code generation. Current research primarily focuses on the correctness of generated code, while efficiency remains less explored. Recent works…

Software Engineering · Computer Science 2025-02-27 Tong Ye , Weigang Huang , Xuhong Zhang , Tengfei Ma , Peiyu Liu , Jianwei Yin , Wenhai Wang

Code Evolution Graphs: Understanding Large Language Model Driven Design of Algorithms

Large Language Models (LLMs) have demonstrated great promise in generating code, especially when used inside an evolutionary computation framework to iteratively optimize the generated algorithms. However, in some cases they fail to…

Neural and Evolutionary Computing · Computer Science 2025-03-24 Niki van Stein , Anna V. Kononova , Lars Kotthoff , Thomas Bäck

Code Generation with Small Language Models: A Codeforces-Based Study

Large Language Models (LLMs) demonstrate capabilities in code generation, potentially boosting developer productivity. However, their adoption remains limited by high computational costs, among other factors. Small Language Models (SLMs)…

Software Engineering · Computer Science 2025-09-23 Débora Souza , Rohit Gheyi , Lucas Albuquerque , Gustavo Soares , Márcio Ribeiro

Fine-tuning Large Language Model for Automated Algorithm Design

The integration of large language models (LLMs) into automated algorithm design has shown promising potential. A prevalent approach embeds LLMs within search routines to iteratively generate and refine candidate algorithms. However, most…

Machine Learning · Computer Science 2026-05-20 Fei Liu , Rui Zhang , Xi Lin , Zhichao Lu , Qingfu Zhang

Rethinking Code Refinement: Learning to Judge Code Efficiency

Large Language Models (LLMs) have demonstrated impressive capabilities in understanding and generating codes. Due to these capabilities, many recent methods are proposed to automatically refine the codes with LLMs. However, we should…

Software Engineering · Computer Science 2024-10-31 Minju Seo , Jinheon Baek , Sung Ju Hwang

Integrating Symbolic Execution into the Fine-Tuning of Code-Generating LLMs

Code-generating Large Language Models (LLMs) have become essential tools in modern software development, enhancing productivity and accelerating development. This paper aims to investigate the fine-tuning of code-generating LLMs using…

Software Engineering · Computer Science 2025-05-06 Marina Sakharova , Abhinav Anand , Mira Mezini

CodeJudge: Evaluating Code Generation with Large Language Models

Large Language Models (LLMs) have shown promising performance in code generation. However, how to reliably evaluate code generated by LLMs remains an unresolved problem. This paper presents CodeJudge, a code evaluation framework that…

Machine Learning · Computer Science 2024-10-04 Weixi Tong , Tianyi Zhang

LLaMoCo: Instruction Tuning of Large Language Models for Optimization Code Generation

Recent research explores optimization using large language models (LLMs) by either iteratively seeking next-step solutions from LLMs or directly prompting LLMs for an optimizer. However, these approaches exhibit inherent limitations,…

Optimization and Control · Mathematics 2024-03-06 Zeyuan Ma , Hongshu Guo , Jiacheng Chen , Guojun Peng , Zhiguang Cao , Yining Ma , Yue-Jiao Gong

Exploring the Potential of Llama Models in Automated Code Refinement: A Replication Study

Code reviews are an integral part of software development and have been recognized as a crucial practice for minimizing bugs and favouring higher code quality. They serve as an important checkpoint before committing code and play an…

Software Engineering · Computer Science 2024-12-05 Genevieve Caumartin , Qiaolin Qin , Sharon Chatragadda , Janmitsinh Panjrolia , Heng Li , Diego Elias Costa

Refine-n-Judge: Curating High-Quality Preference Chains for LLM-Fine-Tuning

Large Language Models (LLMs) have demonstrated remarkable progress through preference-based fine-tuning, which critically depends on the quality of the underlying training data. While human feedback is essential for improving data quality,…

Artificial Intelligence · Computer Science 2025-10-31 Derin Cayir , Renjie Tao , Rashi Rungta , Kai Sun , Sean Chen , Haidar Khan , Minseok Kim , Julia Reinspach , Yue Liu

Code Repair with LLMs gives an Exploration-Exploitation Tradeoff

Iteratively improving and repairing source code with large language models (LLMs), known as refinement, has emerged as a popular way of generating programs that would be too complex to construct in one shot. Given a bank of test cases,…

Software Engineering · Computer Science 2024-10-31 Hao Tang , Keya Hu , Jin Peng Zhou , Sicheng Zhong , Wei-Long Zheng , Xujie Si , Kevin Ellis

Evaluating Instruction-Tuned Large Language Models on Code Comprehension and Generation

In this work, we evaluate 10 open-source instructed LLMs on four representative code comprehension and generation tasks. We have the following main findings. First, for the zero-shot setting, instructed LLMs are very competitive on code…

Computation and Language · Computer Science 2023-08-03 Zhiqiang Yuan , Junwei Liu , Qiancheng Zi , Mingwei Liu , Xin Peng , Yiling Lou

Performance-Aligned LLMs for Generating Fast Code

Optimizing scientific software is a difficult task because codebases are often large and complex, and performance can depend upon several factors including the algorithm, its implementation, and hardware among others. Causes of poor…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-04-30 Daniel Nichols , Pranav Polasam , Harshitha Menon , Aniruddha Marathe , Todd Gamblin , Abhinav Bhatele