Related papers: Rethinking Code Refinement: Learning to Judge Code…

LLM4EFFI: Leveraging Large Language Models to Enhance Code Efficiency and Correctness

Large Language Models (LLMs), particularly Code LLMs, have demonstrated impressive performance in code generation. Current research primarily focuses on the correctness of generated code, while efficiency remains less explored. Recent works…

Software Engineering · Computer Science 2025-02-27 Tong Ye , Weigang Huang , Xuhong Zhang , Tengfei Ma , Peiyu Liu , Jianwei Yin , Wenhai Wang

On Evaluating the Efficiency of Source Code Generated by LLMs

Recent years have seen the remarkable capabilities of large language models (LLMs) for code generation. Different from existing work that evaluate the correctness of the code generated by LLMs, we propose to further evaluate its efficiency.…

Software Engineering · Computer Science 2024-04-10 Changan Niu , Ting Zhang , Chuanyi Li , Bin Luo , Vincent Ng

Code Evolution Graphs: Understanding Large Language Model Driven Design of Algorithms

Large Language Models (LLMs) have demonstrated great promise in generating code, especially when used inside an evolutionary computation framework to iteratively optimize the generated algorithms. However, in some cases they fail to…

Neural and Evolutionary Computing · Computer Science 2025-03-24 Niki van Stein , Anna V. Kononova , Lars Kotthoff , Thomas Bäck

A Survey on Evaluating Large Language Models in Code Generation Tasks

This paper provides a comprehensive review of the current methods and metrics used to evaluate the performance of Large Language Models (LLMs) in code generation tasks. With the rapid growth in demand for automated software development,…

Software Engineering · Computer Science 2025-03-05 Liguo Chen , Qi Guo , Hongrui Jia , Zhengran Zeng , Xin Wang , Yijiang Xu , Jian Wu , Yidong Wang , Qing Gao , Jindong Wang , Wei Ye , Shikun Zhang

Combining LLM Code Generation with Formal Specifications and Reactive Program Synthesis

In the past few years, Large Language Models (LLMs) have exploded in usefulness and popularity for code generation tasks. However, LLMs still struggle with accuracy and are unsuitable for high-risk applications without additional oversight…

Software Engineering · Computer Science 2024-10-29 William Murphy , Nikolaus Holzer , Feitong Qiao , Leyi Cui , Raven Rothkopf , Nathan Koenig , Mark Santolucito

A Performance Study of LLM-Generated Code on Leetcode

This study evaluates the efficiency of code generation by Large Language Models (LLMs) and measures their performance against human-crafted solutions using a dataset from Leetcode. We compare 18 LLMs, considering factors such as model…

Software Engineering · Computer Science 2024-08-01 Tristan Coignion , Clément Quinton , Romain Rouvoy

Performance-Aligned LLMs for Generating Fast Code

Optimizing scientific software is a difficult task because codebases are often large and complex, and performance can depend upon several factors including the algorithm, its implementation, and hardware among others. Causes of poor…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-04-30 Daniel Nichols , Pranav Polasam , Harshitha Menon , Aniruddha Marathe , Todd Gamblin , Abhinav Bhatele

Beyond Functional Correctness: Investigating Coding Style Inconsistencies in Large Language Models

Large language models (LLMs) have brought a paradigm shift to the field of code generation, offering the potential to enhance the software development process. However, previous research mainly focuses on the accuracy of code generation,…

Software Engineering · Computer Science 2025-06-24 Yanlin Wang , Tianyue Jiang , Mingwei Liu , Jiachi Chen , Mingzhi Mao , Xilin Liu , Yuchi Ma , Zibin Zheng

Exploring the Potential of Large Language Models in Fine-Grained Review Comment Classification

Code review is a crucial practice in software development. As code review nowadays is lightweight, various issues can be identified, and sometimes, they can be trivial. Research has investigated automated approaches to classify review…

Software Engineering · Computer Science 2025-08-14 Linh Nguyen , Chunhua Liu , Hong Yi Lin , Patanamon Thongtanunam

CodeMind: Evaluating Large Language Models for Code Reasoning

Large Language Models (LLMs) have been widely used to automate programming tasks. Their capabilities have been evaluated by assessing the quality of generated code through tests or proofs. The extent to which they can reason about code is a…

Software Engineering · Computer Science 2026-04-08 Changshu Liu , Yang Chen , Reyhaneh Jabbarvand

Self-Edit: Fault-Aware Code Editor for Code Generation

Large language models (LLMs) have demonstrated an impressive ability to generate codes on competitive programming tasks. However, with limited sample numbers, LLMs still suffer from poor accuracy. Inspired by the process of human…

Software Engineering · Computer Science 2023-09-12 Kechi Zhang , Zhuo Li , Jia Li , Ge Li , Zhi Jin

Should AI Optimize Your Code? A Comparative Study of Classical Optimizing Compilers Versus Current Large Language Models

Traditional optimizing compilers have played an important role in adapting to the growing complexity of modern software systems. The need for efficient parallel programming in current architectures requires strong optimization techniques.…

Artificial Intelligence · Computer Science 2025-04-03 Miguel Romero Rosas , Miguel Torres Sanchez , Rudolf Eigenmann

Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs

In large language models (LLMs), code and reasoning reinforce each other: code offers an abstract, modular, and logic-driven structure that supports reasoning, while reasoning translates high-level goals into smaller, executable steps that…

Computation and Language · Computer Science 2025-02-27 Dayu Yang , Tianyang Liu , Daoan Zhang , Antoine Simoulin , Xiaoyi Liu , Yuwei Cao , Zhaopu Teng , Xin Qian , Grey Yang , Jiebo Luo , Julian McAuley

Code Review Without Borders: Evaluating Synthetic vs. Real Data for Review Recommendation

Automating the decision of whether a code change requires manual review is vital for maintaining software quality in modern development workflows. However, the emergence of new programming languages and frameworks creates a critical…

Software Engineering · Computer Science 2025-09-08 Yogev Cohen , Dudi Ohayon , Romy Somkin , Yehudit Aperstein , Alexander Apartsin

Evaluating the Generalization Capabilities of Large Language Models on Code Reasoning

We assess how the code reasoning abilities of large language models (LLMs) generalize to different kinds of programs. We present techniques for obtaining in- and out-of-distribution programs with different characteristics: code sampled from…

Software Engineering · Computer Science 2025-04-09 Rem Yang , Julian Dai , Nikos Vasilakis , Martin Rinard

From Effectiveness to Efficiency: Uncovering Linguistic Bias in Large Language Model-based Code Generation

Large Language Models (LLMs) have demonstrated promising capabilities for code generation. While existing benchmarks evaluate the correctness and efficiency of LLM-generated code, the potential linguistic bias - where code quality varies…

Software Engineering · Computer Science 2025-05-02 Weipeng Jiang , Xuanqi Gao , Juan Zhai , Shiqing Ma , Xiaoyu Zhang , Ziyan Lei , Chao Shen

AI-powered Code Review with LLMs: Early Results

In this paper, we present a novel approach to improving software quality and efficiency through a Large Language Model (LLM)-based model designed to review code and identify potential issues. Our proposed LLM-based AI agent model is trained…

Software Engineering · Computer Science 2025-12-11 Zeeshan Rasheed , Malik Abdul Sami , Muhammad Waseem , Kai-Kristian Kemell , Xiaofeng Wang , Anh Nguyen , Kari Systä , Pekka Abrahamsson

How Efficient is LLM-Generated Code? A Rigorous & High-Standard Benchmark

The emergence of large language models (LLMs) has significantly pushed the frontiers of program synthesis. Advancement of LLM-based program synthesis calls for a thorough evaluation of LLM-generated code. Most evaluation frameworks focus on…

Software Engineering · Computer Science 2025-02-20 Ruizhong Qiu , Weiliang Will Zeng , James Ezick , Christopher Lott , Hanghang Tong

LLMLOOP: Improving LLM-Generated Code and Tests through Automated Iterative Feedback Loops

Large Language Models (LLMs) are showing remarkable performance in generating source code, yet the generated code often has issues like compilation errors or incorrect code. Researchers and developers often face wasted effort in…

Software Engineering · Computer Science 2026-03-26 Ravin Ravi , Dylan Bradshaw , Stefano Ruberto , Gunel Jahangirova , Valerio Terragni

Self-planning Code Generation with Large Language Models

Although large language models (LLMs) have demonstrated impressive ability in code generation, they are still struggling to address the complicated intent provided by humans. It is widely acknowledged that humans typically employ planning…

Software Engineering · Computer Science 2025-10-21 Xue Jiang , Yihong Dong , Lecheng Wang , Zheng Fang , Qiwei Shang , Ge Li , Zhi Jin , Wenpin Jiao