Related papers: Evaluating Language Models for Efficient Code Gene…

DFPE: A Diverse Fingerprint Ensemble for Enhancing LLM Performance

Large Language Models (LLMs) have shown remarkable capabilities across various natural language processing tasks but often struggle to excel uniformly in diverse or complex domains. We propose a novel ensemble method - Diverse Fingerprint…

Machine Learning · Computer Science 2025-02-10 Seffi Cohen , Niv Goldshlager , Nurit Cohen-Inger , Bracha Shapira , Lior Rokach

Multi-Programming Language Ensemble for Code Generation in Large Language Model

Large language models (LLMs) have significantly improved code generation, particularly in one-pass code generation. However, most existing approaches focus solely on generating code in a single programming language, overlooking the…

Computation and Language · Computer Science 2024-09-09 Tengfei Xue , Xuefeng Li , Tahir Azim , Roman Smirnov , Jianhui Yu , Arash Sadrieh , Babak Pahlavan

CodePDE: An Inference Framework for LLM-driven PDE Solver Generation

Partial differential equations (PDEs) are fundamental to modeling physical systems, yet solving them remains a complex challenge. Traditional numerical solvers rely on expert knowledge to implement and are computationally expensive, while…

Machine Learning · Computer Science 2026-02-24 Shanda Li , Tanya Marwah , Junhong Shen , Weiwei Sun , Andrej Risteski , Yiming Yang , Ameet Talwalkar

On Evaluating the Efficiency of Source Code Generated by LLMs

Recent years have seen the remarkable capabilities of large language models (LLMs) for code generation. Different from existing work that evaluate the correctness of the code generated by LLMs, we propose to further evaluate its efficiency.…

Software Engineering · Computer Science 2024-04-10 Changan Niu , Ting Zhang , Chuanyi Li , Bin Luo , Vincent Ng

CRPE: Expanding The Reasoning Capability of Large Language Model for Code Generation

We introduce CRPE (Code Reasoning Process Enhancer), an innovative three-stage framework for data synthesis and model training that advances the development of sophisticated code reasoning capabilities in large language models (LLMs).…

Software Engineering · Computer Science 2025-05-19 Ningxin Gui , Qianghuai Jia , Feijun Jiang , Yuling Jiao , dechun wang , Jerry Zhijian Yang

SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?

Code performance optimization is paramount in real-world software engineering and critical for production-level systems. While Large Language Models (LLMs) have demonstrated impressive capabilities in code generation and bug fixing, their…

Software Engineering · Computer Science 2025-07-17 Xinyi He , Qian Liu , Mingzhe Du , Lin Yan , Zhijie Fan , Yiming Huang , Zejian Yuan , Zejun Ma

Holistic Evaluation of State-of-the-Art LLMs for Code Generation

This study presents a comprehensive empirical evaluation of six state-of-the-art large language models (LLMs) for code generation, including both general-purpose and code-specialized models. Using a dataset of 944 real-world LeetCode…

Software Engineering · Computer Science 2025-12-23 Le Zhang , Suresh Kothari

PerfCodeBench: Benchmarking LLMs for System-Level High-Performance Code Optimization

Large language models (LLMs) can often generate functionally correct code, but their ability to produce efficient implementations for performance-critical systems tasks remains limited. Existing code benchmarks mainly emphasize correctness…

Software Engineering · Computer Science 2026-05-18 Huihao Jing , Wenbin Hu , Haochen Shi , Hanyu Yang , Sirui Zhang , Shaojin Chen , Haoran Li , Yangqiu Song

COFFE: A Code Efficiency Benchmark for Code Generation

Code generation has largely improved development efficiency in the era of large language models (LLMs). With the ability to follow instructions, current LLMs can be prompted to generate code solutions given detailed descriptions in natural…

Software Engineering · Computer Science 2025-02-06 Yun Peng , Jun Wan , Yichen Li , Xiaoxue Ren

LLM4EFFI: Leveraging Large Language Models to Enhance Code Efficiency and Correctness

Large Language Models (LLMs), particularly Code LLMs, have demonstrated impressive performance in code generation. Current research primarily focuses on the correctness of generated code, while efficiency remains less explored. Recent works…

Software Engineering · Computer Science 2025-02-27 Tong Ye , Weigang Huang , Xuhong Zhang , Tengfei Ma , Peiyu Liu , Jianwei Yin , Wenhai Wang

A Survey on Evaluating Large Language Models in Code Generation Tasks

This paper provides a comprehensive review of the current methods and metrics used to evaluate the performance of Large Language Models (LLMs) in code generation tasks. With the rapid growth in demand for automated software development,…

Software Engineering · Computer Science 2025-03-05 Liguo Chen , Qi Guo , Hongrui Jia , Zhengran Zeng , Xin Wang , Yijiang Xu , Jian Wu , Yidong Wang , Qing Gao , Jindong Wang , Wei Ye , Shikun Zhang

DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning

Code Large Language Models (Code LLMs) have demonstrated outstanding performance in code-related tasks. Several instruction tuning approaches have been proposed to boost the code generation performance of pre-trained Code LLMs. In this…

Computation and Language · Computer Science 2024-02-15 Yejie Wang , Keqing He , Guanting Dong , Pei Wang , Weihao Zeng , Muxi Diao , Yutao Mou , Mengdi Zhang , Jingang Wang , Xunliang Cai , Weiran Xu

Efficient Code LLM Training via Distribution-Consistent and Diversity-Aware Data Selection

Recent advancements in large language models (LLMs) have significantly improved code generation and program comprehension, accelerating the evolution of software engineering. Current methods primarily enhance model performance by leveraging…

Computation and Language · Computer Science 2025-07-04 Weijie Lyu , Sheng-Jun Huang , Xuan Xia

MultiPL-E: A Scalable and Extensible Approach to Benchmarking Neural Code Generation

Large language models have demonstrated the ability to generate both natural language and programming language text. Such models open up the possibility of multi-language code generation: could code generation models generalize knowledge…

Machine Learning · Computer Science 2022-12-20 Federico Cassano , John Gouwar , Daniel Nguyen , Sydney Nguyen , Luna Phipps-Costin , Donald Pinckney , Ming-Ho Yee , Yangtian Zi , Carolyn Jane Anderson , Molly Q Feldman , Arjun Guha , Michael Greenberg , Abhinav Jangda

FasterPy: An LLM-based Code Execution Efficiency Optimization Framework

Code often suffers from performance bugs. These bugs necessitate the research and practice of code optimization. Traditional rule-based methods rely on manually designing and maintaining rules for specific performance bugs (e.g., redundant…

Software Engineering · Computer Science 2025-12-30 Yue Wu , Minghao Han , Ruiyin Li , Peng Liang , Amjed Tahir , Zengyang Li , Qiong Feng , Mojtaba Shahin

ECCO: Can We Improve Model-Generated Code Efficiency Without Sacrificing Functional Correctness?

Although large language models (LLMs) have been largely successful in generating functionally correct programs, conditioning models to produce efficient solutions while ensuring correctness remains a challenge. Further, unreliability in…

Computation and Language · Computer Science 2024-10-11 Siddhant Waghjale , Vishruth Veerendranath , Zora Zhiruo Wang , Daniel Fried

Evaluating LLM-Generated Code: A Benchmark and Developer Study

Code generation is one of the tasks for which the use of Large Language Models is widely adopted and highly successful. Given this popularity, there are many benchmarks dedicated to code generation that can help select the best model.…

Software Engineering · Computer Science 2026-05-12 Joanna Szych , Anne Schwerk

EvoCodeBench: A Human-Performance Benchmark for Self-Evolving LLM-Driven Coding Systems

As large language models (LLMs) continue to advance in programming tasks, LLM-driven coding systems have evolved from one-shot code generation into complex systems capable of iterative improvement during inference. However, existing code…

Software Engineering · Computer Science 2026-02-12 Wentao Zhang , Jianfeng Wang , Liheng Liang , Yilei Zhao , HaiBin Wen , Zhe Zhao

Effective Length Extrapolation via Dimension-Wise Positional Embeddings Manipulation

Large Language Models (LLMs) often struggle to process and generate coherent context when the number of input tokens exceeds the pre-trained length. Recent advancements in long-context extension have significantly expanded the context…

Computation and Language · Computer Science 2025-04-29 Yi Lu , Wanxu Zhao , Xin Zhou , Chenxin An , Chenglong Wang , Shuo Li , Yuming Yang , Jun Zhao , Tao Ji , Tao Gui , Qi Zhang , Xuanjing Huang

DSCodeBench: A Realistic Benchmark for Data Science Code Generation

We introduce DSCodeBench, a new benchmark designed to evaluate large language models (LLMs) on complicated and realistic data science code generation tasks. DSCodeBench consists of 1,000 carefully constructed problems sourced from realistic…

Software Engineering · Computer Science 2025-11-18 Shuyin Ouyang , Dong Huang , Jingwen Guo , Zeyu Sun , Qihao Zhu , Jie M. Zhang