Related papers: LLM-Assisted Code Cleaning For Training Accurate C…

200 papers

Recent work targeting large language models (LLMs) for code generation demonstrated that increasing the amount of training data through synthetic code generation often leads to exceptional performance. In this paper we explore data pruning…

Software Engineering · Computer Science 2024-07-09 Yun-Da Tsai, Mingjie Liu, Haoxing Ren

Large language models (LLMs) have demonstrated significant potential in code generation tasks. However, there remains a performance gap between open-source and closed-source models. To address this gap, existing approaches typically…

Computation and Language · Computer Science 2025-04-18 Weijie Lv, Xuan Xia, Sheng-Jun Huang

Large Language Models (LLMs) have revolutionized code generation but require significant resources and often over-generalize, limiting their task-specific efficiency. Fine-tuning smaller, open-source LLMs provides a cost-effective…

Computation and Language · Computer Science 2025-06-27 Leitian Tao, Xiang Chen, Tong Yu, Tung Mai, Ryan Rossi, Yixuan Li, Saayan Mitra

Optimizing scientific software is a difficult task because codebases are often large and complex, and performance can depend upon several factors including the algorithm, its implementation, and hardware among others. Causes of poor…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-04-30 Daniel Nichols, Pranav Polasam, Harshitha Menon, Aniruddha Marathe, Todd Gamblin, Abhinav Bhatele

The advent of Large Language Models (LLMs) has significantly advanced the field of automated code generation. LLMs rely on large and diverse datasets to learn syntax, semantics, and usage patterns of programming languages. For low-resource…

Software Engineering · Computer Science 2025-02-03 Alessandro Giagnorio, Alberto Martin-Lopez, Gabriele Bavota

Code generation, the automatic creation of source code from natural language descriptions, has garnered significant attention due to its potential to streamline software development. Inspired by research that links task-personality…

Software Engineering · Computer Science 2025-05-30 Yaoqi Guo, Zhenpeng Chen, Jie M. Zhang, Yang Liu, Yun Ma

Large Language Models (LLMs) have demonstrated their remarkable capabilities in numerous fields. This survey focuses on how LLMs empower users, regardless of their technical background, to use human languages to automatically generate…

Software Engineering · Computer Science 2025-04-03 Nam Huynh, Beiyu Lin

Large Language Models (LLMs) and pre-trained Language Models (LMs) have achieved impressive success on many software engineering tasks (e.g., code completion and code generation). By leveraging huge existing code corpora (e.g., GitHub),…

Software Engineering · Computer Science 2025-01-16 Xin Yin, Chao Ni, Xiaodan Xu, Xinrui Li, Xiaohu Yang

Code generation problems differ from common natural language problems - they require matching the exact syntax of the target language, identifying happy paths and edge cases, paying attention to numerous small details in the problem spec,…

Machine Learning · Computer Science 2024-01-17 Tal Ridnik, Dedy Kredo, Itamar Friedman

Large Language Models (LLMs), particularly Code LLMs, have demonstrated impressive performance in code generation. Current research primarily focuses on the correctness of generated code, while efficiency remains less explored. Recent works…

Software Engineering · Computer Science 2025-02-27 Tong Ye, Weigang Huang, Xuhong Zhang, Tengfei Ma, Peiyu Liu, Jianwei Yin, Wenhai Wang

Many applications today use large language models for code generation; however, production systems have strict latency requirements that can be difficult to meet with large models. Small language models with a few billion parameters are…

Machine Learning · Computer Science 2026-04-14 Renjini R. Nair, Damian K. Kowalczyk, Marco Gaudesi, Chhaya Methani

Background: The rise of Large Language Models (LLMs) in software development has opened new possibilities for code generation. Despite the widespread use of this technology, it remains unclear how well LLMs generate code solutions in terms…

Software Engineering · Computer Science 2025-08-04 Alfred Santa Molison, Marcia Moraes, Glaucia Melo, Fabio Santos, Wesley K. G. Assuncao

Large Language Models (LLMs) have been widely adopted in commercial code completion engines, significantly enhancing coding efficiency and productivity. However, LLMs may generate code with quality issues that violate coding standards and…

Software Engineering · Computer Science 2025-03-20 Yuan Jiang, Yujian Zhang, Liang Lu, Christoph Treude, Xiaohong Su, Shan Huang, Tiantian Wang

Recent advancements in Large Language Models (LLMs) have significantly improved their capabilities in natural language processing and code synthesis, enabling more complex applications across different fields. This paper explores the…

Cryptography and Security · Computer Science 2024-10-30 Mohammad Setak, Pooria Madani

Sequence-to-sequence models have been used to transform erroneous programs into correct ones when trained with a large enough dataset. Some recent studies also demonstrated strong empirical evidence that code review could improve the…

Machine Learning · Computer Science 2023-07-25 Rishov Paul, Md. Mohib Hossain, Mohammed Latif Siddiq, Masum Hasan, Anindya Iqbal, Joanna C. S. Santos

Data profiling is critical in machine learning for generating descriptive statistics, supporting both deeper understanding and downstream tasks like data valuation and curation. This work addresses profiling specifically in the context of…

Software Engineering · Computer Science 2025-03-21 Pankaj Thorat, Adnan Qidwai, Adrija Dhar, Aishwariya Chakraborty, Anand Eswaran, Hima Patel, Praveen Jayachandran

Large language models (LLMs) are becoming increasingly better at a wide range of Natural Language Processing tasks (NLP), such as text generation and understanding. Recently, these models have extended their capabilities to coding tasks,…

Machine Learning · Computer Science 2024-10-23 Nishat Raihan, Mohammed Latif Siddiq, Joanna C. S. Santos, Marcos Zampieri

Code review is essential for maintaining software quality but often time-consuming and cognitively demanding, especially in industrial environments. Recent advancements in language models (LMs) have opened new avenues for automating core…

Software Engineering · Computer Science 2025-10-24 Igli Begolli, Meltem Aksoy, Daniel Neider

Large Language Models (LLMs) are widely used in software engineering to generate, complete, translate, and fix code, improving developer productivity. While most research focuses on the energy consumption and carbon emissions of model…

Software Engineering · Computer Science 2026-04-15 Sabiya Banu Masthan Ali, Oussema Kirmani, Aroosa Hameed, Syed Muhammad Danish, Gautam Srivastava

Large language models (LLMs) have achieved remarkable progress in code generation, largely driven by the availability of high-quality code datasets for effective training. To further improve data quality, numerous training data optimization…

Software Engineering · Computer Science 2026-01-01 Shiqi Kuang, Zhao Tian, Tao Xiao, Dong Wang, Junjie Chen