Related papers: Compilable Neural Code Generation with Compiler Fe…

Input-Gen: Guided Generation of Stateful Inputs for Testing, Tuning, and Training

The size and complexity of software applications are increasing at an accelerating pace. Source code repositories (along with their dependencies) require vast amounts of labor to keep them tested, maintained, and up to date. As the…

Software Engineering · Computer Science 2024-06-14 Ivan R. Ivanov, Joachim Meyer, Aiden Grossman, William S. Moses, Johannes Doerfert

Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback

Large Language Models (LLMs) have shown remarkable progress in automated code generation. Yet, LLM-generated code may contain errors in API usage, class, data structure, or missing project-specific information. As much of this…

Computation and Language · Computer Science 2024-06-12 Zhangqian Bi, Yao Wan, Zheng Wang, Hongyu Zhang, Batu Guan, Fangxin Lu, Zili Zhang, Yulei Sui, Hai Jin, Xuanhua Shi

CodeCoR: An LLM-Based Self-Reflective Multi-Agent Framework for Code Generation

Code generation aims to automatically produce code that fulfills requirements written in natural language. Large Language Models (LLMs) like ChatGPT have demonstrated promising effectiveness in this area. Nonetheless, these LLMs often fail…

Software Engineering · Computer Science 2025-01-15 Ruwei Pan, Hongyu Zhang, Chao Liu

CompCodeVet: A Compiler-guided Validation and Enhancement Approach for Code Dataset

Large language models (LLMs) have become increasingly prominent in academia and industry due to their remarkable performance in diverse applications. As these models evolve with increasing parameters, they excel in tasks like sentiment…

Machine Learning · Computer Science 2023-11-14 Le Chen, Arijit Bhattacharjee, Nesreen K. Ahmed, Niranjan Hasabnis, Gal Oren, Bin Lei, Ali Jannesari

Compiler generated feedback for Large Language Models

We introduce a novel paradigm in compiler optimization powered by Large Language Models with compiler feedback to optimize the code size of LLVM assembly. The model takes unoptimized LLVM IR as input and produces optimized IR, the best…

Programming Languages · Computer Science 2024-03-25 Dejan Grubisic, Chris Cummins, Volker Seeker, Hugh Leather

On the Reliability and Explainability of Language Models for Program Generation

Recent studies have adopted pre-trained language models, such as CodeT5 and CodeGPT, for automated program generation tasks like code generation, repair, and translation. Numerous language model-based approaches have been proposed and…

Software Engineering · Computer Science 2024-01-09 Yue Liu, Chakkrit Tantithamthavorn, Yonghui Liu, Li Li

StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

The advancement of large language models (LLMs) has significantly propelled the field of code generation. Previous work integrated reinforcement learning (RL) with compiler feedback for exploring the output space of LLMs to enhance code…

Software Engineering · Computer Science 2024-02-06 Shihan Dou, Yan Liu, Haoxiang Jia, Limao Xiong, Enyu Zhou, Wei Shen, Junjie Shan, Caishuang Huang, Xiao Wang, Xiaoran Fan, Zhiheng Xi, Yuhao Zhou, Tao Ji, Rui Zheng, Qi Zhang, Xuanjing Huang, Tao Gui

Targeted Example Generation for Compilation Errors

We present TEGCER, an automated feedback tool for novice programmers. TEGCER uses supervised classification to match compilation errors in new code submissions with relevant pre-existing errors, submitted by other students before. The dense…

Software Engineering · Computer Science 2019-10-28 Umair Z. Ahmed, Renuka Sindhgatta, Nisheeth Srivastava, Amey Karkare

IntelliCode Compose: Code Generation Using Transformer

In software development through integrated development environments (IDEs), code completion is one of the most widely used features. Nevertheless, the majority of integrated development environments only support completion of methods and APIs,…

Computation and Language · Computer Science 2020-11-02 Alexey Svyatkovskiy, Shao Kun Deng, Shengyu Fu, Neel Sundaresan

LEGO-Compiler: Enhancing Neural Compilation Through Translation Composability

Large language models (LLMs) have the potential to revolutionize how we design and implement compilers and code translation tools. However, existing LLMs struggle to handle long and complex programs. We introduce LEGO-Compiler, a novel…

Programming Languages · Computer Science 2025-05-28 Shuoming Zhang, Jiacheng Zhao, Chunwei Xia, Zheng Wang, Yunji Chen, Xiaobing Feng, Huimin Cui

An Empirical Study of Retrieval-Augmented Code Generation: Challenges and Opportunities

Code generation aims to automatically generate code snippets in a specific programming language according to natural language descriptions. The continuous advancements in deep learning, particularly pre-trained models, have empowered the code…

Software Engineering · Computer Science 2025-01-24 Zezhou Yang, Sirong Chen, Cuiyun Gao, Zhenhao Li, Xing Hu, Kui Liu, Xin Xia

Decaf: Improving Neural Decompilation with Automatic Feedback and Search

Decompilers are useful tools used in reverse engineering to understand compiled source code. Reconstructing source code from compiled binaries is a challenging task, because high-level syntax, identifiers, and custom data types are…

Software Engineering · Computer Science 2026-05-13 Alexander Shypula, Osbert Bastani, Edward Schwartz

CompilerGPT: Leveraging Large Language Models for Analyzing and Acting on Compiler Optimization Reports

Current compiler optimization reports often present complex, technical information that is difficult for programmers to interpret and act upon effectively. This paper assesses the capability of large language models (LLMs) to understand…

Programming Languages · Computer Science 2025-06-16 Peter Pirkelbauer, Chunhua Liao

ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off Code Generation

Code generation plays a crucial role in various tasks, such as code auto-completion and mathematical reasoning. Previous work has proposed numerous methods to enhance code generation performance, including integrating feedback from the…

Computation and Language · Computer Science 2025-05-30 Houxing Ren, Mingjie Zhan, Zhongyuan Wu, Aojun Zhou, Junting Pan, Hongsheng Li

Developing a Modular Compiler for a Subset of a C-like Language

The paper introduces the development of a modular compiler for a subset of a C-like language, which addresses the challenges in constructing a compiler for high-level languages. This modular approach will allow developers to modify a…

Programming Languages · Computer Science 2025-01-10 Debasish Dutta, Neeharika Sonowal, Irani Hazarika

Learning to Make Compiler Optimizations More Effective

Because loops execute their body many times, compiler developers place much emphasis on their optimization. Nevertheless, in view of highly diverse source code and hardware, compilers still struggle to produce optimal target code. The sheer…

Programming Languages · Computer Science 2021-03-01 Rahim Mammadli, Marija Selakovic, Felix Wolf, Michael Pradel

langcc: A Next-Generation Compiler Compiler

Traditionally, parsing has been a laborious and error-prone component of compiler development, and most parsers for full industrial programming languages are still written by hand. The author [Zim22] shows that automatic parser generation…

Programming Languages · Computer Science 2022-09-20 Joe Zimmerman

Automated code generation for discontinuous Galerkin methods

A compiler approach for generating low-level computer code from high-level input for discontinuous Galerkin finite element forms is presented. The input language mirrors conventional mathematical notation, and the compiler generates…

Mathematical Software · Computer Science 2011-04-05 Kristian B. Ølgaard, Anders Logg, Garth N. Wells

Neural Text Generation: A Practical Guide

Deep learning methods have recently achieved great empirical success on machine translation, dialogue response generation, summarization, and other text generation tasks. At a high level, the technique has been to train end-to-end neural…

Computation and Language · Computer Science 2017-11-28 Ziang Xie

Composable and Modular Code Generation in MLIR: A Structured and Retargetable Approach to Tensor Compiler Construction

Despite significant investment in software infrastructure, machine learning systems, runtimes and compilers do not compose properly. We propose a new design aiming at providing unprecedented degrees of modularity, composability and…

Programming Languages · Computer Science 2022-02-08 Nicolas Vasilache, Oleksandr Zinenko, Aart J. C. Bik, Mahesh Ravishankar, Thomas Raoux, Alexander Belyaev, Matthias Springer, Tobias Gysi, Diego Caballero, Stephan Herhut, Stella Laurenzo, Albert Cohen