Related papers: Compilable Neural Code Generation with Compiler Fe…
The size and complexity of software applications is increasing at an accelerating pace. Source code repositories (along with their dependencies) require vast amounts of labor to keep them tested, maintained, and up to date. As the…
Large Language Models (LLMs) have shown remarkable progress in automated code generation. Yet, LLM-generated code may contain errors in API usage, class, data structure, or missing project-specific information. As much of this…
Code generation aims to produce code that fulfills requirements written in natural languages automatically. Large language Models (LLMs) like ChatGPT have demonstrated promising effectiveness in this area. Nonetheless, these LLMs often fail…
Large language models (LLMs) have become increasingly prominent in academia and industry due to their remarkable performance in diverse applications. As these models evolve with increasing parameters, they excel in tasks like sentiment…
We introduce a novel paradigm in compiler optimization powered by Large Language Models with compiler feedback to optimize the code size of LLVM assembly. The model takes unoptimized LLVM IR as input and produces optimized IR, the best…
Recent studies have adopted pre-trained language models, such as CodeT5 and CodeGPT, for automated program generation tasks like code generation, repair, and translation. Numerous language model-based approaches have been proposed and…
The advancement of large language models (LLMs) has significantly propelled the field of code generation. Previous work integrated reinforcement learning (RL) with compiler feedback for exploring the output space of LLMs to enhance code…
We present TEGCER, an automated feedback tool for novice programmers. TEGCER uses supervised classification to match compilation errors in new code submissions with relevant pre-existing errors, submitted by other students before. The dense…
In software development through integrated development environments (IDEs), code completion is one of the most widely used features. Nevertheless, majority of integrated development environments only support completion of methods and APIs,…
Large language models (LLMs) have the potential to revolutionize how we design and implement compilers and code translation tools. However, existing LLMs struggle to handle long and complex programs. We introduce LEGO-Compiler, a novel…
Code generation aims to automatically generate code snippets of specific programming language according to natural language descriptions. The continuous advancements in deep learning, particularly pre-trained models, have empowered the code…
Decompilers are useful tools used in reverse engineering to understand compiled source code. Reconstructing source code from compiled binaries is a challenging task, because high-level syntax, identifiers, and custom data types are…
Current compiler optimization reports often present complex, technical information that is difficult for programmers to interpret and act upon effectively. This paper assesses the capability of large language models (LLM) to understand…
Code generation plays a crucial role in various tasks, such as code auto-completion and mathematical reasoning. Previous work has proposed numerous methods to enhance code generation performance, including integrating feedback from the…
The paper introduces the development of a modular compiler for a subset of a C-like language, which addresses the challenges in constructing a compiler for high-level languages. This modular approach will allow developers to modify a…
Because loops execute their body many times, compiler developers place much emphasis on their optimization. Nevertheless, in view of highly diverse source code and hardware, compilers still struggle to produce optimal target code. The sheer…
Traditionally, parsing has been a laborious and error-prone component of compiler development, and most parsers for full industrial programming languages are still written by hand. The author [Zim22] shows that automatic parser generation…
A compiler approach for generating low-level computer code from high-level input for discontinuous Galerkin finite element forms is presented. The input language mirrors conventional mathematical notation, and the compiler generates…
Deep learning methods have recently achieved great empirical success on machine translation, dialogue response generation, summarization, and other text generation tasks. At a high level, the technique has been to train end-to-end neural…
Despite significant investment in software infrastructure, machine learning systems, runtimes and compilers do not compose properly. We propose a new design aiming at providing unprecedented degrees of modularity, composability and…