Related papers: Hygienic Source-Code Generation Using Functors

HexaCoder: Secure Code Generation via Oracle-Guided Synthetic Training Data

Large language models (LLMs) have shown great potential for automatic code generation and form the basis for various tools such as GitHub Copilot. However, recent studies highlight that much LLM-generated code contains serious security…

Cryptography and Security · Computer Science 2024-09-11 Hossein Hajipour, Lea Schönherr, Thorsten Holz, Mario Fritz

Seeker: Towards Exception Safety Code Generation with Intermediate Language Agents Framework

In real-world software development, improper or missing exception handling can severely impact the robustness and reliability of code. Exception handling mechanisms require developers to detect, capture, and manage exceptions according to…

Computation and Language · Computer Science 2024-12-17 Xuanming Zhang, Yuxuan Chen, Yiming Zheng, Zhexin Zhang, Yuan Yuan, Minlie Huang

Extensible type checker for parser generation

Parser generators generate translators from language specifications. In many cases, such specifications contain semantic actions written in the same language as the generated code. Since these actions are subject to little static checking,…

Programming Languages · Computer Science 2010-02-09 Andrey Breslav

DocCGen: Document-based Controlled Code Generation

Recent developments show that Large Language Models (LLMs) produce state-of-the-art performance on natural language (NL) to code generation for resource-rich general-purpose languages like C++, Java, and Python. However, their practical…

Software Engineering · Computer Science 2024-07-04 Sameer Pimparkhede, Mehant Kammakomati, Srikanth Tamilselvam, Prince Kumar, Ashok Pon Kumar, Pushpak Bhattacharyya

Clover: Closed-Loop Verifiable Code Generation

The use of large language models for code generation is a rapidly growing trend in software development. However, without effective methods for ensuring the correctness of generated code, this trend could lead to undesirable outcomes. In…

Artificial Intelligence · Computer Science 2024-11-19 Chuyue Sun, Ying Sheng, Oded Padon, Clark Barrett

Rethinking the Evaluation of Secure Code Generation

Large language models (LLMs) are widely used in software development. However, the code generated by LLMs often contains vulnerabilities. Several secure code generation methods have been proposed to address this issue, but their current…

Cryptography and Security · Computer Science 2025-11-14 Shih-Chieh Dai, Jun Xu, Guanhong Tao

Type-Constrained Code Generation with Language Models

Large language models (LLMs) have achieved notable success in code generation. However, they still frequently produce uncompilable output because their next-token inference procedure does not model formal aspects of code. Although…

Machine Learning · Computer Science 2025-05-09 Niels Mündler, Jingxuan He, Hao Wang, Koushik Sen, Dawn Song, Martin Vechev

RA-Gen: A Controllable Code Generation Framework Using ReAct for Multi-Agent Task Execution

Code generation models based on large language models (LLMs) have gained wide adoption, but challenges remain in ensuring safety, accuracy, and controllability, especially for complex tasks. Existing methods often lack dynamic integration…

Software Engineering · Computer Science 2025-10-13 Aofan Liu, Haoxuan Li, Bin Wang, Ao Yang, Hui Li

Occasionally Secure: A Comparative Analysis of Code Generation Assistants

Large Language Models (LLMs) are being increasingly utilized in various applications, with code generation being a notable example. While previous research has shown that LLMs have the capability to generate both secure and insecure…

Cryptography and Security · Computer Science 2025-09-30 Ran Elgedawy, Porter Dosch, John Sadik, Senjuti Dutta, Anuj Gautam, Konstantinos Georgiou, Farzin Gholamrezae, Fujiao Ji, Kyungchan Lim, Qian Liu, Scott Ruoti

Coqlex: Generating Formally Verified Lexers

A compiler consists of a sequence of phases going from lexical analysis to code generation. Ideally, the formal verification of a compiler should include the formal verification of each component of the tool-chain. An example is the…

Programming Languages · Computer Science 2023-06-22 Wendlasida Ouedraogo, Gabriel Scherer, Lutz Strassburger

A^3-CodGen: A Repository-Level Code Generation Framework for Code Reuse with Local-Aware, Global-Aware, and Third-Party-Library-Aware

LLM-based code generation tools are essential to help developers in the software development process. Existing tools are often disconnected from the working context, i.e., the code repository, causing the generated code to be dissimilar to human…

Software Engineering · Computer Science 2024-10-29 Dianshu Liao, Shidong Pan, Xiaoyu Sun, Xiaoxue Ren, Qing Huang, Zhenchang Xing, Huan Jin, Qinying Li

Guided Code Generation with LLMs: A Multi-Agent Framework for Complex Code Tasks

Large Language Models (LLMs) have shown remarkable capabilities in code generation tasks, yet they face significant limitations in handling complex, long-context programming challenges and demonstrating complex compositional reasoning…

Artificial Intelligence · Computer Science 2025-01-14 Amr Almorsi, Mohanned Ahmed, Walid Gomaa

Combining LLM Code Generation with Formal Specifications and Reactive Program Synthesis

In the past few years, Large Language Models (LLMs) have exploded in usefulness and popularity for code generation tasks. However, LLMs still struggle with accuracy and are unsuitable for high-risk applications without additional oversight…

Software Engineering · Computer Science 2024-10-29 William Murphy, Nikolaus Holzer, Feitong Qiao, Leyi Cui, Raven Rothkopf, Nathan Koenig, Mark Santolucito

Fixing Function-Level Code Generation Errors for Foundation Large Language Models

Function-level code generation leverages foundation Large Language Models (LLMs) to automatically produce source code with expected functionality. It has been widely investigated and applied in intelligent programming assistants, such as…

Software Engineering · Computer Science 2025-01-22 Hao Wen, Yueheng Zhu, Chao Liu, Xiaoxue Ren, Weiwei Du, Meng Yan

Improving the Readability of Automatically Generated Tests using Large Language Models

Search-based test generators are effective at producing unit tests with high coverage. However, such automatically generated tests have no meaningful test and variable names, making them hard for developers to understand and interpret. On…

Software Engineering · Computer Science 2025-06-12 Matteo Biagiola, Gianluca Ghislotti, Paolo Tonella

The Code2Text Challenge: Text Generation in Source Code Libraries

We propose a new shared task for tactical data-to-text generation in the domain of source code libraries. Specifically, we focus on text generation of function descriptions from example software projects. Data is drawn from existing…

Computation and Language · Computer Science 2018-07-12 Kyle Richardson, Sina Zarrieß, Jonas Kuhn

Secure Code Generation via Online Reinforcement Learning with Vulnerability Reward Model

Large language models (LLMs) are increasingly used in software development, yet their tendency to generate insecure code remains a major barrier to real-world deployment. Existing secure code alignment methods often suffer from a…

Cryptography and Security · Computer Science 2026-02-10 Tianyi Wu, Mingzhe Du, Yue Liu, Chengran Yang, Terry Yue Zhuo, Jiaheng Zhang, See-Kiong Ng

Practical LR Parser Generation

Parsing is a fundamental building block in modern compilers, and for industrial programming languages, it is a surprisingly involved task. There are known approaches to generate parsers automatically, but the prevailing consensus is that…

Formal Languages and Automata Theory · Computer Science 2022-09-20 Joe Zimmerman

Hallucination by Code Generation LLMs: Taxonomy, Benchmarks, Mitigation, and Challenges

Recent technical breakthroughs in large language models (LLMs) have enabled them to fluently generate source code. Software developers often leverage both general-purpose and code-specialized LLMs to revise existing code or even generate a…

Software Engineering · Computer Science 2025-05-14 Yunseo Lee, John Youngeun Song, Dongsun Kim, Jindae Kim, Mijung Kim, Jaechang Nam

SALLM: Security Assessment of Generated Code

With the growing popularity of Large Language Models (LLMs) in software engineers' daily practices, it is important to ensure that the code generated by these tools is not only functionally correct but also free of vulnerabilities. Although…

Software Engineering · Computer Science 2024-09-06 Mohammed Latif Siddiq, Joanna C. S. Santos, Sajith Devareddy, Anna Muller