Related papers: Interactive Code Generation via Test-Driven User-I…

LLM-Based Test-Driven Interactive Code Generation: User Study and Empirical Evaluation

Large language models (LLMs) have shown great potential in automating significant aspects of coding by producing natural code from informal natural language (NL) intent. However, given NL is informal, it does not lend easily to checking…

Software Engineering · Computer Science 2024-10-04 Sarah Fakhoury, Aaditya Naik, Georgios Sakkas, Saikat Chakraborty, Shuvendu K. Lahiri

Towards Machine-Generated Code for the Resolution of User Intentions

The growing capabilities of Artificial Intelligence (AI), particularly Large Language Models (LLMs), prompt a reassessment of the interaction mechanisms between users and their devices. Currently, users are required to use a set of…

Artificial Intelligence · Computer Science 2025-10-10 Justus Flerlage, Ilja Behnke, Odej Kao

IntentCoding: Amplifying User Intent in Code Generation

Large Language Models (LLMs) have shown strong capabilities in code generation, but their adherence to fine-grained user intent with multiple constraints remains a significant challenge. Our empirical analysis reveals two key observations:…

Software Engineering · Computer Science 2026-02-03 Zheng Fang, Yihong Dong, Lili Mou, Dongming Jin, Zhi Jin, Ge Li

Guided Code Generation with LLMs: A Multi-Agent Framework for Complex Code Tasks

Large Language Models (LLMs) have shown remarkable capabilities in code generation tasks, yet they face significant limitations in handling complex, long-context programming challenges and demonstrating complex compositional reasoning…

Artificial Intelligence · Computer Science 2025-01-14 Amr Almorsi, Mohanned Ahmed, Walid Gomaa

UICoder: Finetuning Large Language Models to Generate User Interface Code through Automated Feedback

Large language models (LLMs) struggle to consistently generate UI code that compiles and produces visually relevant designs. Existing approaches to improve generation rely on expensive human feedback or distilling a proprietary model. In…

Computation and Language · Computer Science 2024-06-13 Jason Wu, Eldon Schoop, Alan Leung, Titus Barik, Jeffrey P. Bigham, Jeffrey Nichols

Towards Formal Verification of LLM-Generated Code from Natural Language Prompts

In the past few years LLMs have emerged as a tool that can aid programmers by taking natural language descriptions and generating code based on it. However, the reliability of LLM code generation and current validation techniques for it are…

Programming Languages · Computer Science 2025-11-24 Aaron Councilman, David Jiahao Fu, Aryan Gupta, Chengxiao Wang, David Grove, Yu-Xiong Wang, Vikram Adve

CodeCoR: An LLM-Based Self-Reflective Multi-Agent Framework for Code Generation

Code generation aims to produce code that fulfills requirements written in natural languages automatically. Large Language Models (LLMs) like ChatGPT have demonstrated promising effectiveness in this area. Nonetheless, these LLMs often fail…

Software Engineering · Computer Science 2025-01-15 Ruwei Pan, Hongyu Zhang, Chao Liu

Intent Formalization: A Grand Challenge for Reliable Coding in the Age of AI Agents

Agentic AI systems can now generate code with remarkable fluency, but a fundamental question remains: does the generated code actually do what the user intended? The gap between informal natural language requirements and precise…

Software Engineering · Computer Science 2026-03-19 Shuvendu K. Lahiri

Personality-Guided Code Generation Using Large Language Models

Code generation, the automatic creation of source code from natural language descriptions, has garnered significant attention due to its potential to streamline software development. Inspired by research that links task-personality…

Software Engineering · Computer Science 2025-05-30 Yaoqi Guo, Zhenpeng Chen, Jie M. Zhang, Yang Liu, Yun Ma

Dafny as Verification-Aware Intermediate Language for Code Generation

Using large language models (LLMs) to generate source code from natural language prompts is a popular and promising idea with a wide range of applications. One of its limitations is that the generated code can be faulty at times, often in a…

Software Engineering · Computer Science 2025-01-14 Yue Chen Li, Stefan Zetzsche, Siva Somayyajula

Self-Edit: Fault-Aware Code Editor for Code Generation

Large language models (LLMs) have demonstrated an impressive ability to generate codes on competitive programming tasks. However, with limited sample numbers, LLMs still suffer from poor accuracy. Inspired by the process of human…

Software Engineering · Computer Science 2023-09-12 Kechi Zhang, Zhuo Li, Jia Li, Ge Li, Zhi Jin

Enhancing Code Generation via Bidirectional Comment-Level Mutual Grounding

Large Language Models (LLMs) have demonstrated unprecedented capability in code generation. However, LLM-generated code is still plagued with a wide range of functional errors, especially for complex programming tasks that LLMs have not…

Software Engineering · Computer Science 2025-05-13 Yifeng Di, Tianyi Zhang

Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents

Recent advancements on Large Language Models (LLMs) enable AI Agents to automatically generate and execute multi-step plans to solve complex tasks. However, since LLM's content generation process is hardly controllable, current LLM-based…

Machine Learning · Computer Science 2024-08-13 Zelong Li, Wenyue Hua, Hao Wang, He Zhu, Yongfeng Zhang

mHumanEval -- A Multilingual Benchmark to Evaluate Large Language Models for Code Generation

Recent advancements in large language models (LLMs) have significantly enhanced code generation from natural language prompts. The HumanEval Benchmark, developed by OpenAI, remains the most widely used code generation benchmark. However,…

Computation and Language · Computer Science 2025-05-19 Nishat Raihan, Antonios Anastasopoulos, Marcos Zampieri

Test-Driven Development for Code Generation

Recent Large Language Models (LLMs) have demonstrated significant capabilities in generating code snippets directly from problem statements. This increasingly automated process mirrors traditional human-led software development, where code…

Software Engineering · Computer Science 2024-10-23 Noble Saji Mathews, Meiyappan Nagappan

Improving Code Generation by Training with Natural Language Feedback

The potential for pre-trained large language models (LLMs) to use natural language feedback at inference time has been an exciting recent development. We build upon this observation by formalizing an algorithm for learning from natural…

Software Engineering · Computer Science 2024-02-26 Angelica Chen, Jérémy Scheurer, Tomasz Korbak, Jon Ander Campos, Jun Shern Chan, Samuel R. Bowman, Kyunghyun Cho, Ethan Perez

Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering

Code generation problems differ from common natural language problems - they require matching the exact syntax of the target language, identifying happy paths and edge cases, paying attention to numerous small details in the problem spec,…

Machine Learning · Computer Science 2024-01-17 Tal Ridnik, Dedy Kredo, Itamar Friedman

AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation

The advancement of natural language processing (NLP) has been significantly boosted by the development of transformer-based large language models (LLMs). These models have revolutionized NLP tasks, particularly in code generation, aiding…

Computation and Language · Computer Science 2024-05-27 Dong Huang, Jie M. Zhang, Michael Luck, Qingwen Bu, Yuhao Qing, Heming Cui

Self-planning Code Generation with Large Language Models

Although large language models (LLMs) have demonstrated impressive ability in code generation, they are still struggling to address the complicated intent provided by humans. It is widely acknowledged that humans typically employ planning…

Software Engineering · Computer Science 2025-10-21 Xue Jiang, Yihong Dong, Lecheng Wang, Zheng Fang, Qiwei Shang, Ge Li, Zhi Jin, Wenpin Jiao

WaitGPT: Monitoring and Steering Conversational LLM Agent in Data Analysis with On-the-Fly Code Visualization

Large language models (LLMs) support data analysis through conversational user interfaces, as exemplified in OpenAI's ChatGPT (formally known as Advanced Data Analysis or Code Interpreter). Essentially, LLMs produce code for accomplishing…

Human-Computer Interaction · Computer Science 2024-08-06 Liwenhan Xie, Chengbo Zheng, Haijun Xia, Huamin Qu, Chen Zhu-Tian