Related papers: CodeRL: Mastering Code Generation through Pretrain…

Automatic Unit Test Data Generation and Actor-Critic Reinforcement Learning for Code Synthesis

The advent of large pre-trained language models in the domain of Code Synthesis has shown remarkable performance on various benchmarks, treating the problem of Code Generation in a fashion similar to Natural Language Generation, trained…

Machine Learning · Computer Science 2023-10-23 Philip John Gorinski, Matthieu Zimmer, Gerasimos Lampouras, Derrick Goh Xin Deik, Ignacio Iacobacci

Improving LLM Code Generation via Requirement-Aware Curriculum Reinforcement Learning

Code generation, which aims to automatically generate source code from given programming requirements, has the potential to substantially improve software development efficiency. With the rapid advancement of large language models (LLMs),…

Software Engineering · Computer Science 2026-05-04 Shouyu Yin, Zhao Tian, Junjie Chen, Shikai Guo

Learning to Synthesize Programs as Interpretable and Generalizable Policies

Recently, deep reinforcement learning (DRL) methods have achieved impressive performance on tasks in a variety of domains. However, neural network policies produced with DRL methods are not human-interpretable and often have difficulty…

Machine Learning · Computer Science 2022-02-02 Dweep Trivedi, Jesse Zhang, Shao-Hua Sun, Joseph J. Lim

CodeRL+: Improving Code Generation via Reinforcement with Execution Semantics Alignment

While Large Language Models (LLMs) excel at code generation by learning from vast code corpora, a fundamental semantic gap remains between their training on textual patterns and the goal of functional correctness, which is governed by…

Software Engineering · Computer Science 2026-04-23 Xue Jiang, Yihong Dong, Mengyang Liu, Hongyi Deng, Tian Wang, Yongding Tao, Rongyu Cao, Binhua Li, Zhi Jin, Wenpin Jiao, Fei Huang, Yongbin Li, Ge Li

Synchromesh: Reliable code generation from pre-trained language models

Large pre-trained language models have been used to generate code, providing a flexible interface for synthesizing programs from natural language specifications. However, they often violate syntactic and semantic rules of their output…

Machine Learning · Computer Science 2022-01-28 Gabriel Poesia, Oleksandr Polozov, Vu Le, Ashish Tiwari, Gustavo Soares, Christopher Meek, Sumit Gulwani

Function-constrained Program Synthesis

This work introduces (1) a technique that allows large language models (LLMs) to leverage user-provided code when solving programming tasks and (2) a method to iteratively generate modular sub-functions that can aid future code generation…

Machine Learning · Computer Science 2023-12-05 Patrick Hajali, Ignas Budvytis

Leveraging Grammar and Reinforcement Learning for Neural Program Synthesis

Program synthesis is the task of automatically generating a program consistent with a specification. Recent years have seen proposal of a number of neural approaches for program synthesis, many of which adopt a sequence generation paradigm…

Machine Learning · Computer Science 2018-05-23 Rudy Bunel, Matthew Hausknecht, Jacob Devlin, Rishabh Singh, Pushmeet Kohli

Type-Constrained Code Generation with Language Models

Large language models (LLMs) have achieved notable success in code generation. However, they still frequently produce uncompilable output because their next-token inference procedure does not model formal aspects of code. Although…

Machine Learning · Computer Science 2025-05-09 Niels Mündler, Jingxuan He, Hao Wang, Koushik Sen, Dawn Song, Martin Vechev

Critique-Coder: Enhancing Coder Models by Critique Reinforcement Learning

Reinforcement Learning (RL) has emerged as a popular training paradigm, particularly when paired with reasoning models. While effective, it primarily focuses on generating responses and lacks mechanisms to explicitly foster critique or…

Computation and Language · Computer Science 2026-03-13 Chi Ruan, Dongfu Jiang, Yubo Wang, Wenhu Chen

Program Synthesis Through Reinforcement Learning Guided Tree Search

Program Synthesis is the task of generating a program from a provided specification. Traditionally, this has been treated as a search problem by the programming languages (PL) community and more recently as a supervised learning problem by…

Artificial Intelligence · Computer Science 2018-06-11 Riley Simmons-Edler, Anders Miltner, Sebastian Seung

Learning to Generate Unit Test via Adversarial Reinforcement Learning

Unit testing is a core practice in programming, enabling systematic evaluation of programs produced by human developers or large language models (LLMs). Given the challenges in writing comprehensive unit tests, LLMs have been employed to…

Software Engineering · Computer Science 2026-03-17 Dongjun Lee, Changho Hwang, Kimin Lee

VERIRL: Boosting the LLM-based Verilog Code Generation via Reinforcement Learning

Recent advancements in code generation have shown remarkable success across software domains, yet hardware description languages (HDLs) such as Verilog remain underexplored due to their concurrency semantics, syntactic rigidity, and…

Machine Learning · Computer Science 2025-08-27 Fu Teng, Miao Pan, Xuhong Zhang, Zhezhi He, Yiyao Yang, Xinyi Chai, Mengnan Qi, Liqiang Lu, Jianwei Yin

Increasing LLM Coding Capabilities through Diverse Synthetic Coding Tasks

Large language models (LLMs) have shown impressive promise in code generation, yet their progress remains limited by the shortage of large-scale datasets that are both diverse and well-aligned with human reasoning. Most existing resources…

Machine Learning · Computer Science 2025-10-28 Amal Abed, Ivan Lukic, Jörg K. H. Franke, Frank Hutter

Training Language Models to Generate Quality Code with Program Analysis Feedback

Code generation with large language models (LLMs), often termed vibe coding, is increasingly adopted in production but fails to ensure code quality, particularly in security (e.g., SQL injection vulnerabilities) and maintainability (e.g.,…

Computation and Language · Computer Science 2025-05-30 Feng Yao, Zilong Wang, Liyuan Liu, Junxia Cui, Li Zhong, Xiaohan Fu, Haohui Mai, Vish Krishnan, Jianfeng Gao, Jingbo Shang

Learning to Generate Better Than Your LLM

Reinforcement learning (RL) has emerged as a powerful paradigm for fine-tuning Large Language Models (LLMs) for text generation. In particular, recent LLMs such as ChatGPT and GPT-4 can engage in fluent conversations with users after…

Machine Learning · Computer Science 2023-11-14 Jonathan D. Chang, Kiante Brantley, Rajkumar Ramamurthy, Dipendra Misra, Wen Sun

Enhancing Code LLMs with Reinforcement Learning in Code Generation: A Survey

With the rapid evolution of large language models (LLM), reinforcement learning (RL) has emerged as a pivotal technique for code generation and optimization in various domains. This paper presents a systematic survey of the application of…

Software Engineering · Computer Science 2025-08-08 Junqiao Wang, Zeng Zhang, Yangfan He, Zihao Zhang, Xinyuan Song, Yuyang Song, Tianyu Shi, Yuchen Li, Hengyuan Xu, Kunyu Wu, Xin Yi, Zhongwei Wan, Xinhang Yuan, Zijun Wang, Kuan Lu, Menghao Huo, Tang Jingqun, Guangwu Qian, Keqin Li, Qiuwu Chen, Lewei He

Teaching Language Models to Critique via Reinforcement Learning

Teaching large language models (LLMs) to critique and refine their outputs is crucial for building systems that can iteratively improve, yet it is fundamentally limited by the ability to provide accurate judgments and actionable…

Machine Learning · Computer Science 2025-12-02 Zhihui Xie, Jie Chen, Liyu Chen, Weichao Mao, Jingjing Xu, Lingpeng Kong

Leveraging Reward Models for Guiding Code Review Comment Generation

Code review is a crucial component of modern software development, involving the evaluation of code quality, providing feedback on potential issues, and refining the code to address identified problems. Despite these benefits, code review…

Software Engineering · Computer Science 2025-06-06 Oussama Ben Sghaier, Rosalia Tufano, Gabriele Bavota, Houari Sahraoui

Code Review Without Borders: Evaluating Synthetic vs. Real Data for Review Recommendation

Automating the decision of whether a code change requires manual review is vital for maintaining software quality in modern development workflows. However, the emergence of new programming languages and frameworks creates a critical…

Software Engineering · Computer Science 2025-09-08 Yogev Cohen, Dudi Ohayon, Romy Somkin, Yehudit Aperstein, Alexander Apartsin

StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

The advancement of large language models (LLMs) has significantly propelled the field of code generation. Previous work integrated reinforcement learning (RL) with compiler feedback for exploring the output space of LLMs to enhance code…

Software Engineering · Computer Science 2024-02-06 Shihan Dou, Yan Liu, Haoxiang Jia, Limao Xiong, Enyu Zhou, Wei Shen, Junjie Shan, Caishuang Huang, Xiao Wang, Xiaoran Fan, Zhiheng Xi, Yuhao Zhou, Tao Ji, Rui Zheng, Qi Zhang, Xuanjing Huang, Tao Gui