Related papers: Multi-Turn Code Generation Through Single-Step Rew…

200 papers

While large language models have proven effective in a huge range of downstream applications, they often generate text that is problematic or lacks a desired attribute. In this paper, we introduce Reward-Augmented Decoding (RAD), a text…

Computation and Language · Computer Science 2024-01-03 Haikang Deng, Colin Raffel

Code generation is a core application of large language models (LLMs), yet LLMs still frequently fail on complex programming tasks. Given its success in mathematical reasoning, test-time scaling approaches such as Process Reward Model…

Machine Learning · Computer Science 2026-02-02 Ruiyi Zhang, Peijia Qin, Qi Cao, Eric Xue, Pengtao Xie

Self-correction has demonstrated potential in code generation by allowing language models to revise and improve their outputs through successive refinement. Recent studies have explored prompting-based strategies that incorporate…

Computation and Language · Computer Science 2025-08-26 Jeonghun Cho, Deokhyung Kang, Hyounghun Kim, Gary Geunbae Lee

Tree search-based methods have made significant progress in enhancing the code generation capabilities of large language models. However, due to the difficulty in effectively evaluating intermediate algorithmic steps and the inability to…

Artificial Intelligence · Computer Science 2025-12-18 Yuanyuan Lin, Xiangyu Ouyang, Teng Zhang, Kaixin Sui

Reinforcement learning (RL) with unit test feedback has enhanced large language models' (LLMs) code generation, but relies on sparse rewards provided only after complete code evaluation, limiting learning efficiency and incremental…

Artificial Intelligence · Computer Science 2025-02-05 Ning Dai, Zheng Wu, Renjie Zheng, Ziyun Wei, Wenlei Shi, Xing Jin, Guanlin Liu, Chen Dun, Liang Huang, Lin Yan

Tool use enables large language models to solve complex tasks through sequences of API calls, yet existing reinforcement learning approaches fail to scale to multi-step composition settings. Outcome-based rewards provide only sparse…

Machine Learning · Computer Science 2026-05-19 Anay Kulkarni, ChiaEn Lu, Dheeraj Mekala, Jayanth Srinivasa, Gaowen Liu, Jingbo Shang

Reinforcement Learning with Verifiable Rewards (RLVR) has become a standard recipe for post-training LLMs on reasoning tasks, with Group Relative Policy Optimization (GRPO) emerging as a leading approach. However, GRPO and its variants are…

Machine Learning · Computer Science 2026-05-12 Chanakya Ekbote, Vijay Lingam, Sujay Sanghavi, Jun Huan, Behrooz Omidvar-Tehrani, Anoop Deoras, Stefano Soatto

Existing reinforcement learning strategies based on outcome supervision have proven effective in enhancing the performance of large language models (LLMs) for code generation. While reinforcement learning based on process supervision has…

Software Engineering · Computer Science 2025-02-05 Yufan Ye, Ting Zhang, Wenbin Jiang, Hua Huang

Large Language Models (LLMs) demonstrate strong capabilities in general coding tasks but encounter two key challenges when optimizing code: (i) the complexity of writing optimized code (such as performant CUDA kernels and competition-level…

Machine Learning · Computer Science 2026-01-12 Jiefu Ou, Sapana Chaudhary, Kaj Bostrom, Nathaniel Weir, Shuai Zhang, Huzefa Rangwala, George Karypis

Large Language Models (LLMs) have shown significant potential in designing reward functions for Reinforcement Learning (RL) tasks. However, obtaining high-quality reward code often involves human intervention, numerous LLM queries, or…

Machine Learning · Computer Science 2024-10-21 Shengjie Sun, Runze Liu, Jiafei Lyu, Jing-Wen Yang, Liangpeng Zhang, Xiu Li

Recent advancements in language modeling have enabled the translation of natural language into code, and the use of execution feedback to improve code generation. However, these methods often rely heavily on pre-existing test cases, which…

Software Engineering · Computer Science 2024-12-19 Nan Wang, Yafei Liu, Chen Chen, Haonan Lu

Large language models (LLMs) have demonstrated strong code generation capabilities, yet the runtime performance of generated code is not guaranteed, and there have been few attempts to train LLMs using runtime performance as a reward in the…

Machine Learning · Computer Science 2026-02-13 Ryo Mikasa, Shun-ichiro Hayashi, Daichi Mukunoki, Tetsuya Hoshino, Takahiro Katagiri

Generating high-quality code that solves complex programming tasks is challenging, especially with current decoder-based models that produce highly stochastic outputs. In code generation, even minor errors can easily break the entire…

Computation and Language · Computer Science 2025-04-15 Nikita Sorokin, Ivan Sedykh, Valentin Malykh

In this work, we introduce CAD-Coder, a novel framework that reformulates text-to-CAD as the generation of CadQuery scripts - a Python-based, parametric CAD language. This representation enables direct geometric validation, a richer…

Graphics · Computer Science 2026-05-15 Yandong Guan, Xilin Wang, Ximing Xing, Jing Zhang, Dong Xu, Qian Yu

Recent years have seen considerable advancements in multi-step reasoning with Large Language Models (LLMs). Previous studies have elucidated the merits of integrating feedback or search mechanisms during model inference to improve the…

Computation and Language · Computer Science 2023-10-17 Qianli Ma, Haotian Zhou, Tingkai Liu, Jianbo Yuan, Pengfei Liu, Yang You, Hongxia Yang

While code large language models have demonstrated remarkable progress in code generation, the generated code often exhibits poor runtime efficiency, limiting its practical application in performance-sensitive scenarios. To address this…

Software Engineering · Computer Science 2025-08-29 Yunlong Feng, Yang Xu, Xiao Xu, Binyuan Hui, Junyang Lin

Code generation plays a crucial role in various tasks, such as code auto-completion and mathematical reasoning. Previous work has proposed numerous methods to enhance code generation performance, including integrating feedback from the…

Computation and Language · Computer Science 2025-05-30 Houxing Ren, Mingjie Zhan, Zhongyuan Wu, Aojun Zhou, Junting Pan, Hongsheng Li

In models to generate program source code from natural language, representing this code in a tree structure has been a common approach. However, existing methods often fail to generate complex code correctly due to a lack of ability to…

Computation and Language · Computer Science 2018-08-31 Shirley Anugrah Hayati, Raphael Olivier, Pravalika Avvaru, Pengcheng Yin, Anthony Tomasic, Graham Neubig

Tool invocation significantly enhances the capabilities of Large Language Models (LLMs), yet challenges persist, particularly in complex task scenarios. Current methods, such as instruction-enhanced reasoning and supervised fine-tuning,…

Artificial Intelligence · Computer Science 2025-08-07 Yifei Lu, Fanghua Ye, Jian Li, Qiang Gao, Cheng Liu, Haibo Luo, Nan Du, Xiaolong Li, Feiliang Ren

Consider a transmission scheme with a single transmitter and multiple receivers over a faulty broadcast channel. For each receiver, the transmitter has a unique infinite stream of packets, and its goal is to deliver them at the highest…

Information Theory · Computer Science 2015-10-27 Mark Shifrin, Asaf Cohen, Omer Gurewitz, Olga Weisman