Related papers: SecCodePRM: A Process Reward Model for Code Securi…

Secure Code Generation via Online Reinforcement Learning with Vulnerability Reward Model

Large language models (LLMs) are increasingly used in software development, yet their tendency to generate insecure code remains a major barrier to real-world deployment. Existing secure code alignment methods often suffer from a…

Cryptography and Security · Computer Science 2026-02-10 Tianyi Wu, Mingzhe Du, Yue Liu, Chengran Yang, Terry Yue Zhuo, Jiaheng Zhang, See-Kiong Ng

FunPRM: Function-as-Step Process Reward Model with Meta Reward Correction for Code Generation

Code generation is a core application of large language models (LLMs), yet LLMs still frequently fail on complex programming tasks. Given its success in mathematical reasoning, test-time scaling approaches such as Process Reward Model…

Machine Learning · Computer Science 2026-02-02 Ruiyi Zhang, Peijia Qin, Qi Cao, Eric Xue, Pengtao Xie

From Solitary Directives to Interactive Encouragement! LLM Secure Code Generation by Natural Language Prompting

Large Language Models (LLMs) have shown remarkable potential in code generation, making them increasingly important in the field. However, the security issues of generated code have not been fully addressed, and the usability of LLMs in…

Cryptography and Security · Computer Science 2024-10-21 Shigang Liu, Bushra Sabir, Seung Ick Jang, Yuval Kansal, Yansong Gao, Kristen Moore, Alsharif Abuadbba, Surya Nepal

Investigating Large Language Models for Code Vulnerability Detection: An Experimental Study

Code vulnerability detection (CVD) is essential for addressing and preventing system security issues, playing a crucial role in ensuring software security. Previous learning-based vulnerability detection methods rely on either fine-tuning…

Computation and Language · Computer Science 2025-01-07 Xuefeng Jiang, Lvhua Wu, Sheng Sun, Jia Li, Jingjing Xue, Yuwei Wang, Tingting Wu, Min Liu

Learning to Generate Secure Code via Token-Level Rewards

Large language models (LLMs) have demonstrated strong capabilities in code generation, yet they remain prone to producing security vulnerabilities. Existing approaches commonly suffer from two key limitations: the scarcity of high-quality…

Cryptography and Security · Computer Science 2026-03-02 Jiazheng Quan, Xiaodong Li, Bin Wang, Guo An, Like Liu, Degen Huang, Lin Liu, Chengbin Hou

GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning

Recent advancements in Large Language Models (LLMs) have shown that it is promising to utilize Process Reward Models (PRMs) as verifiers to enhance the performance of LLMs. However, current PRMs face three key challenges: (1) limited…

Computation and Language · Computer Science 2025-04-08 Jian Zhao, Runze Liu, Kaiyan Zhang, Zhimu Zhou, Junqi Gao, Dong Li, Jiafei Lyu, Zhouyi Qian, Biqing Qi, Xiu Li, Bowen Zhou

Process Supervision-Guided Policy Optimization for Code Generation

Reinforcement learning (RL) with unit test feedback has enhanced large language models' (LLMs) code generation, but relies on sparse rewards provided only after complete code evaluation, limiting learning efficiency and incremental…

Artificial Intelligence · Computer Science 2025-02-05 Ning Dai, Zheng Wu, Renjie Zheng, Ziyun Wei, Wenlei Shi, Xing Jin, Guanlin Liu, Chen Dun, Liang Huang, Lin Yan

SecureCodeRL: Security-Aware Reinforcement Learning for Code Generation with Partial-Credit Rewards

Large Language Models (LLMs) can generate plausible code, but in settings that require exact stdin/stdout behavior they frequently produce programs that compile yet fail tests, and in some cases they introduce security-sensitive patterns.…

Cryptography and Security · Computer Science 2026-01-06 Suryansh Singh Sijwali, Suman Saha

DreamPRM-Code: Function-as-Step Process Reward Model with Label Correction for LLM Coding

Process Reward Models (PRMs) have become essential for improving Large Language Models (LLMs) via test-time scaling, yet their effectiveness in coding remains limited due to the lack of meaningful step decompositions in code and the noise…

Machine Learning · Computer Science 2025-12-18 Ruiyi Zhang, Peijia Qin, Qi Cao, Pengtao Xie

Enhancing Large Language Models for Secure Code Generation: A Dataset-driven Study on Vulnerability Mitigation

Large language models (LLMs) have brought significant advancements to code generation, benefiting both novice and experienced developers. However, their training using unsanitized data from open-source repositories, like GitHub, introduces…

Software Engineering · Computer Science 2023-10-26 Jiexin Wang, Liuwen Cao, Xitong Luo, Zhiping Zhou, Jiayuan Xie, Adam Jatowt, Yi Cai

Large Language Models Versus Static Code Analysis Tools: A Systematic Benchmark for Vulnerability Detection

Modern software relies on a multitude of automated testing and quality assurance tools to prevent errors, bugs and potential vulnerabilities. This study sets out to provide a head-to-head, quantitative and qualitative evaluation of six…

Software Engineering · Computer Science 2025-08-07 Damian Gnieciak, Tomasz Szandala

SecCoder: Towards Generalizable and Robust Secure Code Generation

After large models (LMs) have gained widespread acceptance in code-related tasks, their superior generative capacity has greatly promoted the application of the code LM. Nevertheless, the security of the generated code has raised attention…

Programming Languages · Computer Science 2024-10-03 Boyu Zhang, Tianyu Du, Junkai Tong, Xuhong Zhang, Kingsum Chow, Sheng Cheng, Xun Wang, Jianwei Yin

Let's reward step by step: Step-Level reward model as the Navigators for Reasoning

Recent years have seen considerable advancements in multi-step reasoning with Large Language Models (LLMs). The previous studies have elucidated the merits of integrating feedback or search mechanisms during model inference to improve the…

Computation and Language · Computer Science 2023-10-17 Qianli Ma, Haotian Zhou, Tingkai Liu, Jianbo Yuan, Pengfei Liu, Yang You, Hongxia Yang

SecCodeBench-V2 Technical Report

We introduce SecCodeBench-V2, a publicly released benchmark for evaluating Large Language Model (LLM) copilots' capabilities of generating secure code. SecCodeBench-V2 comprises 98 generation and fix scenarios derived from Alibaba Group's…

Cryptography and Security · Computer Science 2026-02-19 Longfei Chen, Ji Zhao, Lanxiao Cui, Tong Su, Xingbo Pan, Ziyang Li, Yongxing Wu, Qijiang Cao, Qiyao Cai, Jing Zhang, Yuandong Ni, Junyao He, Zeyu Zhang, Chao Ge, Xuhuai Lu, Zeyu Gao, Yuxin Cui, Weisen Chen, Yuxuan Peng, Shengping Wang, Qi Li, Yukai Huang, Yukun Liu, Tuo Zhou, Terry Yue Zhuo, Junyang Lin, Chao Zhang

Rethinking the Evaluation of Secure Code Generation

Large language models (LLMs) are widely used in software development. However, the code generated by LLMs often contains vulnerabilities. Several secure code generation methods have been proposed to address this issue, but their current…

Cryptography and Security · Computer Science 2025-11-14 Shih-Chieh Dai, Jun Xu, Guanhong Tao

SafeGenBench: A Benchmark Framework for Security Vulnerability Detection in LLM-Generated Code

The code generation capabilities of large language models (LLMs) have emerged as a critical dimension in evaluating their overall performance. However, prior research has largely overlooked the security risks inherent in the generated code.…

Cryptography and Security · Computer Science 2025-06-23 Xinghang Li, Jingzhe Ding, Chao Peng, Bing Zhao, Xiang Gao, Hongwan Gao, Xinchen Gu

SeCodePLT: A Unified Platform for Evaluating the Security of Code GenAI

Existing benchmarks for evaluating the security risks and capabilities (e.g., vulnerability detection) of code-generating large language models (LLMs) face several key limitations: (1) limited coverage of risk and capabilities; (2) reliance…

Cryptography and Security · Computer Science 2025-09-22 Yuzhou Nie, Zhun Wang, Yu Yang, Ruizhe Jiang, Yuheng Tang, Xander Davies, Yarin Gal, Bo Li, Wenbo Guo, Dawn Song

PARM: Pipeline-Adapted Reward Model

Reward models (RMs) are central to aligning large language models (LLMs) with human preferences, powering RLHF and advanced decoding strategies. While most prior work focuses on single-step generation, real-world applications increasingly…

Artificial Intelligence · Computer Science 2026-04-21 Xingyu Fan, Wei Shao, Jiacheng Liu, Linqi Song, Pheng Ann Heng

Training Vision-Language Process Reward Models for Test-Time Scaling in Multimodal Reasoning: Key Insights and Lessons Learned

Process Reward Models (PRMs) provide step-level supervision that improves the reliability of reasoning in large language models. While PRMs have been extensively studied in text-based domains, their extension to Vision Language Models…

Artificial Intelligence · Computer Science 2025-10-08 Brandon Ong, Tej Deep Pala, Vernon Toh, William Chandra Tjhi, Soujanya Poria

Improving Vision-language Models with Perception-centric Process Reward Models

Recent advancements in reinforcement learning with verifiable rewards (RLVR) have significantly improved the complex reasoning ability of vision-language models (VLMs). However, its outcome-level supervision is too coarse to diagnose and…

Computer Vision and Pattern Recognition · Computer Science 2026-04-28 Yingqian Min, Kun Zhou, Yifan Li, Yuhuan Wu, Han Peng, Yifan Du, Wayne Xin Zhao, Min Yang, Ji-Rong Wen