Related papers: CodeLMSec Benchmark: Systematically Evaluating and…

SafeGenBench: A Benchmark Framework for Security Vulnerability Detection in LLM-Generated Code

The code generation capabilities of large language models (LLMs) have emerged as a critical dimension in evaluating their overall performance. However, prior research has largely overlooked the security risks inherent in the generated code.…

Cryptography and Security · Computer Science 2025-06-23 Xinghang Li, Jingzhe Ding, Chao Peng, Bing Zhao, Xiang Gao, Hongwan Gao, Xinchen Gu

RealSec-bench: A Benchmark for Evaluating Secure Code Generation in Real-World Repositories

Large Language Models (LLMs) have demonstrated remarkable capabilities in code generation, but their proficiency in producing secure code remains a critical, under-explored area. Existing benchmarks often fall short by relying on synthetic…

Cryptography and Security · Computer Science 2026-02-02 Yanlin Wang, Ziyao Zhang, Chong Wang, Xinyi Xu, Mingwei Liu, Yong Wang, Jiachi Chen, Zibin Zheng

Guiding AI to Fix Its Own Flaws: An Empirical Study on LLM-Driven Secure Code Generation

Large Language Models (LLMs) have become powerful tools for automated code generation. However, these models often overlook critical security practices, which can result in the generation of insecure code that contains…

Software Engineering · Computer Science 2025-07-01 Hao Yan, Swapneel Suhas Vaidya, Xiaokuan Zhang, Ziyu Yao

Is Your AI-Generated Code Really Safe? Evaluating Large Language Models on Secure Code Generation with CodeSecEval

Large language models (LLMs) have brought significant advancements to code generation and code repair, benefiting both novice and experienced developers. However, their training using unsanitized data from open-source repositories, like…

Software Engineering · Computer Science 2024-07-08 Jiexin Wang, Xitong Luo, Liuwen Cao, Hongkui He, Hailin Huang, Jiayuan Xie, Adam Jatowt, Yi Cai

Can We Trust Large Language Models Generated Code? A Framework for In-Context Learning, Security Patterns, and Code Evaluations Across Diverse LLMs

Large Language Models (LLMs) such as ChatGPT and GitHub Copilot have revolutionized automated code generation in software engineering. However, as these models are increasingly utilized for software development, concerns have arisen…

Cryptography and Security · Computer Science 2024-12-03 Ahmad Mohsin, Helge Janicke, Adrian Wood, Iqbal H. Sarker, Leandros Maglaras, Naeem Janjua

The Hidden Risks of LLM-Generated Web Application Code: A Security-Centric Evaluation of Code Generation Capabilities in Large Language Models

The rapid advancement of Large Language Models (LLMs) has enhanced software development processes, minimizing the time and effort required for coding and enhancing developer productivity. However, despite their potential benefits, code…

Cryptography and Security · Computer Science 2025-04-30 Swaroop Dora, Deven Lunkad, Naziya Aslam, S. Venkatesan, Sandeep Kumar Shukla

Security-by-Design for LLM-Based Code Generation: Leveraging Internal Representations for Concept-Driven Steering Mechanisms

Large Language Models (LLMs) show remarkable capabilities in understanding natural language and generating complex code. However, as practitioners adopt CodeLLMs for increasingly critical development tasks, research reveals that these…

Cryptography and Security · Computer Science 2026-03-13 Maximilian Wendlinger, Daniel Kowatsch, Konstantin Böttinger, Philip Sperl

Harnessing Large Language Models for Software Vulnerability Detection: A Comprehensive Benchmarking Study

Despite various approaches being employed to detect vulnerabilities, the number of reported vulnerabilities shows an upward trend over the years. This suggests the problems are not caught before the code is released, which could be caused…

Cryptography and Security · Computer Science 2025-02-14 Karl Tamberg, Hayretdin Bahsi

Enhancing Large Language Models for Secure Code Generation: A Dataset-driven Study on Vulnerability Mitigation

Large language models (LLMs) have brought significant advancements to code generation, benefiting both novice and experienced developers. However, their training using unsanitized data from open-source repositories, like GitHub, introduces…

Software Engineering · Computer Science 2023-10-26 Jiexin Wang, Liuwen Cao, Xitong Luo, Zhiping Zhou, Jiayuan Xie, Adam Jatowt, Yi Cai

LLM-CSEC: Empirical Evaluation of Security in C/C++ Code Generated by Large Language Models

The security of code generated by large language models (LLMs) is a significant concern, as studies indicate that such code often contains vulnerabilities and lacks essential defensive programming constructs. This work focuses on examining…

Artificial Intelligence · Computer Science 2025-11-25 Muhammad Usman Shahid, Chuadhry Mujeeb Ahmed, Rajiv Ranjan

HardSecBench: Benchmarking the Security Awareness of LLMs for Hardware Code Generation

Large language models (LLMs) are being increasingly integrated into practical hardware and firmware development pipelines for code generation. Existing studies have primarily focused on evaluating the functional correctness of LLM-generated…

Cryptography and Security · Computer Science 2026-01-21 Qirui Chen, Jingxian Shuai, Shuangwu Chen, Shenghao Ye, Zijian Wen, Xufei Su, Jie Jin, Jiangming Li, Jun Chen, Xiaobin Tan, Jian Yang

Can You Really Trust Code Copilots? Evaluating Large Language Models from a Code Security Perspective

Code security and usability are both essential for various coding assistant applications driven by large language models (LLMs). Current code security benchmarks focus solely on single evaluation task and paradigm, such as code completion…

Computation and Language · Computer Science 2025-05-16 Yutao Mou, Xiao Deng, Yuxiao Luo, Shikun Zhang, Wei Ye

Security and Quality in LLM-Generated Code: A Multi-Language, Multi-Model Analysis

Artificial Intelligence (AI)-driven code generation tools are increasingly used throughout the software development lifecycle to accelerate coding tasks. However, the security of AI-generated code using Large Language Models (LLMs) remains…

Cryptography and Security · Computer Science 2026-03-10 Mohammed Kharma, Soohyeon Choi, Mohammed AlKhanafseh, David Mohaisen

Rethinking the Evaluation of Secure Code Generation

Large language models (LLMs) are widely used in software development. However, the code generated by LLMs often contains vulnerabilities. Several secure code generation methods have been proposed to address this issue, but their current…

Cryptography and Security · Computer Science 2025-11-14 Shih-Chieh Dai, Jun Xu, Guanhong Tao

Helping LLMs Improve Code Generation Using Feedback from Testing and Static Analysis

Large Language Models (LLMs) are one of the most promising developments in the field of artificial intelligence, and the software engineering community has readily noticed their potential role in the software development life-cycle.…

Software Engineering · Computer Science 2026-03-16 Greta Dolcetti, Vincenzo Arceri, Eleonora Iotti, Sergio Maffeis, Agostino Cortesi, Enea Zaffanella

Benchmarking Large Language Models for Multi-Language Software Vulnerability Detection

Recent advancements in generative AI have led to the widespread adoption of large language models (LLMs) in software engineering, addressing numerous long-standing challenges. However, a comprehensive study examining the capabilities of…

Software Engineering · Computer Science 2025-03-04 Ting Zhang, Chengran Yang, Yindu Su, Martin Weyssow, Hung Nguyen, Tan Bui, Hong Jin Kang, Yikun Li, Eng Lieh Ouh, Lwin Khin Shar, David Lo

Improving LLM-Assisted Secure Code Generation through Retrieval-Augmented-Generation and Multi-Tool Feedback

Large Language Models (LLMs) can generate code but often introduce security vulnerabilities, logical inconsistencies, and compilation errors. Prior work demonstrates that LLMs benefit substantially from structured feedback, static analysis,…

Cryptography and Security · Computer Science 2026-01-05 Vidyut Sriram, Sawan Pandita, Achintya Lakshmanan, Aneesh Shamraj, Suman Saha

Detecting Data Poisoning in Code Generation LLMs via Black-Box, Vulnerability-Oriented Scanning

Code generation large language models (LLMs) are increasingly integrated into modern software development workflows. Recent work has shown that these models are vulnerable to backdoor and poisoning attacks that induce the generation of…

Cryptography and Security · Computer Science 2026-03-19 Shenao Yan, Shimaa Ahmed, Shan Jin, Sunpreet S. Arora, Yiwei Cai, Yizhen Wang, Yuan Hong

LLMSecEval: A Dataset of Natural Language Prompts for Security Evaluations

Large Language Models (LLMs) like Codex are powerful tools for performing code completion and code generation tasks as they are trained on billions of lines of code from publicly available sources. Moreover, these models are capable of…

Software Engineering · Computer Science 2023-03-17 Catherine Tony, Markus Mutas, Nicolás E. Díaz Ferreyra, Riccardo Scandariato

Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities

While automated vulnerability detection techniques have made promising progress in detecting security vulnerabilities, their scalability and applicability remain challenging. The remarkable performance of Large Language Models (LLMs), such…

Cryptography and Security · Computer Science 2024-10-24 Avishree Khare, Saikat Dutta, Ziyang Li, Alaia Solko-Breslin, Rajeev Alur, Mayur Naik