Related papers: SecureCode: A Production-Grade Multi-Turn Dataset …

Secure coding for web applications: Frameworks, challenges, and the role of LLMs

Secure coding is a critical yet often overlooked practice in software development. Despite extensive awareness efforts, real-world adoption remains inconsistent due to organizational, educational, and technical barriers. This paper provides…

Software Engineering · Computer Science 2025-10-02 Kiana Kiashemshaki, Mohammad Jalili Torkamani, Negin Mahmoudi

Benchmarking Open-Source Safety Guard Models: A Comprehensive Evaluation

As Large Language Models (LLMs) are increasingly deployed in safety-critical applications, robust content moderation becomes essential. We present a comprehensive evaluation of 14 open-source safety guard models on a curated benchmark of…

Computation and Language · Computer Science 2026-05-29 Reetu Raj Harsh, Bhaskarjit Sarmah, Stefano Pasquali

Taught by the Flawed: How Dataset Insecurity Breeds Vulnerable AI Code

AI programming assistants have demonstrated a tendency to generate code containing basic security vulnerabilities. While developers are ultimately responsible for validating and reviewing such outputs, improving the inherent quality of…

Cryptography and Security · Computer Science 2025-11-14 Catherine Xia, Manar H. Alalfi

Benchmarking Correctness and Security in Multi-Turn Code Generation

AI coding assistants powered by large language models (LLMs) have transformed software development, significantly boosting productivity. While existing benchmarks evaluate the correctness and security of LLM-generated code, they are…

Software Engineering · Computer Science 2025-10-17 Ruchit Rawal, Jeffrey Yang Fan Chiang, Chihao Shen, Jeffery Siyuan Tian, Aastha Mahajan, Tom Goldstein, Yizheng Chen

Securing the AI Supply Chain: What Can We Learn From Developer-Reported Security Issues and Solutions of AI Projects?

The rapid growth of Artificial Intelligence (AI) models and applications has led to an increasingly complex security landscape. Developers of AI projects must contend not only with traditional software supply chain issues but also with…

Software Engineering · Computer Science 2026-01-12 The Anh Nguyen, Triet Huynh Minh Le, M. Ali Babar

HexaCoder: Secure Code Generation via Oracle-Guided Synthetic Training Data

Large language models (LLMs) have shown great potential for automatic code generation and form the basis for various tools such as GitHub Copilot. However, recent studies highlight that much LLM-generated code contains serious security…

Cryptography and Security · Computer Science 2024-09-11 Hossein Hajipour, Lea Schönherr, Thorsten Holz, Mario Fritz

Is Your AI-Generated Code Really Safe? Evaluating Large Language Models on Secure Code Generation with CodeSecEval

Large language models (LLMs) have brought significant advancements to code generation and code repair, benefiting both novice and experienced developers. However, their training using unsanitized data from open-source repositories, like…

Software Engineering · Computer Science 2024-07-08 Jiexin Wang, Xitong Luo, Liuwen Cao, Hongkui He, Hailin Huang, Jiayuan Xie, Adam Jatowt, Yi Cai

SecureVibeBench: Benchmarking Secure Vibe Coding of AI Agents via Reconstructing Vulnerability-Introducing Scenarios

Large language model-powered code agents are rapidly transforming software engineering, yet the security risks of their generated code have become a critical concern. Existing benchmarks have provided valuable insights, but they fail to…

Software Engineering · Computer Science 2026-04-27 Junkai Chen, Huihui Huang, Yunbo Lyu, Junwen An, Jieke Shi, Chengran Yang, Ting Zhang, Haoye Tian, Yikun Li, Zhenhao Li, Xin Zhou, Xing Hu, David Lo

Enhancing Large Language Models for Secure Code Generation: A Dataset-driven Study on Vulnerability Mitigation

Large language models (LLMs) have brought significant advancements to code generation, benefiting both novice and experienced developers. However, their training using unsanitized data from open-source repositories, like GitHub, introduces…

Software Engineering · Computer Science 2023-10-26 Jiexin Wang, Liuwen Cao, Xitong Luo, Zhiping Zhou, Jiayuan Xie, Adam Jatowt, Yi Cai

Developing Hands-on Labs for Source Code Vulnerability Detection with AI

As the role of information and communication technologies gradually increases in our lives, source code security becomes a significant issue to protect against malicious attempts. Furthermore, with the advent of data-driven techniques, there…

Cryptography and Security · Computer Science 2023-02-03 Maryam Taeb

Instruction Tuning for Secure Code Generation

Modern language models (LMs) have gained widespread acceptance in everyday and professional contexts, particularly in programming. An essential procedure enabling this adoption is instruction tuning, which substantially enhances LMs'…

Cryptography and Security · Computer Science 2024-07-15 Jingxuan He, Mark Vero, Gabriela Krasnopolska, Martin Vechev

How secure is AI-generated Code: A Large-Scale Comparison of Large Language Models

This study compares state-of-the-art Large Language Models (LLMs) on their tendency to generate vulnerabilities when writing C programs using a neutral zero-shot prompt. Tihanyi et al. introduced the FormAI dataset at PROMISE'23, featuring…

Cryptography and Security · Computer Science 2024-12-12 Norbert Tihanyi, Tamas Bisztray, Mohamed Amine Ferrag, Ridhi Jain, Lucas C. Cordeiro

Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models

This paper presents CyberSecEval, a comprehensive benchmark developed to help bolster the cybersecurity of Large Language Models (LLMs) employed as coding assistants. As what we believe to be the most extensive unified cybersecurity safety…

Cryptography and Security · Computer Science 2023-12-11 Manish Bhatt, Sahana Chennabasappa, Cyrus Nikolaidis, Shengye Wan, Ivan Evtimov, Dominik Gabi, Daniel Song, Faizan Ahmad, Cornelius Aschermann, Lorenzo Fontana, Sasha Frolov, Ravi Prakash Giri, Dhaval Kapil, Yiannis Kozyrakis, David LeBlanc, James Milazzo, Aleksandar Straumann, Gabriel Synnaeve, Varun Vontimitta, Spencer Whitman, Joshua Saxe

AI Code in the Wild: Measuring Security Risks and Ecosystem Shifts of AI-Generated Code in Modern Software

Large language models (LLMs) for code generation are becoming integral to modern software development, but their real-world prevalence and security impact remain poorly understood. We present the first large-scale empirical study of…

Software Engineering · Computer Science 2025-12-23 Bin Wang, Wenjie Yu, Yilu Zhong, Hao Yu, Keke Lian, Chaohua Lu, Hongfang Zheng, Dong Zhang, Hui Li

Secure-Instruct: An Automated Pipeline for Synthesizing Instruction-Tuning Datasets Using LLMs for Secure Code Generation

Although Large Language Models (LLMs) show promising solutions to automated code generation, they often produce insecure code that threatens software security. Current approaches (e.g., SafeCoder) to improve secure code generation are…

Software Engineering · Computer Science 2025-11-25 Junjie Li, Fazle Rabbi, Bo Yang, Song Wang, Jinqiu Yang

WhatsCode: Large-Scale GenAI Deployment for Developer Efficiency at WhatsApp

The deployment of AI-assisted development tools in compliance-relevant, large-scale industrial environments represents significant gaps in academic literature, despite growing industry adoption. We report on the industrial deployment of…

Software Engineering · Computer Science 2025-12-08 Ke Mao, Timotej Kapus, Cons T Åhs, Matteo Marescotti, Daniel Ip, Ákos Hajdu, Sopot Cela, Aparup Banerjee

Code2Doc: A Quality-First Curated Dataset for Code Documentation

The performance of automatic code documentation generation models depends critically on the quality of the training data used for supervision. However, most existing code documentation datasets are constructed through large scale scraping…

Software Engineering · Computer Science 2025-12-25 Recep Kaan Karaman, Meftun Akarsu

Safety Pretraining: Toward the Next Generation of Safe AI

As large language models (LLMs) are increasingly deployed in high-stakes settings, the risk of generating harmful or toxic content remains a central challenge. Post-hoc alignment methods are brittle: once unsafe patterns are learned during…

Machine Learning · Computer Science 2025-09-16 Pratyush Maini, Sachin Goyal, Dylan Sam, Alex Robey, Yash Savani, Yiding Jiang, Andy Zou, Matt Fredrikson, Zachary C. Lipton, J. Zico Kolter

AEGIS: Online Adaptive AI Content Safety Moderation with Ensemble of LLM Experts

As Large Language Models (LLMs) and generative AI become more widespread, the content safety risks associated with their use also increase. We find a notable deficiency in high-quality content safety datasets and benchmarks that…

Machine Learning · Computer Science 2024-09-12 Shaona Ghosh, Prasoon Varshney, Erick Galinkin, Christopher Parisien

SecureCAI: Injection-Resilient LLM Assistants for Cybersecurity Operations

Large Language Models have emerged as transformative tools for Security Operations Centers, enabling automated log analysis, phishing triage, and malware explanation; however, deployment in adversarial cybersecurity environments exposes…

Cryptography and Security · Computer Science 2026-01-13 Mohammed Himayath Ali, Mohammed Aqib Abdullah, Mohammed Mudassir Uddin, Shahnawaz Alam