English
Related papers

Related papers: DUALGUAGE: Automated Joint Security-Functionality …

200 papers

The code generation capabilities of large language models(LLMs) have emerged as a critical dimension in evaluating their overall performance. However, prior research has largely overlooked the security risks inherent in the generated code.…

Cryptography and Security · Computer Science 2025-06-23 Xinghang Li , Jingzhe Ding , Chao Peng , Bing Zhao , Xiang Gao , Hongwan Gao , Xinchen Gu

Large Language Models (LLMs) have demonstrated remarkable capabilities in code generation, but their proficiency in producing secure code remains a critical, under-explored area. Existing benchmarks often fall short by relying on synthetic…

Cryptography and Security · Computer Science 2026-02-02 Yanlin Wang , Ziyao Zhang , Chong Wang , Xinyi Xu , Mingwei Liu , Yong Wang , Jiachi Chen , Zibin Zheng

Large language models (LLMs) are being increasingly integrated into practical hardware and firmware development pipelines for code generation. Existing studies have primarily focused on evaluating the functional correctness of LLM-generated…

Cryptography and Security · Computer Science 2026-01-21 Qirui Chen , Jingxian Shuai , Shuangwu Chen , Shenghao Ye , Zijian Wen , Xufei Su , Jie Jin , Jiangming Li , Jun Chen , Xiaobin Tan , Jian Yang

Large Language Models (LLMs) have made significant strides in front-end code generation. However, existing benchmarks exhibit several critical limitations: many tasks are overly simplistic, test cases often lack rigor, and end-to-end…

Software Engineering · Computer Science 2025-06-19 Hongda Zhu , Yiwen Zhang , Bing Zhao , Jingzhe Ding , Siyao Liu , Tong Liu , Dandan Wang , Yanan Liu , Zhaojian Li

Large language models (LLMs) are widely used in software development. However, the code generated by LLMs often contains vulnerabilities. Several secure code generation methods have been proposed to address this issue, but their current…

Cryptography and Security · Computer Science 2025-11-14 Shih-Chieh Dai , Jun Xu , Guanhong Tao

The evaluation of code-generating Large Language Models (LLMs) is fundamentally constrained by two intertwined challenges: a reliance on static, easily contaminated problem sources and the use of superficial, low-rigor testing. This paper…

Software Engineering · Computer Science 2026-02-04 Zhe Zhang , Runlin Liu , Aishan Liu , Xingyu Liu , Xiang Gao , Hailong Sun

With the growing popularity of Large Language Models (LLMs) in software engineers' daily practices, it is important to ensure that the code generated by these tools is not only functionally correct but also free of vulnerabilities. Although…

Software Engineering · Computer Science 2024-09-06 Mohammed Latif Siddiq , Joanna C. S. Santos , Sajith Devareddy , Anna Muller

Large language models (LLMs) have become integral to modern software development, producing vast amounts of AI-generated source code. While these models boost programming productivity, their misuse introduces critical risks, including code…

Software Engineering · Computer Science 2025-06-16 Hanxi Guo , Siyuan Cheng , Kaiyuan Zhang , Guangyu Shen , Xiangyu Zhang

Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks. Their advances in competition-level programming problems have made them an essential pillar of AI-assisted pair…

Cryptography and Security · Computer Science 2023-10-24 Hossein Hajipour , Keno Hassler , Thorsten Holz , Lea Schönherr , Mario Fritz

In light of the rapid adoption of AI coding assistants, LLM-assisted development has become increasingly prevalent, creating an urgent need for robust evaluation of generated code quality. Existing benchmarks often require extensive manual…

Software Engineering · Computer Science 2025-05-21 Yuancheng Jiang , Roland Yap , Zhenkai Liang

The rapid advancement of Large Language Models (LLMs) has enhanced software development processes, minimizing the time and effort required for coding and enhancing developer productivity. However, despite their potential benefits, code…

Cryptography and Security · Computer Science 2025-04-30 Swaroop Dora , Deven Lunkad , Naziya Aslam , S. Venkatesan , Sandeep Kumar Shukla

Large language models (LLMs) can often generate functionally correct code, but their ability to produce efficient implementations for performance-critical systems tasks remains limited. Existing code benchmarks mainly emphasize correctness…

Software Engineering · Computer Science 2026-05-18 Huihao Jing , Wenbin Hu , Haochen Shi , Hanyu Yang , Sirui Zhang , Shaojin Chen , Haoran Li , Yangqiu Song

Artificial Intelligence (AI)-driven code generation tools are increasingly used throughout the software development lifecycle to accelerate coding tasks. However, the security of AI-generated code using Large Language Models (LLMs) remains…

Cryptography and Security · Computer Science 2026-03-10 Mohammed Kharma , Soohyeon Choi , Mohammed AlKhanafseh , David Mohaisen

Code benchmarks such as HumanEval are widely adopted to evaluate the capabilities of Large Language Models (LLMs), providing insights into their strengths and weaknesses. However, current benchmarks primarily exercise LLMs' capability on…

Artificial Intelligence · Computer Science 2024-08-26 Qiming Zhu , Jialun Cao , Yaojie Lu , Hongyu Lin , Xianpei Han , Le Sun , Shing-Chi Cheung

Code security and usability are both essential for various coding assistant applications driven by large language models (LLMs). Current code security benchmarks focus solely on single evaluation task and paradigm, such as code completion…

Computation and Language · Computer Science 2025-05-16 Yutao Mou , Xiao Deng , Yuxiao Luo , Shikun Zhang , Wei Ye

Code vulnerability detection is crucial for ensuring the security and reliability of modern software systems. Recently, Large Language Models (LLMs) have shown promising capabilities in this domain. However, notable discrepancies in…

Software Engineering · Computer Science 2025-09-19 Zhihong Sun , Jia Li , Yao Wan , Chuanyi Li , Hongyu Zhang , Zhi jin , Ge Li , Hong Liu , Chen Lyu , Songlin Hu

Multimodal Large Language Models (MLLMs) have demonstrated remarkable capabilities in automated front-end engineering, e.g., generating UI code from visual designs. However, existing front-end UI code generation benchmarks have the…

Software Engineering · Computer Science 2026-03-17 Jingyu Xiao , Ming Wang , Man Ho Lam , Yuxuan Wan , Junliang Liu , Yintong Huo , Michael R. Lyu

The use of large language models (LLMs) for automated code generation has emerged as a significant focus within AI research. As these pretrained models continue to evolve, their ability to understand and generate complex code structures has…

Software Engineering · Computer Science 2025-05-06 Nazmus Ashrafi , Salah Bouktif , Mohammed Mediani

As large language models become increasingly capable of generating code, evaluating their performance remains a complex and evolving challenge. Existing benchmarks primarily focus on functional correctness, overlooking the diversity of…

Software Engineering · Computer Science 2025-11-03 Forough Mehralian , Ryan Shar , James R. Rae , Alireza Hashemi

Large Language Models (LLMs) have demonstrated remarkable capabilities across various domains, with code generation emerging as a key area of focus. While numerous benchmarks have been proposed to evaluate their code generation abilities,…

‹ Prev 1 2 3 10 Next ›