English
Related papers

Related papers: TESTEVAL: Benchmarking Large Language Models for T…

200 papers

Software testing is a crucial phase in the software life cycle, helping identify potential risks and reduce maintenance costs. With the advancement of Large Language Models (LLMs), researchers have proposed an increasing number of LLM-based…

Software Engineering · Computer Science 2024-09-27 Quanjun Zhang , Ye Shang , Chunrong Fang , Siqi Gu , Jianyi Zhou , Zhenyu Chen

Code generation models can help improve many common software tasks ranging from code completion to defect prediction. Most of the existing benchmarks for code generation LLMs focus on code authoring or code completion. Surprisingly, there…

Software Engineering · Computer Science 2025-03-20 Kush Jain , Gabriel Synnaeve , Baptiste Rozière

Code large language models (LLMs) have shown remarkable advances in code understanding, completion, and generation tasks. Programming benchmarks, comprised of a selection of code challenges and corresponding test cases, serve as a standard…

Large Language Models (LLMs) have demonstrated remarkable capabilities in code generation, capable of tackling complex tasks during inference. However, the extent to which LLMs can be utilized for code checking or debugging through test…

Code generation with Large Language Models (LLMs) has been extensively studied and achieved remarkable progress. As a complementary aspect to code generation, test case generation is of crucial importance in ensuring the quality and…

Software Engineering · Computer Science 2024-04-23 Kefan Li , Yuan Yuan

Large Language Models (LLMs) are starting to be profiled as one of the most significant disruptions in the Software Testing field. Specifically, they have been successfully applied in software testing tasks such as generating test code, or…

Software Engineering · Computer Science 2025-09-30 Cristian Augusto , Antonia Bertolino , Guglielmo De Angelis , Francesca Lonetti , Jesús Morán

Recently, large language models (LLMs), especially those that are pretrained on code, have demonstrated strong capabilities in generating programs from natural language inputs in a few-shot or even zero-shot manner. Despite promising…

This paper presents a detailed case study examining the application of Large Language Models (LLMs) in the construction of test cases within the context of software engineering. LLMs, characterized by their advanced natural language…

Recent advancements in large language models (LLMs) have significantly enhanced their coding capabilities. However, existing benchmarks predominantly focused on simplified or isolated aspects of coding, such as single-file code generation…

Computation and Language · Computer Science 2024-12-17 Bowen Li , Wenhan Wu , Ziwei Tang , Lin Shi , John Yang , Jinyang Li , Shunyu Yao , Chen Qian , Binyuan Hui , Qicheng Zhang , Zhiyin Yu , He Du , Ping Yang , Dahua Lin , Chao Peng , Kai Chen

Large Language Models (LLMs) are predominantly assessed based on their common sense reasoning, language comprehension, and logical reasoning abilities. While models trained in specialized domains like mathematics or coding have demonstrated…

Software Engineering · Computer Science 2026-01-08 Danny Brahman , Mohammad Mahoor

A Large Language Model (LLM) represents a cutting-edge artificial intelligence model that generates coherent content, including grammatically precise sentences, human-like paragraphs, and syntactically accurate code snippets. LLMs can play…

Software Engineering · Computer Science 2023-12-11 Robson Santos , Italo Santos , Cleyton Magalhaes , Ronnie de Souza Santos

How to evaluate Large Language Models (LLMs) in code generation is an open question. Many benchmarks have been proposed but are inconsistent with practical software projects, e.g., unreal program distributions, insufficient dependencies,…

Pre-trained large language models (LLMs) have recently emerged as a breakthrough technology in natural language processing and artificial intelligence, with the ability to handle large-scale datasets and exhibit remarkable performance…

Software Engineering · Computer Science 2024-03-05 Junjie Wang , Yuchao Huang , Chunyang Chen , Zhe Liu , Song Wang , Qing Wang

We introduce TestCase-Eval, a new benchmark for systematic evaluation of LLMs in test-case generation. TestCase-Eval includes 500 algorithm problems and 100,000 human-crafted solutions from the Codeforces platform. It focuses on two pivotal…

Software Engineering · Computer Science 2025-06-17 Zheyuan Yang , Zexi Kuang , Xue Xia , Yilun Zhao

\textit{Background:} The use of large language models in software testing is growing fast as they support numerous tasks, from test case generation to automation, and documentation. However, their adoption often relies on informal…

Software Engineering · Computer Science 2025-10-21 Maria Deolinda Santana , Cleyton Magalhaes , Ronnie de Souza Santos

Large Language Models (LLMs) are widely used in Software Engineering (SE) for various tasks, including generating code, designing and documenting software, adding code comments, reviewing code, and writing test scripts. However, creating…

Software Engineering · Computer Science 2024-06-12 Abdul Malik Sami , Zeeshan Rasheed , Muhammad Waseem , Zheying Zhang , Herda Tomas , Pekka Abrahamsson

Code large language models (LLMs) have made significant progress in code debugging by directly generating the correct code based on the buggy code snippet. Programming benchmarks, typically consisting of buggy code snippet and their…

In recent years, the application of large language models (LLMs) to code-related tasks has gained significant attention. However, existing evaluation benchmarks often focus on limited scenarios, such as code generation or completion, which…

Software Engineering · Computer Science 2024-09-17 Jia Feng , Jiachen Liu , Cuiyun Gao , Chun Yong Chong , Chaozheng Wang , Shan Gao , Xin Xia

Large Language Models (LLMs) are rapidly becoming ubiquitous both as stand-alone tools and as components of current and future software systems. To enable usage of LLMs in the high-stake or safety-critical systems of 2030, they need to…

Software Engineering · Computer Science 2024-06-13 Sinclair Hudson , Sophia Jit , Boyue Caroline Hu , Marsha Chechik

System models, a critical artifact in software development, provide a formal abstraction of both the structural and behavioral aspects of software systems, which can facilitate the early requirements analysis and architecture design.…

Software Engineering · Computer Science 2025-08-06 Dongming Jin , Zhi Jin , Linyu Li , Zheng Fang , Jia Li , Xiaohong Chen
‹ Prev 1 2 3 10 Next ›