Automatic High-Level Test Case Generation using Large Language Models

Software Engineering 2025-03-25 v1

Abstract

We explored the challenges practitioners face in software testing and proposed automated solutions to address them. We began with a survey of local software companies and 26 practitioners, which revealed that the primary challenge is not writing test scripts but aligning testing efforts with business requirements. Based on these insights, we constructed a dataset that maps use cases to high-level test cases, and used it to train and fine-tune models that generate high-level test cases. A high-level test case specifies which aspects of the software's functionality need to be tested, along with the expected outcomes. We evaluated several large language models, including GPT-4o, Gemini, LLaMA 3.1 8B, and Mistral 7B; fine-tuning the latter two improved their performance. A final human-evaluation survey confirmed the effectiveness of the generated test cases. Our proactive approach strengthens requirement-testing alignment and facilitates early test case generation to streamline development.

Cite

@article{arxiv.2503.17998,
  title  = {Automatic High-Level Test Case Generation using Large Language Models},
  author = {Navid Bin Hasan and Md. Ashraful Islam and Junaed Younus Khan and Sanjida Senjik and Anindya Iqbal},
  journal = {arXiv preprint arXiv:2503.17998},
  year   = {2025}
}

Comments

Accepted at International Conference on Mining Software Repositories (MSR) 2025