Related papers: Evaluating LLM-driven User-Intent Formalization fo…

200 papers

Using large language models (LLMs) to generate source code from natural language prompts is a popular and promising idea with a wide range of applications. One of its limitations is that the generated code can be faulty at times, often in a…

Software Engineering · Computer Science 2025-01-14 Yue Chen Li, Stefan Zetzsche, Siva Somayyajula

This paper presents a focused literature survey on the use of large language models (LLMs) to assist in writing formal specifications for software. A summary of thirty-five key papers is presented, including examples for specifying programs…

Software Engineering · Computer Science 2025-06-16 Arshad Beg, Diarmuid O'Donoghue, Rosemary Monahan

Large Language Models (LLMs) show promise in automated software engineering, yet their guarantee of correctness is frequently undermined by erroneous or hallucinated code. To enforce model honesty, formal verification requires LLMs to…

Software Engineering · Computer Science 2026-04-27 Md Erfan, Md Kamal Hossain Chowdhury, Ahmed Ryan, Md Rayhanur Rahman

Existing informal language-based (e.g., human language) Large Language Models (LLMs) trained with Reinforcement Learning (RL) face a significant challenge: their verification processes, which provide crucial training signals, are neither…

Computation and Language · Computer Science 2025-10-14 Chuanhao Yan, Fengdi Che, Xuhan Huang, Xu Xu, Xin Li, Yizhi Li, Xingwei Qu, Jingzhe Shi, Chenghua Lin, Yaodong Yang, Binhang Yuan, Hang Zhao, Yu Qiao, Bowen Zhou, Jie Fu

Software correctness is ensured mathematically through formal verification, which involves the resources of generating formal requirement specifications and having an implementation that must be verified. Tools such as model-checkers and…

Software Engineering · Computer Science 2025-08-29 Arshad Beg, Diarmuid O'Donoghue, Rosemary Monahan

Agentic AI systems can now generate code with remarkable fluency, but a fundamental question remains: does the generated code actually do what the user intended? The gap between informal natural language requirements and precise…

Software Engineering · Computer Science 2026-03-19 Shuvendu K. Lahiri

Students in computing education increasingly use large language models (LLMs) such as ChatGPT. Yet, the role of LLMs in supporting cognitively demanding tasks, like deductive program verification, remains poorly understood. This paper…

Software Engineering · Computer Science 2025-09-09 Carolina Carreira, Álvaro Silva, Alexandre Abreu, Alexandra Mendes

Recent frontier large language models (LLMs) have shown strong performance in identifying security vulnerabilities in large, mature open-source systems. As LLM-generated code becomes increasingly common, a natural goal is to prevent such…

Software Engineering · Computer Science 2026-05-13 Zhaorui Li, Chengyu Song

Formal verification techniques aim at formally proving the correctness of a computer program with respect to a formal specification, but the expertise and effort required for applying formal specification and verification techniques and…

Software Engineering · Computer Science 2023-01-10 João Pascoal Faria, Rui Abreu

We present and test the largest benchmark for vericoding, LLM-generation of formally verified code from formal specifications - in contrast to vibe coding, which generates potentially buggy code from a natural language description. Our…

This research idea paper proposes leveraging Large Language Models (LLMs) to enhance the productivity of Dafny developers. Although the use of verification-aware languages, such as Dafny, has increased considerably in the last decade, these…

Software Engineering · Computer Science 2024-01-03 Álvaro Silva, Alexandra Mendes, João F. Ferreira

With the advent of AI-based coding engines, it is possible to convert natural language requirements to executable code in standard programming languages. However, AI-generated code can be unreliable, and the natural language requirements…

Software Engineering · Computer Science 2024-11-06 Martin Mirchev, Andreea Costea, Abhishek Kr Singh, Abhik Roychoudhury

Recent verification tools aim to make formal verification more accessible to software engineers by automating most of the verification process. However, annotating conventional programs with the formal specification and verification…

Software Engineering · Computer Science 2026-01-21 João Pascoal Faria, Emanuel Trigo, Vinicius Honorato, Rui Abreu

Formal verification has the potential to drastically reduce software bugs, but its high additional cost has hindered large-scale adoption. While Dafny presents a promise to significantly reduce the effort to write verified programs, users…

Software Engineering · Computer Science 2024-11-26 Gabriel Poesia, Chloe Loughridge, Nada Amin

We introduce DafnyBench, the largest benchmark of its kind for training and evaluating machine learning systems for formal software verification. We test the ability of LLMs such as GPT-4 and Claude 3 to auto-generate enough hints for the…

Large Language Models (LLMs) have demonstrated formidable capabilities in solving mathematical problems, yet they may still commit logical reasoning and computational errors during the problem-solving process. Thus, this paper proposes a…

Artificial Intelligence · Computer Science 2025-05-28 Kuo Zhou, Lu Zhang

In the digital age, ensuring the correctness, safety, and reliability of software through formal verification is paramount, particularly as software increasingly underpins critical infrastructure. Formal verification, split into theorem…

Software Engineering · Computer Science 2026-04-03 Zhiyong Chen, Jialun Cao, Jiarong Wu, Chang Xu, Shing-Chi Cheung

Large Language Models (LLMs) are increasingly being used to automate programming tasks. Yet, LLMs' capabilities in reasoning about program semantics are still inadequately studied, leaving significant potential for further exploration. This…

Programming Languages · Computer Science 2025-05-30 Thanh Le-Cong, Bach Le, Toby Murray

Although formal methods are capable of producing reliable software, they have seen minimal adoption in everyday programming. Automatic code generation using large language models is becoming increasingly widespread, but it rarely considers…

Software Engineering · Computer Science 2025-03-19 Aleksandr Shefer, Igor Engel, Stanislav Alekseev, Daniil Berezun, Ekaterina Verbitskaia, Anton Podkopaev

Large language models show great promise in many domains, including programming. A promise is easy to make but hard to keep, and language models often fail to keep their promises, generating erroneous code. A promising avenue to keep models…

Software Engineering · Computer Science 2024-06-12 Md Rakib Hossain Misu, Cristina V. Lopes, Iris Ma, James Noble