Related papers: Agentic Proving for Program Verification

200 papers

Agentic code assistants are a new generation of AI systems capable of performing end-to-end software engineering tasks. While these systems promise unprecedented productivity gains, their behavior and effectiveness depend heavily on…

Software Engineering · Computer Science 2026-05-26 Helio Victor F. Santos, Vitor Costa, Joao Eduardo Montandon, Marco Tulio Valente

Large language models (LLMs) have demonstrated strong coding capabilities but still struggle to solve competitive programming problems correctly in a single attempt. Execution-based re-ranking offers a promising test-time scaling strategy,…

Computation and Language · Computer Science 2026-02-05 Zeyao Ma, Jing Zhang, Xiaokang Zhang, Jiaxi Yang, Zongmeng Zhang, Jiajun Zhang, Yuheng Jing, Lei Zhang, Hao Zheng, Wenting Zhao, Junyang Lin, Binyuan Hui

Agentic coding tools receive goals written in natural language as input, break them down into specific tasks, and write/execute the actual code with minimal human intervention. Key to this process are agent manifests, configuration files…

Automatically generated code is gaining traction recently, owing to the prevalence of Large Language Models (LLMs). Further, the AlphaProof initiative has demonstrated the possibility of using AI for general mathematical reasoning.…

Software Engineering · Computer Science 2026-04-14 Haoxin Tu, Huan Zhao, Yahui Song, Mehtab Zafar, Ruijie Meng, Abhik Roychoudhury

Formal verification offers a path to provably correct software, but writing verified code remains expensive enough that the technique is rarely used in production. Recent large language models can accelerate this work, and recent benchmarks…

Logic in Computer Science · Computer Science 2026-05-28 Leo Yao

Verifying LLM-generated systems code is hard: bugs are prevalent, formal specifications are missing, and safety contracts are encoded implicitly at call sites rather than enforced at function boundaries. We propose agentic model checking, a…

Software Engineering · Computer Science 2026-05-21 Youcheng Sun, Jiawen Liu, Daniel Kroening, Jason Xue

Large language models (LLMs) are increasingly being integrated into software development processes. The ability to generate code and submit pull requests with minimal human intervention, through the use of autonomous AI agents, is poised to…

Software Engineering · Computer Science 2026-02-10 Miku Watanabe, Hao Li, Yutaro Kashiwa, Brittany Reid, Hajimu Iida, Ahmed E. Hassan

Proof engineering is notoriously labor-intensive: proofs that are straightforward on paper often require lengthy scripts in theorem provers. Recent advances in large language models (LLMs) create new opportunities for proof automation:…

Programming Languages · Computer Science 2026-01-08 Yichen Xu, Martin Odersky

Recent advances in large language models (LLMs) demonstrate their effectiveness in scaling test-time compute for software engineering tasks. However, these approaches often focus on high-level solutions, with limited attention to optimizing…

Software Engineering · Computer Science 2025-09-19 Robert Tjarko Lange, Qi Sun, Aaditya Prasad, Maxence Faldor, Yujin Tang, David Ha

We present a two-stage pipeline for AI-assisted improvement of published algorithm implementations. In the first stage, a large language model with research capabilities identifies recently published algorithms satisfying explicit…

Software Engineering · Computer Science 2026-04-16 Worasait Suwannik

Agentic systems are becoming more capable: agents define strategies, take actions, and interact with different environments. This autonomy poses serious challenges for overseeing and assessing agent behavior. Most current tools are limited,…

Computation and Language · Computer Science 2026-05-22 Asaf Yehudai, Lilach Eden, Michal Shmueli-Scheuer

In industrial control systems, the generation and verification of Programmable Logic Controller (PLC) code are critical for ensuring operational efficiency and safety. While Large Language Models (LLMs) have made strides in automated code…

Software Engineering · Computer Science 2024-12-30 Zihan Liu, Ruinan Zeng, Dongxia Wang, Gengyun Peng, Jingyi Wang, Qiang Liu, Peiyu Liu, Wenhai Wang

Agentic AI coding systems can inspect repositories, plan implementation steps, edit files, call tools, run tests, and submit pull requests. These capabilities make software and hardware development faster in some settings, but current…

Software Engineering · Computer Science 2026-05-21 Christopher Koch

Automatic software verification tools help to find hard-to-detect faults in programs checked against specified requirements non-interactively. Besides, they can prove program correctness formally under certain assumptions. These…

Software Engineering · Computer Science 2023-09-29 Ilja Zakharov, Evgeny Novikov, Ilya Shchepetkov

AI coding agents powered by large language models can read codebases and produce functional code, but they routinely violate team-specific product decisions that are invisible in the source code alone. We introduce a controlled benchmark…

Software Engineering · Computer Science 2026-05-12 Drew Dillon, Kasyap Varanasi

The arrival of large language models (LLMs) capable of multi-step reasoning, tool use, and long-horizon planning has produced a qualitative shift in software engineering. Where earlier code-completion tools such as GitHub Copilot operated…

Software Engineering · Computer Science 2026-04-30 Happy Bhati

AI coding assistants have proliferated rapidly, yet structured pedagogical frameworks for learning these tools remain scarce. Developers face a gap between tool documentation and practical mastery, relying on fragmented resources such as…

Computers and Society · Computer Science 2026-05-01 Zain Naboulsi

Large Language Models (LLMs) are increasingly used to automatically generate optimized CUDA kernels, substantially improving developer productivity. However, despite rapid generation, these kernels often contain subtle correctness bugs and…

Software Engineering · Computer Science 2026-03-19 Bodhisatwa Chatterjee, Drew Zagieboylo, Sana Damani, Siva Hari, Christos Kozyrakis

Claude Code is an agentic coding tool that can run shell commands, edit files, and call external services on behalf of the user. This study describes its comprehensive architecture by analyzing the publicly available TypeScript source code…

Software Engineering · Computer Science 2026-04-17 Jiacheng Liu, Xiaohan Zhao, Xinyi Shang, Zhiqiang Shen

Specification synthesis, the task of automatically inferring formal specifications from program implementations and natural language, is important for refactoring, transpilation, optimization, and verification, yet remains an open challenge…

Programming Languages · Computer Science 2026-05-28 Tarun Suresh, David Korczynski, Julien Vanegue