Related papers: Agentic Proving for Program Verification

Decoding the Configuration of AI Coding Agents: Insights from Claude Code Projects

Agentic code assistants are a new generation of AI systems capable of performing end-to-end software engineering tasks. While these systems promise unprecedented productivity gains, their behavior and effectiveness depend heavily on…

Software Engineering · Computer Science 2026-05-26 Helio Victor F. Santos, Vitor Costa, Joao Eduardo Montandon, Marco Tulio Valente

Scaling Agentic Verifier for Competitive Coding

Large language models (LLMs) have demonstrated strong coding capabilities but still struggle to solve competitive programming problems correctly in a single attempt. Execution-based re-ranking offers a promising test-time scaling strategy,…

Computation and Language · Computer Science 2026-02-05 Zeyao Ma, Jing Zhang, Xiaokang Zhang, Jiaxi Yang, Zongmeng Zhang, Jiajun Zhang, Yuheng Jing, Lei Zhang, Hao Zheng, Wenting Zhao, Junyang Lin, Binyuan Hui

On the Use of Agentic Coding Manifests: An Empirical Study of Claude Code

Agentic coding tools receive goals written in natural language as input, break them down into specific tasks, and write/execute the actual code with minimal human intervention. Key to this process are agent manifests, configuration files…

Software Engineering · Computer Science 2025-12-30 Worawalan Chatlatanagulchai, Kundjanasith Thonglek, Brittany Reid, Yutaro Kashiwa, Pattara Leelaprute, Arnon Rungsawang, Bundit Manaskasemsak, Hajimu Iida

Agentic Verification of Software Systems

Automatically generated code is gaining traction recently, owing to the prevalence of Large Language Models (LLMs). Further, the AlphaProof initiative has demonstrated the possibility of using AI for general mathematical reasoning.…

Software Engineering · Computer Science 2026-04-14 Haoxin Tu, Huan Zhao, Yahui Song, Mehtab Zafar, Ruijie Meng, Abhik Roychoudhury

Automating Formal Verification with Agent-Guided Tree Search

Formal verification offers a path to provably correct software, but writing verified code remains expensive enough that the technique is rarely used in production. Recent large language models can accelerate this work, and recent benchmarks…

Logic in Computer Science · Computer Science 2026-05-28 Leo Yao

Agentic Model Checking

Verifying LLM-generated systems code is hard: bugs are prevalent, formal specifications are missing, and safety contracts are encoded implicitly at call sites rather than enforced at function boundaries. We propose agentic model checking, a…

Software Engineering · Computer Science 2026-05-21 Youcheng Sun, Jiawen Liu, Daniel Kroening, Jason Xue

On the Use of Agentic Coding: An Empirical Study of Pull Requests on GitHub

Large language models (LLMs) are increasingly being integrated into software development processes. The ability to generate code and submit pull requests with minimal human intervention, through the use of autonomous AI agents, is poised to…

Software Engineering · Computer Science 2026-02-10 Miku Watanabe, Hao Li, Yutaro Kashiwa, Brittany Reid, Hajimu Iida, Ahmed E. Hassan

Agentic Proof Automation: A Case Study

Proof engineering is notoriously labor-intensive: proofs that are straightforward on paper often require lengthy scripts in theorem provers. Recent advances in large language models (LLMs) create new opportunities for proof automation:…

Programming Languages · Computer Science 2026-01-08 Yichen Xu, Martin Odersky

Towards Robust Agentic CUDA Kernel Benchmarking, Verification, and Optimization

Recent advances in large language models (LLMs) demonstrate their effectiveness in scaling test-time compute for software engineering tasks. However, these approaches often focus on high-level solutions, with limited attention to optimizing…

Software Engineering · Computer Science 2025-09-19 Robert Tjarko Lange, Qi Sun, Aaditya Prasad, Maxence Faldor, Yujin Tang, David Ha

Applying an Agentic Coding Tool for Improving Published Algorithm Implementations

We present a two-stage pipeline for AI-assisted improvement of published algorithm implementations. In the first stage, a large language model with research capabilities identifies recently published algorithms satisfying explicit…

Software Engineering · Computer Science 2026-04-16 Worasait Suwannik

Agentic CLEAR: Automating Multi-Level Evaluation of LLM Agents

Agentic systems are becoming more capable: agents define strategies, take actions, and interact with different environments. This autonomy poses serious challenges for overseeing and assessing agent behavior. Most current tools are limited,…

Computation and Language · Computer Science 2026-05-22 Asaf Yehudai, Lilach Eden, Michal Shmueli-Scheuer

Agents4PLC: Automating Closed-loop PLC Code Generation and Verification in Industrial Control Systems using LLM-based Agents

In industrial control systems, the generation and verification of Programmable Logic Controller (PLC) code are critical for ensuring operational efficiency and safety. While Large Language Models (LLMs) have made strides in automated code…

Software Engineering · Computer Science 2024-12-30 Zihan Liu, Ruinan Zeng, Dongxia Wang, Gengyun Peng, Jingyi Wang, Qiang Liu, Peiyu Liu, Wenhai Wang

Agentic Agile-V: From Vibe Coding to Verified Engineering in Software and Hardware Development

Agentic AI coding systems can inspect repositories, plan implementation steps, edit files, call tools, run tests, and submit pull requests. These capabilities make software and hardware development faster in some settings, but current…

Software Engineering · Computer Science 2026-05-21 Christopher Koch

Klever: Verification Framework for Critical Industrial C Programs

Automatic software verification tools help to find hard-to-detect faults in programs checked against specified requirements non-interactively. Besides, they can prove program correctness formally under certain assumptions. These…

Software Engineering · Computer Science 2023-09-29 Ilja Zakharov, Evgeny Novikov, Ilya Shchepetkov

Context-Augmented Code Generation: How Product Context Improves AI Coding Agent Decision Compliance by 49%

AI coding agents powered by large language models can read codebases and produce functional code, but they routinely violate team-specific product decisions that are invisible in the source code alone. We introduce a controlled benchmark…

Software Engineering · Computer Science 2026-05-12 Drew Dillon, Kasyap Varanasi

Agentic AI in the Software Development Lifecycle: Architecture, Empirical Evidence, and the Reshaping of Software Engineering

The arrival of large language models (LLMs) capable of multi-step reasoning, tool use, and long-horizon planning has produced a qualitative shift in software engineering. Where earlier code-completion tools such as GitHub Copilot operated…

Software Engineering · Computer Science 2026-04-30 Happy Bhati

Agentic Education: Using Claude Code to Teach Claude Code

AI coding assistants have proliferated rapidly, yet structured pedagogical frameworks for learning these tools remain scarce. Developers face a gap between tool documentation and practical mastery, relying on fragmented resources such as…

Computers and Society · Computer Science 2026-05-01 Zain Naboulsi

ProofWright: Towards Agentic Formal Verification of CUDA

Large Language Models (LLMs) are increasingly used to automatically generate optimized CUDA kernels, substantially improving developer productivity. However, despite rapid generation, these kernels often contain subtle correctness bugs and…

Software Engineering · Computer Science 2026-03-19 Bodhisatwa Chatterjee, Drew Zagieboylo, Sana Damani, Siva Hari, Christos Kozyrakis

Dive into Claude Code: The Design Space of Today's and Future AI Agent Systems

Claude Code is an agentic coding tool that can run shell commands, edit files, and call external services on behalf of the user. This study describes its comprehensive architecture by analyzing the publicly available TypeScript source code…

Software Engineering · Computer Science 2026-04-17 Jiacheng Liu, Xiaohan Zhao, Xinyi Shang, Zhiqiang Shen

Agentic Separation Logic Specification Synthesis

Specification synthesis, the task of automatically inferring formal specifications from program implementations and natural language, is important for refactoring, transpilation, optimization, and verification, yet remains an open challenge…

Programming Languages · Computer Science 2026-05-28 Tarun Suresh, David Korczynski, Julien Vanegue