Related papers: Agentic Proof Automation: A Case Study

200 papers

We present Prover Agent, a novel AI agent for automated theorem proving that integrates large language models (LLMs) with a formal proof assistant, Lean. Prover Agent coordinates an informal reasoning LLM, a formal prover model, and…

Artificial Intelligence · Computer Science 2026-02-18 Kaito Baba, Chaoran Liu, Shuhei Kurita, Akiyoshi Sannai

Large language models (LLMs) have been used to generate formal proofs of mathematical theorems in proof assistants such as Lean. However, we often want to optimize a formal proof with respect to various criteria, depending on its…

Artificial Intelligence · Computer Science 2026-05-22 Riyaz Ahuja, Jeremy Avigad, Prasad Tetali, Sean Welleck

Automatically generated code has been gaining traction recently, owing to the prevalence of Large Language Models (LLMs). Further, the AlphaProof initiative has demonstrated the possibility of using AI for general mathematical reasoning.…

Software Engineering · Computer Science 2026-04-14 Haoxin Tu, Huan Zhao, Yahui Song, Mehtab Zafar, Ruijie Meng, Abhik Roychoudhury

High-quality software is essential in software engineering, which requires robust validation and verification processes during testing activities. Manual testing, while effective, can be time-consuming and costly, leading to an…

Software Engineering · Computer Science 2025-01-03 Betim Sherifi, Khaled Slhoub, Fitzroy Nembhard

Historically, scientific discovery has been a lengthy and costly process, demanding substantial time and resources from initial conception to final results. To accelerate scientific discovery, reduce research costs, and improve research…

Human-Computer Interaction · Computer Science 2025-06-18 Samuel Schmidgall, Yusheng Su, Ze Wang, Ximeng Sun, Jialian Wu, Xiaodong Yu, Jiang Liu, Michael Moor, Zicheng Liu, Emad Barsoum

Large language models (LLMs) increasingly excel at mathematical reasoning, but their unreliability limits their utility in mathematics research. A mitigation is using LLMs to generate formal proofs in languages like Lean. We perform the…

Recent advances in the intrinsic reasoning capabilities of large language models (LLMs) have given rise to LLM-based agent systems that exhibit near-human performance on a variety of automated tasks. However, although these systems share…

Artificial Intelligence · Computer Science 2025-08-26 Bingxi Zhao, Lin Geng Foo, Ping Hu, Christian Theobalt, Hossein Rahmani, Jun Liu

Formal verification offers a path to provably correct software, but writing verified code remains expensive enough that the technique is rarely used in production. Recent large language models can accelerate this work, and recent benchmarks…

Logic in Computer Science · Computer Science 2026-05-28 Leo Yao

LLM-based agents are rapidly being adopted for scientific data analysis, automating tasks once limited by human time and expertise. This capability is often framed as an acceleration of discovery, but it also accelerates a familiar failure…

Artificial Intelligence · Computer Science 2026-05-21 Dionizije Fa, Marko Culjak

Large Language Models (LLMs) have emerged as powerful tools for accelerating scientific discovery, yet their static knowledge and hallucination issues hinder autonomous research applications. Recent advances integrate LLMs into agentic…

Artificial Intelligence · Computer Science 2025-12-23 Zeyu Xia, Jinzhe Ma, Congjie Zheng, Shufei Zhang, Yuqiang Li, Hang Su, P. Hu, Changshui Zhang, Xingao Gong, Wanli Ouyang, Lei Bai, Dongzhan Zhou, Mao Su

Verification is one of the central tasks in circuit and system design. While simulation and emulation are widely used, complete correctness can only be ensured based on formal proof techniques. But these approaches often have very high run…

Logic in Computer Science · Computer Science 2025-05-30 Rolf Drechsler

Software testing is an important part of the development cycle, yet it requires specialized expertise and substantial developer effort to adequately test software. Recent discoveries of the capabilities of large language models (LLMs)…

Software Engineering · Computer Science 2023-09-06 Robert Feldt, Sungmin Kang, Juyeon Yoon, Shin Yoo

Software issue resolution aims to address real-world issues in software repositories (e.g., bug fixing and efficiency optimization) based on natural language descriptions provided by users, representing a key aspect of software maintenance.…

Software Engineering · Computer Science 2025-12-30 Zhonghao Jiang, David Lo, Zhongxin Liu

Recent advancements in Large Language Models (LLMs) have spurred interest in deploying LLM agents to undertake tasks in the world. LLMs are often deployed in agent systems: code that orchestrates LLM calls and provides them with tools. We…

Artificial Intelligence · Computer Science 2025-05-20 Maxime Robeyns, Martin Szummer, Laurence Aitchison

Large Language Models (LLMs) have demonstrated advanced capabilities in real-world agentic applications. Growing research efforts aim to develop LLM-based agents to address practical demands, introducing a new challenge: agentic scenarios…

Artificial Intelligence · Computer Science 2025-05-23 Yunjia Qi, Hao Peng, Xiaozhi Wang, Amy Xin, Youfeng Liu, Bin Xu, Lei Hou, Juanzi Li

Modern engineering increasingly relies on vast datasets generated by experiments and simulations, driving a growing demand for efficient, reliable, and broadly applicable modeling strategies. There is also heightened interest in developing…

Artificial Intelligence · Computer Science 2025-10-03 Yang Liu, Zaid Abulawi, Abhiram Garimidi, Doyeong Lim

The integration of Large Language Models (LLMs) into software engineering has driven a transition from traditional rule-based systems to autonomous agentic systems capable of solving complex problems. However, systematic progress is…

Software Engineering · Computer Science 2025-10-24 Jiale Guo, Suizhi Huang, Mei Li, Dong Huang, Xingsheng Chen, Regina Zhang, Zhijiang Guo, Han Yu, Siu-Ming Yiu, Pietro Lio, Kwok-Yan Lam

Automated proof generation for formal software verification remains largely unresolved despite advances in large language models (LLMs). While LLMs perform well in NLP, vision, and code generation, formal verification still requires…

Logic in Computer Science · Computer Science 2026-04-10 Youngjoo Ahn, Sangyeop Yeo, Gijung Im, Jongmin Lee, Jinyoung Yeo, Jieung Kim

In industrial control systems, the generation and verification of Programmable Logic Controller (PLC) code are critical for ensuring operational efficiency and safety. While Large Language Models (LLMs) have made strides in automated code…

Software Engineering · Computer Science 2024-12-30 Zihan Liu, Ruinan Zeng, Dongxia Wang, Gengyun Peng, Jingyi Wang, Qiang Liu, Peiyu Liu, Wenhai Wang

In the current rapidly changing digital environment, businesses are under constant stress to ensure that their systems are secured. Security audits help to maintain a strong security posture by ensuring that policies are in place, controls…

Cryptography and Security · Computer Science 2025-05-19 Jia Hui Chin, Pu Zhang, Yu Xin Cheong, Jonathan Pan