Related papers: Solving Formal Math Problems by Decomposition and …

200 papers

Large Language Models (LLMs) demonstrate impressive mathematical reasoning abilities, but their solutions frequently contain errors that cannot be automatically checked. Formal theorem proving systems such as Lean 4 offer automated…

Artificial Intelligence · Computer Science 2026-03-18 Sumanth Varambally, Thomas Voice, Yanchao Sun, Zhifeng Chen, Rose Yu, Ke Ye

We present Prover Agent, a novel AI agent for automated theorem proving that integrates large language models (LLMs) with a formal proof assistant, Lean. Prover Agent coordinates an informal reasoning LLM, a formal prover model, and…

Artificial Intelligence · Computer Science 2026-02-18 Kaito Baba, Chaoran Liu, Shuhei Kurita, Akiyoshi Sannai

We present Ax-Prover, a multi-agent system for automated theorem proving in Lean that can solve problems across diverse scientific domains and operate either autonomously or collaboratively with human experts. To achieve this, Ax-Prover…

Language models have become increasingly powerful tools for formal mathematical reasoning. However, most existing approaches rely exclusively on either large general-purpose models or smaller specialized models, each with distinct…

Artificial Intelligence · Computer Science 2025-07-22 Nicolas Wischermann, Claudio Mayrink Verdun, Gabriel Poesia, Francesco Noseda

Large language models (LLMs) have been used to generate formal proofs of mathematical theorems in proof assistants such as Lean. However, we often want to optimize a formal proof with respect to various criteria, depending on its…

Artificial Intelligence · Computer Science 2026-05-22 Riyaz Ahuja, Jeremy Avigad, Prasad Tetali, Sean Welleck

Proof assistants like Lean have revolutionized mathematical proof verification, ensuring high accuracy and reliability. Although large language models (LLMs) show promise in mathematical reasoning, their advancement in formal theorem…

Artificial Intelligence · Computer Science 2024-05-24 Huajian Xin, Daya Guo, Zhihong Shao, Zhizhou Ren, Qihao Zhu, Bo Liu, Chong Ruan, Wenda Li, Xiaodan Liang

Automated theorem proving is fundamental to formal methods, and the recent trend is to integrate large language models (LLMs) and proof assistants to form effective proof agents. While existing proof agents show promising performance, they…

Software Engineering · Computer Science 2026-04-22 Yican Sun, Chengwei Shi, Hangzhou Lyu, Yingfei Xiong

Formalizing mathematical proofs using computerized verification languages like Lean 4 has the potential to significantly impact the field of mathematics; it offers prominent capabilities for advancing mathematical reasoning. However,…

Computation and Language · Computer Science 2024-11-11 Xichen Tang

Recent progress in formal theorem proving has benefited from large-scale proof generation and verifier-aware training, but agentic proving is rarely integrated into prover training, appearing only at inference time. We present OProver, a…

Computation and Language · Computer Science 2026-05-19 David Ma, Kaijing Ma, Shawn Guo, Yunfeng Shi, Enduo Zhao, Jiajun Shi, Zhaoxiang Zhang, Gavin Cheung, Jiaheng Liu, Zili Wang

Large language models (LLMs) often struggle with complex logical reasoning due to logical inconsistencies and the inherent difficulty of such reasoning. We use Lean, a theorem proving framework, to address these challenges. By formalizing…

Computation and Language · Computer Science 2024-03-21 Dongwei Jiang, Marcio Fonseca, Shay B. Cohen

Despite demonstrating emergent reasoning abilities, Large Language Models (LLMs) often lose track of complex, multi-step reasoning. Existing studies show that providing guidance via decomposing the original question into multiple…

Computation and Language · Computer Science 2024-04-04 Gurusha Juneja, Subhabrata Dutta, Tanmoy Chakraborty

LLMs have demonstrated strong mathematical reasoning abilities by leveraging reinforcement learning with long chain-of-thought, yet they continue to struggle with theorem proving due to the lack of clear supervision signals when solely…

Solving mathematical problems using computer-verifiable languages like Lean has significantly impacted the mathematical and computer science communities. State-of-the-art methods utilize a single Large Language Model (LLM) to generate…

Computation and Language · Computer Science 2025-05-28 Ruida Wang, Rui Pan, Yuxin Li, Jipeng Zhang, Yizhen Jia, Shizhe Diao, Renjie Pi, Junjie Hu, Tong Zhang

Formal reasoning and automated theorem proving constitute a challenging subfield of machine learning, in which machines are tasked with proving mathematical theorems using formal languages like Lean. A formal verification system can check…

Artificial Intelligence · Computer Science 2025-11-05 Azim Ospanov, Farzan Farnia, Roozbeh Yousefzadeh

Nowadays, formal theorem provers have made monumental progress on high-school and competition-level mathematics, but few of them generalize to more advanced mathematics. In this paper, we present REAL-Prover, a new open-source stepwise…

Computation and Language · Computer Science 2025-11-25 Ziju Shen, Naohao Huang, Fanyi Yang, Yutong Wang, Guoxiong Gao, Tianyi Xu, Jiedong Jiang, Wanyi He, Pu Yang, Mengzhou Sun, Haocheng Ju, Peihao Wu, Bryan Dai, Bin Dong

Large language models (LLMs) are increasingly explored as general-purpose reasoners, particularly in agentic contexts. However, their outputs remain prone to mathematical and logical errors. This is especially challenging in open-ended…

Artificial Intelligence · Computer Science 2025-05-30 Agnieszka Mensfelt, Kostas Stathis, Vince Trencsenyi

Formal methods is pivotal for verifying the reliability of critical systems through rigorous mathematical proofs. However, its adoption is hindered by labor-intensive manual proofs and the expertise required to use theorem provers. Recent…

Formal Languages and Automata Theory · Computer Science 2025-05-22 Jilin Hu, Jianyu Zhang, Yongwang Zhao, Talia Ringer

Large language models (LLMs) increasingly excel at mathematical reasoning, but their unreliability limits their utility in mathematics research. A mitigation is using LLMs to generate formal proofs in languages like Lean. We perform the…

Despite the success of large language models (LLMs), the task of theorem proving still remains one of the hardest reasoning tasks that is far from being fully solved. Prior methods using language models have demonstrated promising results,…

Large Language Models (LLMs) are increasingly explored for legal argument generation, yet they pose significant risks of manipulation through hallucination and ungrounded persuasion, and often fail to utilize provided factual bases…

Artificial Intelligence · Computer Science 2025-10-27 Li Zhang, Kevin D. Ashley