Related papers: LEAN-GitHub: Compiling GitHub LEAN repositories fo…

200 papers

The math abilities of large language models can represent their abstract reasoning ability. In this paper, we introduce and open-source our math reasoning LLMs, InternLM-Math, which are obtained by continued pre-training from InternLM2. We unify…

Large language models have demonstrated impressive capabilities across various natural language processing tasks, especially in solving mathematical problems. However, large language models are not good at math theorem proving using formal…

Computation and Language · Computer Science 2025-06-19 Huaiyuan Ying, Zijian Wu, Yihan Geng, Zheng Yuan, Dahua Lin, Kai Chen

Proof assistants like Lean have revolutionized mathematical proof verification, ensuring high accuracy and reliability. Although large language models (LLMs) show promise in mathematical reasoning, their advancement in formal theorem…

Artificial Intelligence · Computer Science 2024-05-24 Huajian Xin, Daya Guo, Zhihong Shao, Zhizhou Ren, Qihao Zhu, Bo Liu, Chong Ruan, Wenda Li, Xiaodan Liang

The challenge of formal proof generation has a rich history, but with modern techniques, we may finally be at the stage of making actual progress in real-life mathematical problems. This paper explores the integration of ChatGPT and basic…

Logic in Computer Science · Computer Science 2025-02-20 Sangjun Han, Taeil Hur, Youngmi Hur, Kathy Sangkyung Lee, Myungyoon Lee, Hyojae Lim

Using AI to write formal proofs for mathematical problems is a challenging task that has seen some advancements in recent years. Automated systems such as Lean can verify the correctness of proofs written in formal language, yet writing the…

Machine Learning · Computer Science 2025-03-04 Roozbeh Yousefzadeh, Xuenan Cao, Azim Ospanov

We present StepFun-Prover Preview, a large language model designed for formal theorem proving through tool-integrated reasoning. Using a reinforcement learning pipeline that incorporates tool-based interactions, StepFun-Prover can achieve…

Artificial Intelligence · Computer Science 2025-08-14 Shijie Shang, Ruosi Wan, Yue Peng, Yutong Wu, Xiong-hui Chen, Jie Yan, Xiangyu Zhang

Verifiable formal languages like Lean have profoundly impacted mathematical reasoning, particularly through the use of large language models (LLMs) for automated reasoning. A significant challenge in training LLMs for these formal languages…

Computation and Language · Computer Science 2025-02-28 Guoxiong Gao, Yutong Wang, Jiedong Jiang, Qi Gao, Zihan Qin, Tianyi Xu, Bin Dong

GitHub workflows or GitHub CI is a popular continuous integration platform that enables developers to automate various software engineering tasks by specifying them as workflows, i.e., YAML files with a list of jobs. However, engineering…

Software Engineering · Computer Science 2024-03-20 Xinyu Zhang, Siddharth Muralee, Sourag Cherupattamoolayil, Aravind Machiry

Formal mathematical reasoning remains a critical challenge for artificial intelligence, hindered by limitations of existing benchmarks in scope and scale. To address this, we present FormalMATH, a large-scale Lean4 benchmark comprising…

Proving mathematical theorems using computer-verifiable formal languages like Lean significantly impacts mathematical reasoning. One approach to formal theorem proving involves generating complete proofs using Large Language Models (LLMs)…

Formal Languages and Automata Theory · Computer Science 2024-10-07 Ruida Wang, Jipeng Zhang, Yizhen Jia, Rui Pan, Shizhe Diao, Renjie Pi, Tong Zhang

Large Language Models (LLMs) demonstrate impressive mathematical reasoning abilities, but their solutions frequently contain errors that cannot be automatically checked. Formal theorem proving systems such as Lean 4 offer automated…

Artificial Intelligence · Computer Science 2026-03-18 Sumanth Varambally, Thomas Voice, Yanchao Sun, Zhifeng Chen, Rose Yu, Ke Ye

The research in AI-based formal mathematical reasoning has shown an unstoppable growth trend. These studies have excelled in mathematical competitions like IMO and have made significant progress. This paper focuses on formal verification,…

Artificial Intelligence · Computer Science 2025-06-10 Jialun Cao, Yaojie Lu, Meiziniu Li, Haoyang Ma, Haokun Li, Mengda He, Cheng Wen, Le Sun, Hongyu Zhang, Shengchao Qin, Shing-Chi Cheung, Cong Tian

There is growing evidence that pretraining on high quality, carefully thought-out tokens such as code or mathematics plays an important role in improving the reasoning abilities of large language models. For example, Minerva, a PaLM model…

Artificial Intelligence · Computer Science 2023-10-11 Keiran Paster, Marco Dos Santos, Zhangir Azerbayev, Jimmy Ba

We present Lean Finder, a semantic search engine for Lean and mathlib that understands and aligns with the intents of mathematicians. Progress in formal theorem proving is often hindered by the difficulty of locating relevant theorems and…

Machine Learning · Computer Science 2026-02-24 Jialin Lu, Kye Emond, Kaiyu Yang, Swarat Chaudhuri, Weiran Sun, Wuyang Chen

Nowadays, formal theorem provers have made monumental progress on high-school and competition-level mathematics, but few of them generalize to more advanced mathematics. In this paper, we present REAL-Prover, a new open-source stepwise…

Computation and Language · Computer Science 2025-11-25 Ziju Shen, Naohao Huang, Fanyi Yang, Yutong Wang, Guoxiong Gao, Tianyi Xu, Jiedong Jiang, Wanyi He, Pu Yang, Mengzhou Sun, Haocheng Ju, Peihao Wu, Bryan Dai, Bin Dong

Solving mathematical problems using computer-verifiable languages like Lean has significantly impacted the mathematical and computer science communities. State-of-the-art methods utilize a single Large Language Model (LLM) to generate…

Computation and Language · Computer Science 2025-05-28 Ruida Wang, Rui Pan, Yuxin Li, Jipeng Zhang, Yizhen Jia, Shizhe Diao, Renjie Pi, Junjie Hu, Tong Zhang

We perform a thorough analysis of the formal and informal statements in the miniF2F benchmark from the perspective of an AI system that is tasked to participate in a math Olympiad consisting of the problems in miniF2F. In such a setting, the…

Artificial Intelligence · Computer Science 2025-11-06 Azim Ospanov, Farzan Farnia, Roozbeh Yousefzadeh

Although most of the automated theorem-proving approaches depend on formal proof systems, informal theorem proving can align better with large language models' (LLMs) strength in natural language processing. In this work, we identify a…

Artificial Intelligence · Computer Science 2026-04-20 Yunhe Li, Hao Shi, Bowen Deng, Wei Wang, Mengzhe Ruan, Hanxu Hou, Zhongxiang Dai, Siyang Gao, Chao Wang, Shuang Qiu, Linqi Song

Formalizing mathematical proofs using computerized verification languages like Lean 4 has the potential to significantly impact the field of mathematics; it offers prominent capabilities for advancing mathematical reasoning. However,…

Computation and Language · Computer Science 2024-11-11 Xichen Tang

We introduce Goedel-Prover, an open-source language model that achieves state-of-the-art (as of April 5 2025) performance in automated formal proof generation for mathematical problems. A key challenge in this field is the scarcity of…

Machine Learning · Computer Science 2025-04-22 Yong Lin, Shange Tang, Bohan Lyu, Jiayun Wu, Hongzhou Lin, Kaiyu Yang, Jia Li, Mengzhou Xia, Danqi Chen, Sanjeev Arora, Chi Jin