Related papers: Augmenting Math Word Problems via Iterative Questi…

The CompMath-MCQ Dataset: Are LLMs Ready for Higher-Level Math?

The evaluation of Large Language Models (LLMs) on mathematical reasoning has largely focused on elementary problems, competition-style questions, or formal theorem proving, leaving graduate-level and computational mathematics relatively…

Computation and Language · Computer Science 2026-03-05 Bianca Raimondi, Francesco Pivi, Davide Evangelista, Maurizio Gabbrielli

Enhancing Mathematical Problem Solving in LLMs through Execution-Driven Reasoning Augmentation

Mathematical problem solving is a fundamental benchmark for assessing the reasoning capabilities of artificial intelligence and a gateway to applications in education, science, and engineering where reliable symbolic reasoning is essential.…

Artificial Intelligence · Computer Science 2026-02-10 Aditya Basarkar, Benyamin Tabarsi, Tiffany Barnes, Dongkuan Xu

MathMixup: Boosting LLM Mathematical Reasoning with Difficulty-Controllable Data Synthesis and Curriculum Learning

In mathematical reasoning tasks, the advancement of Large Language Models (LLMs) relies heavily on high-quality training data with clearly defined and well-graded difficulty levels. However, existing data synthesis methods often suffer from…

Machine Learning · Computer Science 2026-01-27 Xuchen Li, Jing Chen, Xuzhao Li, Hao Liang, Xiaohuan Zhou, Taifeng Wang, Wentao Zhang

AgenticMath: Enhancing LLM Reasoning via Agentic-based Math Data Generation

The creation of high-quality datasets to improve Large Language Model (LLM) reasoning remains a significant challenge, as current methods often suffer from generating low-quality/incorrect answers and limited information richness from…

Computation and Language · Computer Science 2026-01-09 Xianyang Liu, Yilin Liu, Shuai Wang, Hao Cheng, Andrew Estornell, Yuzhi Zhao, Jun Shu, Jiaheng Wei

MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs

Large language models (LLMs) have exhibited great potential in mathematical reasoning. However, there remains a performance gap in this area between existing open-source models and closed-source models such as GPT-4. In this paper, we…

Computation and Language · Computer Science 2024-09-12 Zimu Lu, Aojun Zhou, Houxing Ren, Ke Wang, Weikang Shi, Junting Pan, Mingjie Zhan, Hongsheng Li

Math Multiple Choice Question Generation via Human-Large Language Model Collaboration

Multiple choice questions (MCQs) are a popular method for evaluating students' knowledge due to their efficiency in administration and grading. Crafting high-quality math MCQs is a labor-intensive process that requires educators to…

Computation and Language · Computer Science 2024-05-03 Jaewook Lee, Digory Smith, Simon Woodhead, Andrew Lan

MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning

The tool-use Large Language Models (LLMs) that integrate with external Python interpreters have significantly enhanced mathematical reasoning capabilities for open-source LLMs, while tool-free methods chose another track: augmenting math…

Computation and Language · Computer Science 2024-05-14 Shuo Yin, Weihao You, Zhilong Ji, Guoqiang Zhong, Jinfeng Bai

MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion

Large Language Models (LLMs) have shown impressive progress in mathematical reasoning. While data augmentation is promising to enhance mathematical problem-solving ability, current approaches are predominantly limited to instance-level…

Computation and Language · Computer Science 2025-06-17 Qizhi Pei, Lijun Wu, Zhuoshi Pan, Yu Li, Honglin Lin, Chenlin Ming, Xin Gao, Conghui He, Rui Yan

Solving Math Word Problems Using Estimation Verification and Equation Generation

Large Language Models (LLMs) excel at various tasks, including problem-solving and question-answering. However, LLMs often find Math Word Problems (MWPs) challenging because solving them requires a range of reasoning and mathematical…

Artificial Intelligence · Computer Science 2025-09-24 Mitchell Piehl, Dillon Wilson, Ananya Kalita, Jugal Kalita

Reasoning and Sampling-Augmented MCQ Difficulty Prediction via LLMs

The difficulty of multiple-choice questions (MCQs) is a crucial factor for educational assessments. Predicting MCQ difficulty is challenging since it requires understanding both the complexity of reaching the correct option and the…

Artificial Intelligence · Computer Science 2025-03-12 Wanyong Feng, Peter Tran, Stephen Sireci, Andrew Lan

MuggleMath: Assessing the Impact of Query and Response Augmentation on Math Reasoning

In math reasoning with large language models (LLMs), fine-tuning data augmentation by query evolution and diverse reasoning paths is empirically verified effective, profoundly narrowing the gap between open-sourced LLMs and cutting-edge…

Computation and Language · Computer Science 2024-07-18 Chengpeng Li, Zheng Yuan, Hongyi Yuan, Guanting Dong, Keming Lu, Jiancan Wu, Chuanqi Tan, Xiang Wang, Chang Zhou

Unleashing LLM Reasoning Capability via Scalable Question Synthesis from Scratch

Improving the mathematical reasoning capabilities of Large Language Models (LLMs) is critical for advancing artificial intelligence. However, access to extensive, diverse, and high-quality reasoning datasets remains a significant challenge,…

Computation and Language · Computer Science 2025-05-28 Yuyang Ding, Xinyu Shi, Xiaobo Liang, Juntao Li, Zhaopeng Tu, Qiaoming Zhu, Min Zhang

Boosting Large Language Models with Socratic Method for Conversational Mathematics Teaching

With the introduction of large language models (LLMs), automatic math reasoning has seen tremendous success. However, current methods primarily focus on providing solutions or using techniques like Chain-of-Thought to enhance…

Computation and Language · Computer Science 2024-07-25 Yuyang Ding, Hanglei Hu, Jie Zhou, Qin Chen, Bo Jiang, Liang He

LLM-ProS: Analyzing Large Language Models' Performance in Competitive Problem Solving

The rapid advancement of large language models has opened new avenues for automating complex problem-solving tasks such as algorithmic coding and competitive programming. This paper introduces a novel evaluation technique, LLM-ProS, to…

Computation and Language · Computer Science 2026-03-03 Md Sifat Hossain, Anika Tabassum, Md. Fahim Arefin, Tarannum Shaila Zaman

Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning

Large Language Models (LLMs) have been applied to Math Word Problems (MWPs) with transformative impacts, revolutionizing how these complex problems are approached and solved in various domains including educational settings. However, the…

Computation and Language · Computer Science 2024-06-18 Joykirat Singh, Akshay Nambi, Vibhav Vineet

LLM Reasoning Engine: Specialized Training for Enhanced Mathematical Reasoning

Large Language Models (LLMs) have shown remarkable performance in various natural language processing tasks but face challenges in mathematical reasoning, where complex problem-solving requires both linguistic understanding and mathematical…

Computation and Language · Computer Science 2025-03-20 Shuguang Chen, Guang Lin

QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation

Reinforcement learning (RL) has emerged as a central paradigm for training large language models (LLMs) in reasoning tasks. Yet recent studies question RL's ability to incentivize reasoning capacity beyond the base model. This raises a key…

Computation and Language · Computer Science 2026-04-01 Jiazheng Li, Hongzhou Lin, Hong Lu, Kaiyue Wen, Zaiwen Yang, Jiaxuan Gao, Yi Wu, Jingzhao Zhang

Data Augmentation with In-Context Learning and Comparative Evaluation in Math Word Problem Solving

Math Word Problem (MWP) solving presents a challenging task in Natural Language Processing (NLP). This study aims to provide MWP solvers with a more diverse training set, ultimately improving their ability to solve various math problems. We…

Computation and Language · Computer Science 2024-05-02 Gulsum Yigit, Mehmet Fatih Amasyali

Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning

Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks. However, the auto-regressive generation process makes LLMs prone to produce errors, hallucinations and inconsistent statements when…

Artificial Intelligence · Computer Science 2024-07-23 Chaojie Wang, Yanchen Deng, Zhiyi Lyu, Liang Zeng, Jujie He, Shuicheng Yan, Bo An

Superficial Self-Improved Reasoners Benefit from Model Merging

As scaled language models (LMs) approach human-level reasoning capabilities, self-improvement emerges as a solution to synthesizing high-quality data corpus. While previous research has identified model collapse as a risk in…

Computation and Language · Computer Science 2025-10-28 Xiangchi Yuan, Chunhui Zhang, Zheyuan Liu, Dachuan Shi, Leyan Pan, Soroush Vosoughi, Wenke Lee