Related papers: SkyMath: Technical Report

Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On

In this paper, we investigate the underlying factors that potentially enhance the mathematical reasoning capabilities of large language models (LLMs). We argue that the data scaling law for math reasoning capabilities in modern LLMs is far…

Artificial Intelligence · Computer Science 2024-07-18 Liang Zeng, Liangjun Zhong, Liang Zhao, Tianwen Wei, Liu Yang, Jujie He, Cheng Cheng, Rui Hu, Yang Liu, Shuicheng Yan, Han Fang, Yahui Zhou

MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

Large language models (LLMs) have pushed the limits of natural language understanding and exhibited excellent problem-solving ability. Despite the great success, most existing open-source LLMs (e.g., LLaMA-2) are still far away from…

Computation and Language · Computer Science 2024-05-06 Longhui Yu, Weisen Jiang, Han Shi, Jincheng Yu, Zhengying Liu, Yu Zhang, James T. Kwok, Zhenguo Li, Adrian Weller, Weiyang Liu

Mathify: Evaluating Large Language Models on Mathematical Problem Solving Tasks

The rapid progress in the field of natural language processing (NLP) systems and the expansion of large language models (LLMs) have opened up numerous opportunities in the field of education and instructional methods. These advancements…

Computation and Language · Computer Science 2024-04-23 Avinash Anand, Mohit Gupta, Kritarth Prasad, Navya Singla, Sanjana Sanjeev, Jatin Kumar, Adarsh Raj Shivam, Rajiv Ratn Shah

Skywork: A More Open Bilingual Foundation Model

In this technical report, we present Skywork-13B, a family of large language models (LLMs) trained on a corpus of over 3.2 trillion tokens drawn from both English and Chinese texts. This bilingual foundation model is the most extensively…

Computation and Language · Computer Science 2023-10-31 Tianwen Wei, Liang Zhao, Lichang Zhang, Bo Zhu, Lijie Wang, Haihua Yang, Biye Li, Cheng Cheng, Weiwei Lü, Rui Hu, Chenxia Li, Liu Yang, Xilin Luo, Xuejie Wu, Lunan Liu, Wenjun Cheng, Peng Cheng, Jianhao Zhang, Xiaoyu Zhang, Lei Lin, Xiaokun Wang, Yutuan Ma, Chuanhai Dong, Yanqi Sun, Yifu Chen, Yongyi Peng, Xiaojuan Liang, Shuicheng Yan, Han Fang, Yahui Zhou

KwaiYiiMath: Technical Report

Recent advancements in large language models (LLMs) have demonstrated remarkable abilities in handling a variety of natural language processing (NLP) downstream tasks, even on mathematical tasks requiring multi-step reasoning. In this…

Computation and Language · Computer Science 2023-10-20 Jiayi Fu, Lei Lin, Xiaoyang Gao, Pengli Liu, Zhengzong Chen, Zhirui Yang, Shengnan Zhang, Xue Zheng, Yan Li, Yuliang Liu, Xucheng Ye, Yiqiao Liao, Chao Liao, Bin Chen, Chengru Song, Junchen Wan, Zijia Lin, Fuzheng Zhang, Zhongyuan Wang, Di Zhang, Kun Gai

Logic Contrastive Reasoning with Lightweight Large Language Model for Math Word Problems

This study focuses on improving the performance of lightweight Large Language Models (LLMs) in mathematical reasoning tasks. We introduce a novel method for measuring mathematical logic similarity and design an automatic screening mechanism…

Computation and Language · Computer Science 2024-09-04 Ding Kai, Ma Zhenguo, Yan Xiaoran

TeleMath: A Benchmark for Large Language Models in Telecom Mathematical Problem Solving

The increasing adoption of artificial intelligence in telecommunications has raised interest in the capability of Large Language Models (LLMs) to address domain-specific, mathematically intensive tasks. Although recent advancements have…

Artificial Intelligence · Computer Science 2025-06-13 Vincenzo Colle, Mohamed Sana, Nicola Piovesan, Antonio De Domenico, Fadhel Ayed, Merouane Debbah

Exploring Mathematical Extrapolation of Large Language Models with Synthetic Data

Large Language Models (LLMs) have shown excellent performance in language understanding, text generation, code synthesis, and many other tasks, while they still struggle in complex multi-step reasoning problems, such as mathematical…

Computation and Language · Computer Science 2024-06-05 Haolong Li, Yu Ma, Yinqi Zhang, Chen Ye, Jie Chen

BigTranslate: Augmenting Large Language Models with Multilingual Translation Capability over 100 Languages

Large language models (LLMs) demonstrate promising translation performance among various natural languages. However, many LLMs, especially the open-sourced ones such as BLOOM and LLaMA, are English-dominant and support only dozens of…

Computation and Language · Computer Science 2023-11-22 Wen Yang, Chong Li, Jiajun Zhang, Chengqing Zong

Benchmarking Large Language Models for Math Reasoning Tasks

The use of Large Language Models (LLMs) in mathematical reasoning has become a cornerstone of related research, demonstrating the intelligence of these models and enabling potential practical applications through their advanced performance,…

Computation and Language · Computer Science 2024-12-20 Kathrin Seßler, Yao Rong, Emek Gözlüklü, Enkelejda Kasneci

Assessing the Emergent Symbolic Reasoning Abilities of Llama Large Language Models

Large Language Models (LLMs) achieve impressive performance in a wide range of tasks, even if they are often trained with the only objective of chatting fluently with users. Among other skills, LLMs show emergent abilities in mathematical…

Computation and Language · Computer Science 2024-06-12 Flavio Petruzzellis, Alberto Testolin, Alessandro Sperduti

MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models

The rapid development of large language models (LLMs) has spurred extensive research into their domain-specific capabilities, particularly mathematical reasoning. However, most open-source LLMs focus solely on mathematical reasoning,…

Computation and Language · Computer Science 2024-09-04 Shuai Peng, Di Fu, Liangcai Gao, Xiuqin Zhong, Hongguang Fu, Zhi Tang

PolyLM: An Open Source Polyglot Large Language Model

Large language models (LLMs) demonstrate remarkable ability to comprehend, reason, and generate following natural language instructions. However, the development of LLMs has been primarily focused on high-resource languages, such as English,…

Computation and Language · Computer Science 2023-07-13 Xiangpeng Wei, Haoran Wei, Huan Lin, Tianhao Li, Pei Zhang, Xingzhang Ren, Mei Li, Yu Wan, Zhiwei Cao, Binbin Xie, Tianxiang Hu, Shangjie Li, Binyuan Hui, Bowen Yu, Dayiheng Liu, Baosong Yang, Fei Huang, Jun Xie

RealMath: A Continuous Benchmark for Evaluating Language Models on Research-Level Mathematics

Existing benchmarks for evaluating mathematical reasoning in large language models (LLMs) rely primarily on competition problems, formal proofs, or artificially challenging questions -- failing to capture the nature of mathematics…

Artificial Intelligence · Computer Science 2025-10-21 Jie Zhang, Cezara Petrui, Kristina Nikolić, Florian Tramèr

FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models

To thoroughly assess the mathematical reasoning abilities of Large Language Models (LLMs), we need to carefully curate evaluation datasets covering diverse mathematical concepts and mathematical problems at different difficulty levels. In…

Computation and Language · Computer Science 2024-09-09 Yan Liu, Renren Jin, Ling Shi, Zheng Yao, Deyi Xiong

STBench: Assessing the Ability of Large Language Models in Spatio-Temporal Analysis

The rapid evolution of large language models (LLMs) holds promise for reforming the methodology of spatio-temporal data mining. However, current works for evaluating the spatio-temporal understanding capability of LLMs are somewhat limited…

Computation and Language · Computer Science 2024-06-28 Wenbin Li, Di Yao, Ruibo Zhao, Wenjie Chen, Zijie Xu, Chengxue Luo, Chang Gong, Quanliang Jing, Haining Tan, Jingping Bi

Large Language Models for Mathematicians

Large language models (LLMs) such as ChatGPT have received immense interest for their general-purpose language understanding and, in particular, their ability to generate high-quality text or computer code. For many professions, LLMs…

Computation and Language · Computer Science 2024-04-03 Simon Frieder, Julius Berner, Philipp Petersen, Thomas Lukasiewicz

Specializing Smaller Language Models towards Multi-Step Reasoning

The surprising ability of Large Language Models (LLMs) to perform well on complex reasoning with only few-shot chain-of-thought prompts is believed to emerge only in very large-scale models (100+ billion parameters). We show that such…

Computation and Language · Computer Science 2023-01-31 Yao Fu, Hao Peng, Litu Ou, Ashish Sabharwal, Tushar Khot

ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models

This paper introduces ConceptMath, a bilingual (English and Chinese), fine-grained benchmark that evaluates concept-wise mathematical reasoning of Large Language Models (LLMs). Unlike traditional benchmarks that evaluate general…

Computation and Language · Computer Science 2024-02-26 Yanan Wu, Jie Liu, Xingyuan Bu, Jiaheng Liu, Zhanhui Zhou, Yuanxing Zhang, Chenchen Zhang, Zhiqi Bai, Haibin Chen, Tiezheng Ge, Wanli Ouyang, Wenbo Su, Bo Zheng

Large Language Models Don't Make Sense of Word Problems. A Scoping Review from a Mathematics Education Perspective

The progress of Large Language Models (LLMs) like ChatGPT raises the question of how they can be integrated into education. One hope is that they can support mathematics learning, including word-problem solving. Since LLMs can handle…

Computation and Language · Computer Science 2025-08-12 Anselm R. Strohmaier, Wim Van Dooren, Kathrin Seßler, Brian Greer, Lieven Verschaffel