
Related papers: LogicPro: Improving Complex Logical Reasoning via …

200 papers

Data synthesis for training large reasoning models offers a scalable alternative to limited, human-curated datasets, enabling the creation of high-quality data. However, existing approaches face several challenges: (i) indiscriminate…

Artificial Intelligence · Computer Science 2026-05-11 Yongxian Wei, Yilin Zhao, Zixuan Hu, Li Shen, Xinrui Chen, Runxi Cheng, Sinan Du, Hao Yu, Chun Yuan, Dian Li

Large language models (LLMs) have shown impressive promise in code generation, yet their progress remains limited by the shortage of large-scale datasets that are both diverse and well-aligned with human reasoning. Most existing resources…

Machine Learning · Computer Science 2025-10-28 Amal Abed, Ivan Lukic, Jörg K. H. Franke, Frank Hutter

Large language models have achieved substantial progress in mathematical reasoning, yet their advancement is limited by the scarcity of high-quality, high-difficulty training data. Existing synthesis methods largely rely on transforming…

Computation and Language · Computer Science 2026-03-10 Shaoxiong Zhan, Yanlin Lai, Ziyu Lu, Dahua Lin, Ziqing Yang, Fei Tan

Mathematical reasoning remains challenging for LLMs due to complex logic and the need for precise computation. Existing methods enhance LLM reasoning by synthesizing datasets through problem rephrasing, but face issues with generation…

Computation and Language · Computer Science 2025-06-12 Lei Xu, Sirui Chen, Yuxuan Huang, Chaochao Lu

Training models on synthetic data has emerged as an increasingly important strategy for improving the performance of generative AI. This approach is particularly helpful for large multimodal models (LMMs) due to the relative scarcity of…

Artificial Intelligence · Computer Science 2026-01-13 Gabriela Ben Melech Stan, Estelle Aflalo, Avinash Madasu, Vasudev Lal, Phillip Howard

Joint logical-numerical reasoning remains a major challenge for language models, yet existing datasets rely on fixed rule sets and offer limited control over task complexity, constraining their generalizability for evaluation and training.…

Computation and Language · Computer Science 2025-10-14 Yiwei Liu, Yucheng Li, Xiao Li, Gong Cheng

The ability of large language models to solve complex mathematical problems has progressed significantly, particularly for tasks requiring advanced reasoning. However, the scarcity of sufficiently challenging problems, particularly at the…

Computation and Language · Computer Science 2025-12-23 Xueliang Zhao, Wei Wu, Jian Guan, Lingpeng Kong

Enhancing the mathematical reasoning of large language models (LLMs) demands high-quality training data, yet conventional methods face critical challenges in scalability, cost, and data reliability. To address these limitations, we propose…

Computation and Language · Computer Science 2025-08-27 Sirui Chen, Changxin Tian, Binbin Hu, Kunlong Chen, Ziqi Liu, Zhiqiang Zhang, Jun Zhou

In mathematical reasoning tasks, the advancement of Large Language Models (LLMs) relies heavily on high-quality training data with clearly defined and well-graded difficulty levels. However, existing data synthesis methods often suffer from…

Machine Learning · Computer Science 2026-01-27 Xuchen Li, Jing Chen, Xuzhao Li, Hao Liang, Xiaohuan Zhou, Taifeng Wang, Wentao Zhang

Advancing complex reasoning in large language models relies on high-quality, verifiable datasets, yet human annotation remains cost-prohibitive and difficult to scale. Current synthesis paradigms often face a recurring trade-off:…

Artificial Intelligence · Computer Science 2026-02-04 Zhengbo Jiao, Shaobo Wang, Zifan Zhang, Xuan Ren, Wei Wang, Bing Zhao, Hu Wei, Linfeng Zhang

Large language models (LLMs) make remarkable progress in reasoning tasks. Among different reasoning modes, inductive reasoning, due to its better alignment with human learning, attracts increasing interest. However, research on inductive…

Computation and Language · Computer Science 2025-10-17 Kedi Chen, Zhikai Lei, Xu Guo, Xuecheng Wu, Siyuan Zeng, Jianghao Yin, Yinqi Zhang, Qin Chen, Jie Zhou, Liang He, Qipeng Guo, Kai Chen, Wei Zhang

Improving the mathematical reasoning capabilities of Large Language Models (LLMs) is critical for advancing artificial intelligence. However, access to extensive, diverse, and high-quality reasoning datasets remains a significant challenge,…

Computation and Language · Computer Science 2025-05-28 Yuyang Ding, Xinyu Shi, Xiaobo Liang, Juntao Li, Zhaopeng Tu, Qiaoming Zhu, Min Zhang

Large language models make remarkable progress in reasoning capabilities. Existing works focus mainly on deductive reasoning tasks (e.g., code and math), while another type of reasoning mode that better aligns with human learning, inductive…

Computation and Language · Computer Science 2025-03-18 Kedi Chen, Zhikai Lei, Fan Zhang, Yinqi Zhang, Qin Chen, Jie Zhou, Liang He, Qipeng Guo, Kai Chen, Wei Zhang

Programming-by-example (PBE) is a synthesis paradigm that allows users to generate functions by simply providing input-output examples. While a promising interaction paradigm, synthesis is still too slow for realtime interaction and more…

Machine Learning · Computer Science 2020-02-10 Kairo Morton, William Hallahan, Elven Shum, Ruzica Piskac, Mark Santolucito

Although LLMs have made substantial progress in reasoning, systematically producing frontier-level reasoning data remains difficult. Existing synthesis methods often have limited visibility into the structural factors that govern problem…

Synthetic data is a standard component in training large language models, yet systematic comparisons across design dimensions, including rephrasing strategy, generator model, and source data, remain absent. We conduct extensive controlled…

Large language models (LLMs) often struggle with complex logical reasoning due to logical inconsistencies and the inherent difficulty of such reasoning. We use Lean, a theorem proving framework, to address these challenges. By formalizing…

Computation and Language · Computer Science 2024-03-21 Dongwei Jiang, Marcio Fonseca, Shay B. Cohen

Reinforcement Learning (RL) has been shown to significantly boost reasoning capabilities of large language models (LLMs) in math, coding, and multi-hop reasoning tasks. However, RL fine-tuning requires abundant high-quality verifiable data,…

Logical reasoning remains a challenge for natural language processing, but it can be improved by training language models to mimic theorem provers on procedurally generated problems. Previous work used domain-specific proof generation…

Computation and Language · Computer Science 2024-06-18 Damien Sileo

Proof assistants like Lean have revolutionized mathematical proof verification, ensuring high accuracy and reliability. Although large language models (LLMs) show promise in mathematical reasoning, their advancement in formal theorem…

Artificial Intelligence · Computer Science 2024-05-24 Huajian Xin, Daya Guo, Zhihong Shao, Zhizhou Ren, Qihao Zhu, Bo Liu, Chong Ruan, Wenda Li, Xiaodan Liang