
Related papers: Exposing Weaknesses of Large Reasoning Models thro…

200 papers

Large Language Models (LLMs) for Graph Reasoning have been extensively studied over the past two years, involving enabling LLMs to understand graph structures and reason on graphs to solve various graph problems, with graph algorithm…

Artificial Intelligence · Computer Science 2025-10-03 Yuwei Hu , Xinyi Huang , Zhewei Wei , Yongchao Liu , Chuntao Hong

Reasoning ability has become a central focus in the advancement of Large Reasoning Models (LRMs). Although notable progress has been achieved on several reasoning benchmarks such as MATH500 and LiveCodeBench, existing benchmarks for…

Artificial Intelligence · Computer Science 2026-01-12 Henan Sun , Kaichi Yu , Yuyao Wang , Bowen Liu , Xunkai Li , Rong-Hua Li , Nuo Chen , Jia Li

Large language models (LLMs) have shown significant progress in reasoning tasks. However, recent studies show that transformers and LLMs fail catastrophically once reasoning problems exceed modest complexity. We revisit these findings…

Artificial Intelligence · Computer Science 2025-10-28 Revanth Rameshkumar , Jimson Huang , Yunxin Sun , Fei Xia , Abulhair Saparov

Recent generations of language models have introduced Large Reasoning Models (LRMs) that generate detailed thinking processes before providing answers. While these models demonstrate improved performance on reasoning benchmarks, their…

Artificial Intelligence · Computer Science 2025-11-21 Parshin Shojaee , Iman Mirzadeh , Keivan Alizadeh , Maxwell Horton , Samy Bengio , Mehrdad Farajtabar

Recent progress in Large Reasoning Models (LRMs) has significantly enhanced the reasoning abilities of Large Language Models (LLMs), empowering them to tackle increasingly complex tasks through reflection capabilities, such as making…

Computation and Language · Computer Science 2025-06-26 Jianghao Chen , Zhenlin Wei , Zhenjiang Ren , Ziyong Li , Jiajun Zhang

Graph Retrieval Augmented Generation (GraphRAG) has garnered increasing recognition for its potential to enhance large language models (LLMs) by structurally organizing domain-specific corpora and facilitating complex reasoning. However,…

Computation and Language · Computer Science 2025-06-23 Yilin Xiao , Junnan Dong , Chuang Zhou , Su Dong , Qian-wen Zhang , Di Yin , Xing Sun , Xiao Huang

Geometric spatial reasoning forms the foundation of many applications in artificial intelligence, yet the ability of large language models (LLMs) to operate over geometric spatial information expressed in procedural code remains…

Artificial Intelligence · Computer Science 2026-02-11 Shixian Luo , Zezhou Zhu , Yu Yuan , Yuncheng Yang , Lianlei Shan , Yong Wu

Large language models (LLMs) are deployed on increasingly complex tasks that require multi-step decision-making. Understanding their algorithmic reasoning abilities is therefore crucial. However, we lack a diagnostic benchmark for…

Machine Learning · Computer Science 2026-02-12 Yu He , Yingxi Li , Colin White , Ellen Vitercik

With the increasing use of large language models (LLMs), ensuring reliable performance in diverse, real-world environments is essential. Despite their remarkable achievements, LLMs often struggle with adversarial inputs, significantly…

Computation and Language · Computer Science 2024-06-18 Yuqing Wang , Yun Zhao

Despite the remarkable advancements and widespread applications of deep neural networks, their ability to perform reasoning tasks remains limited, particularly in domains requiring structured, abstract thought. In this paper, we investigate…

Computation and Language · Computer Science 2025-09-16 Satyam Goyal , Soham Dan

Evaluating the graph comprehension and reasoning abilities of Large Language Models (LLMs) is challenging and often incomplete. Existing benchmarks focus primarily on pure graph understanding, lacking a comprehensive evaluation across all…

Artificial Intelligence · Computer Science 2025-02-27 Zike Yuan , Ming Liu , Hui Wang , Bing Qin

Existing long-context benchmarks for Large Language Models (LLMs) focus on evaluating comprehension of long inputs, while overlooking the evaluation of long reasoning abilities. To address this gap, we introduce LongReasonArena, a benchmark…

Computation and Language · Computer Science 2025-08-28 Jiayu Ding , Shuming Ma , Lei Cui , Nanning Zheng , Furu Wei

Developments in Graph-Language Models (GLMs) aim to integrate the structural reasoning capabilities of Graph Neural Networks (GNNs) with the semantic understanding of Large Language Models (LLMs). However, we demonstrate that current…

Computation and Language · Computer Science 2025-08-29 Soham Petkar , Hari Aakash K , Anirudh Vempati , Akshit Sinha , Ponnurangam Kumaraguru , Chirag Agarwal

Large Language Models (LLMs) are increasingly described as possessing strong reasoning capabilities, supported by high performance on mathematical, logical, and planning benchmarks. However, most existing evaluations rely on aggregate…

Computation and Language · Computer Science 2026-04-16 Md. Fahad Ullah Utsho , Mohd. Ruhul Ameen , Akif Islam , Md. Golam Rashed , Dipankar Das

Large Language Models (LLMs) have showcased impressive reasoning capabilities, particularly when guided by specifically designed prompts in complex reasoning tasks such as math word problems. These models typically solve tasks using a…

Artificial Intelligence · Computer Science 2024-04-23 Lang Cao

The advancement of Large Language Models (LLMs) has remarkably pushed the boundaries towards artificial general intelligence (AGI), with their exceptional ability on understanding diverse types of information, including but not limited to…

Computation and Language · Computer Science 2023-10-10 Ziwei Chai , Tianjie Zhang , Liang Wu , Kaiqiao Han , Xiaohai Hu , Xuanwen Huang , Yang Yang

Large reasoning models (LRMs) have achieved impressive performance in complex tasks, often outperforming conventional large language models (LLMs). However, the prevalent issue of overthinking severely limits their computational efficiency.…

Computation and Language · Computer Science 2025-05-29 Zhiyuan Li , Yi Chang , Yuan Wu

Large Language Models (LLMs) have demonstrated impressive capabilities in natural language processing tasks, such as text generation and semantic understanding. However, their performance on numerical reasoning tasks, such as basic…

Computation and Language · Computer Science 2025-06-04 Haoyang Li , Xuejia Chen , Zhanchao Xu , Darian Li , Nicole Hu , Fei Teng , Yiming Li , Luyu Qiu , Chen Jason Zhang , Qing Li , Lei Chen

Large language models (LLMs) have demonstrated remarkable progress in understanding long-context inputs. However, benchmarks for evaluating the long-context reasoning abilities of LLMs fall behind the pace. Existing benchmarks often focus…

Computation and Language · Computer Science 2025-11-19 Zhan Ling , Kang Liu , Kai Yan , Yifan Yang , Weijian Lin , Ting-Han Fan , Lingfeng Shen , Zhengyin Du , Jiecao Chen

Currently, process reward models (PRMs) have exhibited remarkable potential for test-time scaling. Since large language models (LLMs) regularly generate flawed intermediate reasoning steps when tackling a broad spectrum of reasoning and…

Artificial Intelligence · Computer Science 2026-05-08 Zhouhao Sun , Xuan Zhang , Xiao Ding , Bibo Cai , Li Du , Kai Xiong , Xinran Dai , Fei Zhang , Weidi Tang , Zhiyuan Kan , Yang Zhao , Bing Qin , Ting Liu