Related papers: Large Language Models are Algorithmically Blind

LLMs for Relational Reasoning: How Far are We?

Large language models (LLMs) have revolutionized many areas (e.g. natural language processing, software engineering, etc.) by achieving state-of-the-art performance on extensive downstream tasks. Aiming to achieve robust and general…

Artificial Intelligence · Computer Science 2024-01-18 Zhiming Li, Yushi Cao, Xiufeng Xu, Junzhe Jiang, Xu Liu, Yon Shin Teo, Shang-wei Lin, Yang Liu

Large Language Models and Mathematical Reasoning Failures

This paper investigates the mathematical reasoning capabilities of large language models (LLMs) using 50 newly constructed high-school-level word problems. Unlike prior studies that focus solely on answer correctness, we rigorously analyze…

Artificial Intelligence · Computer Science 2025-02-24 Johan Boye, Birger Moell

Are Large Language Models Really Good Logical Reasoners? A Comprehensive Evaluation and Beyond

Logical reasoning consistently plays a fundamental and significant role in the domains of knowledge engineering and artificial intelligence. Recently, Large Language Models (LLMs) have emerged as a noteworthy innovation in natural language…

Computation and Language · Computer Science 2024-09-17 Fangzhi Xu, Qika Lin, Jiawei Han, Tianzhe Zhao, Jun Liu, Erik Cambria

What's in an embedding? Would a rose by any embedding smell as sweet?

Large Language Models (LLMs) are often criticized for lacking true "understanding" and the ability to "reason" with their knowledge, being seen merely as autocomplete systems. We believe that this assessment might be missing a nuanced…

Artificial Intelligence · Computer Science 2024-06-18 Venkat Venkatasubramanian

Large Language Models Are Not Strong Abstract Reasoners

Large Language Models have shown tremendous performance on a large variety of natural language processing tasks, ranging from text comprehension to common sense reasoning. However, the mechanisms responsible for this success remain opaque,…

Computation and Language · Computer Science 2024-01-04 Gaël Gendron, Qiming Bao, Michael Witbrock, Gillian Dobbie

LLMs' Understanding of Natural Language Revealed

Large language models (LLMs) are the result of a massive experiment in bottom-up, data-driven reverse engineering of language at scale. Despite their utility in a number of downstream NLP tasks, ample research has shown that LLMs are…

Artificial Intelligence · Computer Science 2024-08-05 Walid S. Saba

Exploring the Potential for Large Language Models to Demonstrate Rational Probabilistic Beliefs

Advances in the general capabilities of large language models (LLMs) have led to their use for information retrieval, and as components in automated decision systems. A faithful representation of probabilistic reasoning in these models may…

Artificial Intelligence · Computer Science 2025-04-21 Gabriel Freedman, Francesca Toni

On the Fundamental Limits of LLMs at Scale

Large Language Models (LLMs) have benefited enormously from scaling, yet these gains are bounded by five fundamental limitations: (1) hallucination, (2) context compression, (3) reasoning degradation, (4) retrieval fragility, and (5)…

Machine Learning · Computer Science 2026-01-27 Muhammad Ahmed Mohsin, Muhammad Umer, Ahsan Bilal, Zeeshan Memon, Muhammad Ibtsaam Qadir, Sagnik Bhattacharya, Hassan Rizwan, Abhiram R. Gorle, Maahe Zehra Kazmi, Nukhba Amir, Ali Subhan, Muhammad Usman Rafique, Zihao He, Pulkit Mehta, Muhammad Ali Jamshed, John M. Cioffi

What Large Language Models Know and What People Think They Know

As artificial intelligence (AI) systems, particularly large language models (LLMs), become increasingly integrated into decision-making processes, the ability to trust their outputs is crucial. To earn human trust, LLMs must be well…

Machine Learning · Computer Science 2025-02-14 Mark Steyvers, Heliodoro Tejeda, Aakriti Kumar, Catarina Belem, Sheer Karny, Xinyue Hu, Lukas Mayer, Padhraic Smyth

Large Language Model Reasoning Failures

Large Language Models (LLMs) have exhibited remarkable reasoning capabilities, achieving impressive results across a wide range of tasks. Despite these advances, significant reasoning failures persist, occurring even in seemingly simple…

Artificial Intelligence · Computer Science 2026-02-09 Peiyang Song, Pengrui Han, Noah Goodman

Comprehension Without Competence: Architectural Limits of LLMs in Symbolic Computation and Reasoning

Large Language Models (LLMs) display striking surface fluency yet systematically fail at tasks requiring symbolic reasoning, arithmetic accuracy, and logical consistency. This paper offers a structural diagnosis of such failures, revealing…

Artificial Intelligence · Computer Science 2025-11-17 Zheng Zhang

Evaluating LLMs on Real-World Forecasting Against Expert Forecasters

Large language models (LLMs) have demonstrated remarkable capabilities across diverse tasks, but their ability to forecast future events remains understudied. A year ago, large language models struggled to come close to the accuracy of a…

Machine Learning · Computer Science 2025-08-06 Janna Lu

Do Large Language Models Know What They Are Capable Of?

We investigate whether large language models (LLMs) can predict whether they will succeed on a given task and whether their predictions improve as they progress through multi-step tasks. We also investigate whether LLMs can learn from…

Computation and Language · Computer Science 2026-01-01 Casey O. Barkan, Sid Black, Oliver Sourbut

Procedural Memory Is Not All You Need: Bridging Cognitive Gaps in LLM-Based Agents

Large Language Models (LLMs) represent a landmark achievement in Artificial Intelligence (AI), demonstrating unprecedented proficiency in procedural tasks such as text generation, code completion, and conversational coherence. These…

Artificial Intelligence · Computer Science 2025-05-07 Schaun Wheeler, Olivier Jeunen

Reasoning Capabilities and Invariability of Large Language Models

Large Language Models (LLMs) have shown remarkable capabilities in manipulating natural language across multiple applications, but their ability to handle simple reasoning tasks is often questioned. In this work, we aim to provide a…

Computation and Language · Computer Science 2025-05-05 Alessandro Raganato, Rafael Peñaloza, Marco Viviani, Gabriella Pasi

CBEval: A framework for evaluating and interpreting cognitive biases in LLMs

Rapid advancements in Large Language Models (LLMs) have significantly enhanced their reasoning capabilities. Despite improved performance on benchmarks, LLMs exhibit notable gaps in their cognitive processes. Additionally, as reflections of…

Computation and Language · Computer Science 2024-12-06 Ammar Shaikh, Raj Abhijit Dandekar, Sreedath Panat, Rajat Dandekar

On the Limits of Innate Planning in Large Language Models

Large language models (LLMs) achieve impressive results on many benchmarks, yet their capacity for planning and stateful reasoning remains unclear. We study these abilities directly, without code execution or other tools, using the…

Artificial Intelligence · Computer Science 2025-11-27 Charles Schepanowski, Charles Ling

Do Large Language Models Know How Much They Know?

Large Language Models (LLMs) have emerged as highly capable systems and are increasingly being integrated into various uses. However, the rapid pace of their deployment has outpaced a comprehensive understanding of their internal mechanisms…

Computation and Language · Computer Science 2025-10-27 Gabriele Prato, Jerry Huang, Prasanna Parthasarathi, Shagun Sodhani, Sarath Chandar

A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners

This study introduces a hypothesis-testing framework to assess whether large language models (LLMs) possess genuine reasoning abilities or primarily depend on token bias. We go beyond evaluating LLMs on accuracy; rather, we aim to…

Computation and Language · Computer Science 2024-10-07 Bowen Jiang, Yangxinyu Xie, Zhuoqun Hao, Xiaomeng Wang, Tanwi Mallick, Weijie J. Su, Camillo J. Taylor, Dan Roth

LLMs for Mathematical Modeling: Towards Bridging the Gap between Natural and Mathematical Languages

Large Language Models (LLMs) have demonstrated strong performance across various natural language processing tasks, yet their proficiency in mathematical reasoning remains a key challenge. Addressing the gap between natural and mathematical…

Artificial Intelligence · Computer Science 2025-02-18 Xuhan Huang, Qingning Shen, Yan Hu, Anningzhe Gao, Benyou Wang