Related papers: Solving Quantitative Reasoning Problems with Langu…

200 papers

Despite the increasing effectiveness of language models, their reasoning capabilities remain underdeveloped. In particular, causal reasoning through counterfactual question answering is lacking. This work aims to bridge this gap. We first…

Computation and Language · Computer Science 2025-03-18 Alihan Hüyük, Xinnuo Xu, Jacqueline Maasch, Aditya V. Nori, Javier González

Large language models (LLMs) have demonstrated impressive reasoning capabilities, particularly in textual mathematical problem-solving. However, existing open-source image instruction fine-tuning datasets, containing limited question-answer…

Computation and Language · Computer Science 2024-10-10 Wenhao Shi, Zhiqiang Hu, Yi Bin, Junhua Liu, Yang Yang, See-Kiong Ng, Lidong Bing, Roy Ka-Wei Lee

Large language models (LLMs) with billions of parameters exhibit in-context learning abilities, enabling few-shot learning on tasks that the model was not specifically trained for. Traditional models achieve breakthrough performance on…

Artificial Intelligence · Computer Science 2025-11-04 Aske Plaat, Annie Wong, Suzan Verberne, Joost Broekens, Niki van Stein, Thomas Back

Multimodal LLMs are turning their focus to video benchmarks; however, most video benchmarks only provide outcome supervision, with no intermediate or interpretable reasoning steps. This makes it challenging to assess if models are truly able…

Large Language Models (LLMs) have shown remarkable performance in various natural language processing tasks but face challenges in mathematical reasoning, where complex problem-solving requires both linguistic understanding and mathematical…

Computation and Language · Computer Science 2025-03-20 Shuguang Chen, Guang Lin

Quantitative reasoning is a critical skill to analyze data, yet the assessment of such ability remains limited. To address this gap, we introduce the Quantitative Reasoning with Data (QRData) benchmark, aiming to evaluate Large Language…

Computation and Language · Computer Science 2024-06-11 Xiao Liu, Zirui Wu, Xueqing Wu, Pan Lu, Kai-Wei Chang, Yansong Feng

Mathematical reasoning serves as a cornerstone for assessing the fundamental cognitive capabilities of human intelligence. In recent times, there has been a notable surge in the development of Large Language Models (LLMs) geared towards the…

Computation and Language · Computer Science 2024-09-18 Janice Ahn, Rishu Verma, Renze Lou, Di Liu, Rui Zhang, Wenpeng Yin

We evaluate the reasoning abilities of large language models in multilingual settings. We introduce the Multilingual Grade School Math (MGSM) benchmark, by manually translating 250 grade-school math problems from the GSM8K dataset (Cobbe et…

Recent advances in reasoning with large language models (LLMs) have demonstrated strong performance on complex mathematical tasks, including combinatorial optimization. Techniques such as Chain-of-Thought and In-Context Learning have…

Artificial Intelligence · Computer Science 2025-09-17 Marylou Fauchard, Florian Carichon, Margarida Carvalho, Golnoosh Farnadi

Large language models (LLMs) have pushed the limits of natural language understanding and exhibited excellent problem-solving ability. Despite the great success, most existing open-source LLMs (e.g., LLaMA-2) are still far away from…

Computation and Language · Computer Science 2024-05-06 Longhui Yu, Weisen Jiang, Han Shi, Jincheng Yu, Zhengying Liu, Yu Zhang, James T. Kwok, Zhenguo Li, Adrian Weller, Weiyang Liu

This study examines how Large Language Models (LLMs) perform when tackling quantitative management decision problems in a zero-shot setting. Drawing on 900 responses generated by five leading models across 20 diverse managerial scenarios,…

Computation and Language · Computer Science 2025-02-25 Jonathan Kuzmanko

Recent advances in large language models (LLMs) and multimodal LLMs (MLLMs) have led to strong reasoning ability across a wide range of tasks. However, their ability to perform mathematical reasoning from spoken input remains underexplored.…

Computation and Language · Computer Science 2025-05-22 Chengwei Wei, Bin Wang, Jung-jae Kim, Nancy F. Chen

Reasoning is a fundamental aspect of human intelligence that plays a crucial role in activities such as problem solving, decision making, and critical thinking. In recent years, large language models (LLMs) have made significant progress in…

Computation and Language · Computer Science 2023-05-29 Jie Huang, Kevin Chen-Chuan Chang

Large Language Models (LLMs) and Large Multimodal Models (LMMs) exhibit impressive problem-solving skills in many tasks and domains, but their ability in mathematical reasoning in visual contexts has not been systematically studied. To…

Computer Vision and Pattern Recognition · Computer Science 2024-01-23 Pan Lu, Hritik Bansal, Tony Xia, Jiacheng Liu, Chunyuan Li, Hannaneh Hajishirzi, Hao Cheng, Kai-Wei Chang, Michel Galley, Jianfeng Gao

Cognitive science faces ongoing challenges in research integration, formalization, conceptual clarity, and other areas, in part due to its multifaceted and interdisciplinary nature. Recent advances in artificial intelligence, particularly…

Artificial Intelligence · Computer Science 2026-03-03 Dirk U. Wulff, Rui Mata

Large-scale pretrained language models are the major driving force behind recent improvements in performance on the Winograd Schema Challenge, a widely employed test of common sense reasoning ability. We show, however, with a new diagnostic…

Computation and Language · Computer Science 2020-05-08 Mostafa Abdou, Vinit Ravishankar, Maria Barrett, Yonatan Belinkov, Desmond Elliott, Anders Søgaard

Large Language Models (LLMs) achieve impressive performance in a wide range of tasks, even if they are often trained with the only objective of chatting fluently with users. Among other skills, LLMs show emergent abilities in mathematical…

Computation and Language · Computer Science 2024-06-12 Flavio Petruzzellis, Alberto Testolin, Alessandro Sperduti

Recent advances in language models have demonstrated their capability to solve mathematical reasoning problems, achieving near-perfect accuracy on grade-school level math benchmarks like GSM8K. In this paper, we formally study how language…

Artificial Intelligence · Computer Science 2024-07-31 Tian Ye, Zicheng Xu, Yuanzhi Li, Zeyuan Allen-Zhu

Multimodal Large Language Models (MLLMs) show promising results for embodied agents in operating meaningfully in complex, human-centered environments. Yet, evaluating their capacity for nuanced, human-like reasoning and decision-making…

Computation and Language · Computer Science 2025-09-30 Zhe Hu, Yixiao Ren, Guanzhong Liu, Jing Li, Yu Yin

Recent advancements in video models have shown tremendous progress, particularly in long video understanding. However, current benchmarks predominantly feature western-centric data and English as the dominant language, introducing…

Computer Vision and Pattern Recognition · Computer Science 2026-04-08 Darshan Singh, Arsha Nagrani, Kawshik Manikantan, Harman Singh, Dinesh Tewari, Tobias Weyand, Cordelia Schmid, Anelia Angelova, Shachi Dave