
Related papers: GeoMathCode: Understanding Interleaved Math-Code R…

200 papers

Multimodal geometry reasoning requires models to jointly understand visual diagrams and perform structured symbolic inference, yet current vision-language models struggle with complex geometric constructions due to limited training data…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-24 · Haobo Lin, Tianyi Bai, Chen Chen, Jiajun Zhang, Bohan Zeng, Wentao Zhang, Binhang Yuan

Large language models have shown impressive results for multi-hop mathematical reasoning when the input question is only textual. Many mathematical reasoning problems, however, contain both text and image. With the ever-increasing adoption…

Computer Vision and Pattern Recognition · Computer Science · 2023-12-20 · Mehran Kazemi, Hamidreza Alvari, Ankit Anand, Jialin Wu, Xi Chen, Radu Soricut

Geometric Problem Solving (GPS) poses a unique challenge for Multimodal Large Language Models (MLLMs), requiring not only the joint interpretation of text and diagrams but also iterative visuospatial reasoning. While existing approaches…

Artificial Intelligence · Computer Science · 2026-03-26 · Shichao Weng, Zhiqiang Wang, Yuhua Zhou, Rui Lu, Ting Liu, Zhiyang Teng, Xiaozhang Liu, Hanmeng Liu

Large language models (LLMs) have demonstrated strong reasoning capabilities in text-based mathematical problem solving; however, when adapted to visual reasoning tasks, particularly geometric problem solving, their performance…

Artificial Intelligence · Computer Science · 2025-10-28 · Nannan Shi, Chuanyu Qin, Shipeng Song, Man Luo

Geometry problem-solving demands advanced reasoning abilities to process multimodal inputs and employ mathematical knowledge effectively. Vision-language models (VLMs) have made significant progress in various multimodal tasks. Yet, they…

Computation and Language · Computer Science · 2024-10-18 · Aditya Sharma, Aman Dalmia, Mehran Kazemi, Amal Zouaq, Christopher J. Pal

Evaluating the symbolic reasoning of large language models (LLMs) calls for geometry benchmarks that require multi-step proofs grounded in both text and diagrams. However, existing benchmarks are often limited in scale and rarely provide…

Computation and Language · Computer Science · 2026-03-23 · Yushun Zhang, Weiping Fu, Zesheng Yang, Bo Zhao, Lingling Zhang, Jian Zhang, Yumeng Fu, Jiaxing Huang, Jun Liu

Auxiliary lines are essential for solving complex geometric problems but remain challenging for large vision-language models (LVLMs). Recent attempts construct auxiliary lines via code-driven rendering, a strategy that relies on accurate…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-27 · Shasha Guo, Liang Pang, Xi Wang, Yanling Wang, Huawei Shen, Jing Zhang

While Multimodal Large Language Models (MLLMs) demonstrate proficiency in 2D scenes, extending their perceptual intelligence to 3D point cloud understanding remains a significant challenge. Current approaches focus primarily on aligning 3D…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-02 · Dongxu Zhang, Yiding Sun, Pengcheng Li, Yumou Liu, Hongqiang Lin, Haoran Xu, Xiaoxuan Mu, Liang Lin, Wenbiao Yan, Ning Yang, Chaowei Fang, Juanjuan Zhao, Jihua Zhu, Conghui He, Cheng Tan

Recent progress in Multi-modal Large Language Models (MLLMs) has enabled step-by-step multi-modal mathematical reasoning by performing visual operations based on the textual instructions. A promising approach uses code as an intermediate…

Computation and Language · Computer Science · 2025-11-06 · Xiaoyuan Li, Moxin Li, Wenjie Wang, Rui Men, Yichang Zhang, Fuli Feng, Dayiheng Liu

Large Language Models (LLMs) demonstrate ever-increasing abilities in mathematical and algorithmic tasks, yet their geometric reasoning skills are underexplored. We investigate LLMs' abilities in constructive geometric problem-solving one…

Computation and Language · Computer Science · 2024-09-23 · Spyridon Mouselinos, Henryk Michalewski, Mateusz Malinowski

Multimodal Large Language Models (MLLMs) have achieved remarkable progress but continue to struggle with geometric reasoning, primarily due to the perception bottleneck regarding fine-grained visual elements. While formal languages have…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-17 · Peijie Wang, Ming-Liang Zhang, Jun Cao, Chao Deng, Dekang Ran, Hongda Sun, Pi Bu, Xuan Zhang, Yingyao Wang, Jun Song, Bo Zheng, Fei Yin, Cheng-Lin Liu

Multimodal Small-to-Medium sized Language Models (MSLMs) have demonstrated strong capabilities in integrating visual and textual information but still face significant limitations in visual comprehension and mathematical reasoning,…

Machine Learning · Computer Science · 2026-01-27 · Ashutosh Bajpai, Akshat Bhandari, Akshay Nambi, Tanmoy Chakraborty

Geometric spatial reasoning forms the foundation of many applications in artificial intelligence, yet the ability of large language models (LLMs) to operate over geometric spatial information expressed in procedural code remains…

Artificial Intelligence · Computer Science · 2026-02-11 · Shixian Luo, Zezhou Zhu, Yu Yuan, Yuncheng Yang, Lianlei Shan, Yong Wu

Large Language Models (LLMs) have demonstrated impressive capabilities in structured reasoning and symbolic tasks, with coding emerging as a particularly successful application. This progress has naturally motivated efforts to extend these…

Artificial Intelligence · Computer Science · 2026-02-02 · Andrea Asperti, Alberto Naibo, Claudio Sacerdoti Coen

Recent advances in Multimodal Large Language Models (MLLMs) have achieved remarkable progress in general domains and demonstrated promise in multimodal mathematical reasoning. However, applying MLLMs to geometry problem solving (GPS)…

Computation and Language · Computer Science · 2025-04-18 · Yicheng Pan, Zhenrong Zhang, Pengfei Hu, Jiefeng Ma, Jun Du, Jianshu Zhang, Quan Liu, Jianqing Gao, Feng Ma

Geometry mathematics problems pose significant challenges for large language models (LLMs) because they involve visual elements and spatial reasoning. Current methods primarily rely on symbolic character awareness to address these problems.…

Computer Vision and Pattern Recognition · Computer Science · 2024-12-17 · Shihao Xu, Yiyang Luo, Wei Shi

Large language models (LLMs) often benefit from intermediate steps of reasoning to generate answers to complex problems. When these intermediate steps of reasoning are used to monitor the activity of the model, it is essential that this…

Machine Learning · Computer Science · 2023-11-02 · Fabien Roger, Ryan Greenblatt

Mathematical reasoning is regarded as a necessary ability for Language Models (LMs). Recent works demonstrate large LMs' impressive performance in solving math problems. The success is attributed to their Chain-of-Thought (CoT) reasoning…

Computation and Language · Computer Science · 2023-06-08 · Tianduo Wang, Wei Lu

Solving complex geometric problems inherently requires interleaved reasoning: a tight alternation between constructing diagrams and performing logical deductions. Although recent Multimodal Large Language Models (MLLMs) have demonstrated…

Computation and Language · Computer Science · 2026-04-29 · Xiangxiang Zhang, Caijun Jia, Siyuan Li, Dingyu He, Xiya Xiong, Zheng Sun, Honghao He, Yuchen Wu, Bihui Yu, Linzhuang Sun, Cheng Tan, Jingxuan Wei

Geometric problem solving constitutes a critical branch of mathematical reasoning, requiring precise analysis of shapes and spatial relationships. Current evaluations of geometric reasoning in vision-language models (VLMs) face limitations,…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-01 · Yuan Feng, Yue Yang, Xiaohan He, Jiatong Zhao, Jianlong Chen, Zijun Chen, Daocheng Fu, Qi Liu, Renqiu Xia, Bo Zhang, Junchi Yan