Related papers: Spatial Aggregation: Theory and Applications

SpatialImaginer: Towards Adaptive Visual Imagination for Spatial Reasoning

Spatial intelligence, which refers to the ability to reason about geometric and physical structure from visual observations, remains a core challenge for multimodal large language models. Despite promising performance, recent multimodal…

Computer Vision and Pattern Recognition · Computer Science 2026-04-21 Yian Li, Yang Jiao, Bin Zhu, Tianwen Qian, Shaoxiang Chen, Jingjing Chen, Yu-Gang Jiang

The Aggregator Model of Spatial Cognition

Tracking the positions of objects in local space is a core function of animal brains. We do not yet understand how it is done with limited neural resources. The challenges of spatial cognition are discussed under the criteria: (a) scaling…

Neurons and Cognition · Quantitative Biology 2020-11-20 Robert Worden

A Model of Spatial Thinking for Computational Intelligence

Trying to be effective (no matter who exactly and in what field), a person faces a problem that inevitably destroys all our attempts to easily reach a desired goal. The problem is the existence of some insuperable barriers for our mind,…

Artificial Intelligence · Computer Science 2016-11-17 Kirill A. Sorudeykin

Exploring The Spatial Reasoning Ability of Neural Models in Human IQ Tests

Although neural models have performed impressively well on various tasks such as image recognition and question answering, their reasoning ability has been measured in only a few studies. In this work, we focus on spatial reasoning and…

Artificial Intelligence · Computer Science 2021-08-19 Hyunjae Kim, Yookyung Koh, Jinheon Baek, Jaewoo Kang

A Spatial Logic for Simplicial Models

Collective Adaptive Systems often consist of many heterogeneous components typically organised in groups. These entities interact with each other by adapting their behaviour to pursue individual or collective goals. In these systems, the…

Logic in Computer Science · Computer Science 2024-02-14 Michele Loreti, Michela Quadrini

A Multi-Modal Neuro-Symbolic Approach for Spatial Reasoning-Based Visual Grounding in Robotics

Visual reasoning, particularly spatial reasoning, is a challenging cognitive task that requires understanding object relationships and their interactions within complex environments, especially in the robotics domain. Existing vision-language…

Robotics · Computer Science 2025-11-03 Simindokht Jahangard, Mehrzad Mohammadi, Abhinav Dhall, Hamid Rezatofighi

Zero-shot visual reasoning through probabilistic analogical mapping

Human reasoning is grounded in an ability to identify highly abstract commonalities governing superficially dissimilar visual inputs. Recent efforts to develop algorithms with this capacity have largely focused on approaches that require…

Computer Vision and Pattern Recognition · Computer Science 2022-10-03 Taylor W. Webb, Shuhao Fu, Trevor Bihl, Keith J. Holyoak, Hongjing Lu

Visual Graph Arena: Evaluating Visual Conceptualization of Vision and Multimodal Large Language Models

Recent advancements in multimodal large language models have driven breakthroughs in visual question answering. Yet a critical gap persists: 'conceptualization', the ability to recognize and reason about the same concept despite variations…

Computer Vision and Pattern Recognition · Computer Science 2025-06-09 Zahra Babaiee, Peyman M. Kiasari, Daniela Rus, Radu Grosu

Spatial Reasoning in Multimodal Large Language Models: A Survey of Tasks, Benchmarks and Methods

Spatial reasoning, which requires the ability to perceive and manipulate spatial relationships in the 3D world, is a fundamental aspect of human intelligence, yet remains a persistent challenge for multimodal large language models (MLLMs).…

Artificial Intelligence · Computer Science 2025-11-21 Weichen Liu, Qiyao Xue, Haoming Wang, Xiangyu Yin, Boyuan Yang, Wei Gao

Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing

As textual reasoning with large language models (LLMs) has advanced significantly, there has been growing interest in enhancing the multimodal reasoning capabilities of large vision-language models (LVLMs). However, existing methods…

Computer Vision and Pattern Recognition · Computer Science 2025-06-23 Junfei Wu, Jian Guan, Kaituo Feng, Qiang Liu, Shu Wu, Liang Wang, Wei Wu, Tieniu Tan

Reasoning in Computer Vision: Taxonomy, Models, Tasks, and Methodologies

Visual reasoning is critical for a wide range of computer vision tasks that go beyond surface-level object detection and classification. Despite notable advances in relational, symbolic, temporal, causal, and commonsense reasoning, existing…

Computer Vision and Pattern Recognition · Computer Science 2025-08-15 Ayushman Sarkar, Mohd Yamani Idna Idris, Zhenyu Yu

SpatialMath: Spatial Comprehension-Infused Symbolic Reasoning for Mathematical Problem-Solving

Multimodal Small-to-Medium sized Language Models (MSLMs) have demonstrated strong capabilities in integrating visual and textual information but still face significant limitations in visual comprehension and mathematical reasoning,…

Machine Learning · Computer Science 2026-01-27 Ashutosh Bajpai, Akshat Bhandari, Akshay Nambi, Tanmoy Chakraborty

Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models

Vision-Language Models (VLMs) have recently emerged as powerful tools, excelling in tasks that integrate visual and textual comprehension, such as image captioning, visual question answering, and image-text retrieval. However, existing…

Computer Vision and Pattern Recognition · Computer Science 2025-03-26 Ilias Stogiannidis, Steven McDonagh, Sotirios A. Tsaftaris

Semantic Area Graph Reasoning for Multi-Robot Language-Guided Search

Coordinating multi-robot systems (MRS) to search in unknown environments is particularly challenging for tasks that require semantic reasoning beyond geometric exploration. Classical coordination strategies rely on frontier coverage or…

Robotics · Computer Science 2026-04-20 Ruiyang Wang, Hao-Lun Hsu, Jiwoo Kim, Miroslav Pajic

Enhancing Spatial Reasoning through Visual and Textual Thinking

The spatial reasoning task aims to reason about the spatial relationships in 2D and 3D space, which is a fundamental capability for Visual Question Answering (VQA) and robotics. Although vision language models (VLMs) have developed rapidly…

Computer Vision and Pattern Recognition · Computer Science 2025-07-29 Xun Liang, Xin Guo, Zhongming Jin, Weihang Pan, Penghui Shang, Deng Cai, Binbin Lin, Jieping Ye

An interactive semantics of logic programming

We apply to logic programming some recently emerging ideas from the field of reduction-based communicating systems, with the aim of giving evidence of the hidden interactions and the coordination mechanisms that rule the operational…

Logic in Computer Science · Computer Science 2007-05-23 Roberto Bruni, Ugo Montanari, Francesca Rossi

Neuro-Symbolic Spatio-Temporal Reasoning

Knowledge about space and time is necessary to solve problems in the physical world: An AI agent situated in the physical world and interacting with objects often needs to reason about positions of and relations between objects; and as soon…

Artificial Intelligence · Computer Science 2023-01-16 Jae Hee Lee, Michael Sioutis, Kyra Ahrens, Marjan Alirezaie, Matthias Kerzel, Stefan Wermter

Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations

In vision-and-language grounding problems, fine-grained representations of the image are considered to be of paramount importance. Most of the current systems incorporate visual features and textual concepts as a sketch of an image.…

Computation and Language · Computer Science 2019-11-05 Fenglin Liu, Yuanxin Liu, Xuancheng Ren, Xiaodong He, Xu Sun

Aggregative Semantics for Quantitative Bipolar Argumentation Frameworks

Formal argumentation is being used increasingly in artificial intelligence as an effective and understandable way to model potentially conflicting pieces of information, called arguments, and identify so-called acceptable arguments…

Artificial Intelligence · Computer Science 2026-03-09 Yann Munro, Isabelle Bloch, Marie-Jeanne Lesot

Thinking with Blueprints: Assisting Vision-Language Models in Spatial Reasoning via Structured Object Representation

Spatial reasoning, the ability to perceive and reason about relationships in space, advances vision-language models (VLMs) from visual perception toward spatial semantic understanding. Existing approaches either revisit local image…

Computer Vision and Pattern Recognition · Computer Science 2026-01-06 Weijian Ma, Shizhao Sun, Tianyu Yu, Ruiyu Wang, Tat-Seng Chua, Jiang Bian