Related papers: Spatial Aggregation: Theory and Applications

200 papers

Spatial intelligence, which refers to the ability to reason about geometric and physical structure from visual observations, remains a core challenge for multimodal large language models. Despite promising performance, recent multimodal…

Computer Vision and Pattern Recognition · Computer Science 2026-04-21 Yian Li, Yang Jiao, Bin Zhu, Tianwen Qian, Shaoxiang Chen, Jingjing Chen, Yu-Gang Jiang

Tracking the positions of objects in local space is a core function of animal brains. We do not yet understand how it is done with limited neural resources. The challenges of spatial cognition are discussed under the criteria: (a) scaling…

Neurons and Cognition · Quantitative Biology 2020-11-20 Robert Worden

Trying to be effective (no matter who exactly and in what field), a person faces a problem that inevitably destroys all our attempts to easily get to a desired goal. The problem is the existence of some insuperable barriers for our mind,…

Artificial Intelligence · Computer Science 2016-11-17 Kirill A. Sorudeykin

Although neural models have performed impressively well on various tasks such as image recognition and question answering, their reasoning ability has been measured in only a few studies. In this work, we focus on spatial reasoning and…

Artificial Intelligence · Computer Science 2021-08-19 Hyunjae Kim, Yookyung Koh, Jinheon Baek, Jaewoo Kang

Collective Adaptive Systems often consist of many heterogeneous components typically organised in groups. These entities interact with each other by adapting their behaviour to pursue individual or collective goals. In these systems, the…

Logic in Computer Science · Computer Science 2024-02-14 Michele Loreti, Michela Quadrini

Visual reasoning, particularly spatial reasoning, is a challenging cognitive task that requires understanding object relationships and their interactions within complex environments, especially in the robotics domain. Existing vision-language…

Robotics · Computer Science 2025-11-03 Simindokht Jahangard, Mehrzad Mohammadi, Abhinav Dhall, Hamid Rezatofighi

Human reasoning is grounded in an ability to identify highly abstract commonalities governing superficially dissimilar visual inputs. Recent efforts to develop algorithms with this capacity have largely focused on approaches that require…

Computer Vision and Pattern Recognition · Computer Science 2022-10-03 Taylor W. Webb, Shuhao Fu, Trevor Bihl, Keith J. Holyoak, Hongjing Lu

Recent advancements in multimodal large language models have driven breakthroughs in visual question answering. Yet a critical gap persists: 'conceptualization', the ability to recognize and reason about the same concept despite variations…

Computer Vision and Pattern Recognition · Computer Science 2025-06-09 Zahra Babaiee, Peyman M. Kiasari, Daniela Rus, Radu Grosu

Spatial reasoning, which requires the ability to perceive and manipulate spatial relationships in the 3D world, is a fundamental aspect of human intelligence, yet remains a persistent challenge for multimodal large language models (MLLMs).…

Artificial Intelligence · Computer Science 2025-11-21 Weichen Liu, Qiyao Xue, Haoming Wang, Xiangyu Yin, Boyuan Yang, Wei Gao

As textual reasoning with large language models (LLMs) has advanced significantly, there has been growing interest in enhancing the multimodal reasoning capabilities of large vision-language models (LVLMs). However, existing methods…

Computer Vision and Pattern Recognition · Computer Science 2025-06-23 Junfei Wu, Jian Guan, Kaituo Feng, Qiang Liu, Shu Wu, Liang Wang, Wei Wu, Tieniu Tan

Visual reasoning is critical for a wide range of computer vision tasks that go beyond surface-level object detection and classification. Despite notable advances in relational, symbolic, temporal, causal, and commonsense reasoning, existing…

Computer Vision and Pattern Recognition · Computer Science 2025-08-15 Ayushman Sarkar, Mohd Yamani Idna Idris, Zhenyu Yu

Multimodal Small-to-Medium sized Language Models (MSLMs) have demonstrated strong capabilities in integrating visual and textual information but still face significant limitations in visual comprehension and mathematical reasoning,…

Machine Learning · Computer Science 2026-01-27 Ashutosh Bajpai, Akshat Bhandari, Akshay Nambi, Tanmoy Chakraborty

Vision-Language Models (VLMs) have recently emerged as powerful tools, excelling in tasks that integrate visual and textual comprehension, such as image captioning, visual question answering, and image-text retrieval. However, existing…

Computer Vision and Pattern Recognition · Computer Science 2025-03-26 Ilias Stogiannidis, Steven McDonagh, Sotirios A. Tsaftaris

Coordinating multi-robot systems (MRS) to search in unknown environments is particularly challenging for tasks that require semantic reasoning beyond geometric exploration. Classical coordination strategies rely on frontier coverage or…

Robotics · Computer Science 2026-04-20 Ruiyang Wang, Hao-Lun Hsu, Jiwoo Kim, Miroslav Pajic

The spatial reasoning task aims to reason about the spatial relationships in 2D and 3D space, which is a fundamental capability for Visual Question Answering (VQA) and robotics. Although vision language models (VLMs) have developed rapidly…

Computer Vision and Pattern Recognition · Computer Science 2025-07-29 Xun Liang, Xin Guo, Zhongming Jin, Weihang Pan, Penghui Shang, Deng Cai, Binbin Lin, Jieping Ye

We apply to logic programming some recently emerging ideas from the field of reduction-based communicating systems, with the aim of giving evidence of the hidden interactions and the coordination mechanisms that rule the operational…

Logic in Computer Science · Computer Science 2007-05-23 Roberto Bruni, Ugo Montanari, Francesca Rossi

Knowledge about space and time is necessary to solve problems in the physical world: An AI agent situated in the physical world and interacting with objects often needs to reason about positions of and relations between objects; and as soon…

Artificial Intelligence · Computer Science 2023-01-16 Jae Hee Lee, Michael Sioutis, Kyra Ahrens, Marjan Alirezaie, Matthias Kerzel, Stefan Wermter

In vision-and-language grounding problems, fine-grained representations of the image are considered to be of paramount importance. Most of the current systems incorporate visual features and textual concepts as a sketch of an image.…

Computation and Language · Computer Science 2019-11-05 Fenglin Liu, Yuanxin Liu, Xuancheng Ren, Xiaodong He, Xu Sun

Formal argumentation is being used increasingly in artificial intelligence as an effective and understandable way to model potentially conflicting pieces of information, called arguments, and identify so-called acceptable arguments…

Artificial Intelligence · Computer Science 2026-03-09 Yann Munro, Isabelle Bloch, Marie-Jeanne Lesot

Spatial reasoning, the ability to perceive and reason about relationships in space, advances vision-language models (VLMs) from visual perception toward spatial semantic understanding. Existing approaches either revisit local image…

Computer Vision and Pattern Recognition · Computer Science 2026-01-06 Weijian Ma, Shizhao Sun, Tianyu Yu, Ruiyu Wang, Tat-Seng Chua, Jiang Bian