Related papers: From Spatial Relations to Spatial Configurations

Encoding Spatial Relations from Natural Language

Natural language processing has made significant inroads into learning the semantics of words through distributional approaches; however, representations learnt via these methods fail to capture certain kinds of information implicit in the…

Computation and Language · Computer Science 2018-07-06 Tiago Ramalho, Tomáš Kočiský, Frederic Besse, S. M. Ali Eslami, Gábor Melis, Fabio Viola, Phil Blunsom, Karl Moritz Hermann

Neuro-symbolic Training for Reasoning over Spatial Language

Spatial reasoning based on natural language expressions is essential for everyday human tasks. This reasoning ability is also crucial for machines to interact with their environment in a human-like manner. However, recent research shows…

Computation and Language · Computer Science 2025-09-23 Tanawan Premsri, Parisa Kordjamshidi

Natural Language and Spatial Rules

We develop a system that formally represents spatial semantics concepts within natural language descriptions of spatial arrangements. The system builds on a model of spatial semantics representation according to which words in a sentence…

Computation and Language · Computer Science 2021-11-30 Alexandros Haridis, Stella Rossikopoulou Pappa

Exploring Spatial Language Grounding Through Referring Expressions

Spatial Reasoning is an important component of human cognition and is an area in which the latest Vision-language models (VLMs) show signs of difficulty. The current analysis works use image captioning tasks and visual question answering.…

Computation and Language · Computer Science 2025-02-10 Akshar Tumu, Parisa Kordjamshidi

Understanding Spatial Relations through Multiple Modalities

Recognizing spatial relations and reasoning about them is essential in multiple applications including navigation, direction giving and human-computer interaction in general. Spatial relations between objects can either be explicit --…

Computation and Language · Computer Science 2020-07-21 Soham Dan, Hangfeng He, Dan Roth

Reframing Spatial Reasoning Evaluation in Language Models: A Real-World Simulation Benchmark for Qualitative Reasoning

Spatial reasoning plays a vital role in both human cognition and machine intelligence, prompting new research into language models' (LMs) capabilities in this regard. However, existing benchmarks reveal shortcomings in evaluating…

Computation and Language · Computer Science 2024-05-27 Fangjun Li, David C. Hogg, Anthony G. Cohn

Visual Semantic Parsing: From Images to Abstract Meaning Representation

The success of scene graphs for visual scene understanding has brought attention to the benefits of abstracting a visual input (e.g., image) into a structured representation, where entities (people and objects) are nodes connected by edges…

Computer Vision and Pattern Recognition · Computer Science 2022-10-28 Mohamed Ashraf Abdelsalam, Zhan Shi, Federico Fancellu, Kalliopi Basioti, Dhaivat J. Bhatt, Vladimir Pavlovic, Afsaneh Fazly

A Pooling Approach to Modelling Spatial Relations for Image Retrieval and Annotation

Over the last two decades we have witnessed strong progress on modeling visual object classes, scenes and attributes that have significantly contributed to automated image understanding. On the other hand, surprisingly little progress has…

Computer Vision and Pattern Recognition · Computer Science 2015-05-06 Mateusz Malinowski, Mario Fritz

Spatial Reasoning in Multimodal Large Language Models: A Survey of Tasks, Benchmarks and Methods

Spatial reasoning, which requires the ability to perceive and manipulate spatial relationships in the 3D world, is a fundamental aspect of human intelligence, yet remains a persistent challenge for Multimodal large language models (MLLMs).…

Artificial Intelligence · Computer Science 2025-11-21 Weichen Liu, Qiyao Xue, Haoming Wang, Xiangyu Yin, Boyuan Yang, Wei Gao

Visual Spatial Reasoning

Spatial relations are a basic part of human cognition. However, they are expressed in natural language in a variety of ways, and previous work has suggested that current vision-and-language models (VLMs) struggle to capture relational…

Computation and Language · Computer Science 2023-03-23 Fangyu Liu, Guy Emerson, Nigel Collier

Scaling and Beyond: Advancing Spatial Reasoning in MLLMs Requires New Recipes

Multimodal Large Language Models (MLLMs) have demonstrated impressive performance in general vision-language tasks. However, recent studies have exposed critical limitations in their spatial reasoning capabilities. This deficiency in…

Machine Learning · Computer Science 2025-06-04 Huanyu Zhang, Chengzu Li, Wenshan Wu, Shaoguang Mao, Yifan Zhang, Haochen Tian, Ivan Vulić, Zhang Zhang, Liang Wang, Tieniu Tan, Furu Wei

Embodied Spatial Intelligence: from Implicit Scene Modeling to Spatial Reasoning

This thesis introduces "Embodied Spatial Intelligence" to address the challenge of creating robots that can perceive and act in the real world based on natural language instructions. To bridge the gap between Large Language Models (LLMs)…

Robotics · Computer Science 2025-09-03 Jiading Fang

Referring Expressions as a Lens into Spatial Language Grounding in Vision-Language Models

Spatial Reasoning is an important component of human cognition and is an area in which the latest Vision-language models (VLMs) show signs of difficulty. The current analysis works use image captioning tasks and visual question answering.…

Computation and Language · Computer Science 2025-11-11 Akshar Tumu, Varad Shinde, Parisa Kordjamshidi

SpaRC and SpaRP: Spatial Reasoning Characterization and Path Generation for Understanding Spatial Reasoning Capability of Large Language Models

Spatial reasoning is a crucial component of both biological and artificial intelligence. In this work, we present a comprehensive study of the capability of current state-of-the-art large language models (LLMs) on spatial reasoning. To…

Computation and Language · Computer Science 2024-06-10 Md Imbesat Hassan Rizvi, Xiaodan Zhu, Iryna Gurevych

Structured Spatial Reasoning with Open Vocabulary Object Detectors

Reasoning about spatial relationships between objects is essential for many real-world robotic tasks, such as fetch-and-delivery, object rearrangement, and object search. The ability to detect and disambiguate different objects and identify…

Computer Vision and Pattern Recognition · Computer Science 2024-10-11 Negar Nejatishahidin, Madhukar Reddy Vongala, Jana Kosecka

Neuro-Symbolic Spatio-Temporal Reasoning

Knowledge about space and time is necessary to solve problems in the physical world: An AI agent situated in the physical world and interacting with objects often needs to reason about positions of and relations between objects; and as soon…

Artificial Intelligence · Computer Science 2023-01-16 Jae Hee Lee, Michael Sioutis, Kyra Ahrens, Marjan Alirezaie, Matthias Kerzel, Stefan Wermter

SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning

Despite impressive advancements in Visual-Language Models (VLMs) for multi-modal tasks, their reliance on RGB inputs limits precise spatial understanding. Existing methods for integrating spatial cues, such as point clouds or depth, either…

Computer Vision and Pattern Recognition · Computer Science 2025-10-27 Yang Liu, Ming Ma, Xiaomin Yu, Pengxiang Ding, Han Zhao, Mingyang Sun, Siteng Huang, Donglin Wang

An Incremental Parser for Abstract Meaning Representation

Abstract Meaning Representation (AMR) is a semantic representation for natural language that embeds annotations related to traditional tasks such as named entity recognition, semantic role labeling, word sense disambiguation and co-reference…

Computation and Language · Computer Science 2017-04-11 Marco Damonte, Shay B. Cohen, Giorgio Satta

SpatialText: A Pure-Text Cognitive Benchmark for Spatial Understanding in Large Language Models

Genuine spatial reasoning relies on the capacity to construct and manipulate coherent internal spatial representations, often conceptualized as mental models, rather than merely processing surface linguistic associations. While large…

Artificial Intelligence · Computer Science 2026-03-04 Peiyao Jiang, Zequn Qin, Xi Li

Improving Vision-and-Language Reasoning via Spatial Relations Modeling

Visual commonsense reasoning (VCR) is a challenging multi-modal task, which requires high-level cognition and commonsense reasoning ability about the real world. In recent years, large-scale pre-training approaches have been developed and…

Computer Vision and Pattern Recognition · Computer Science 2023-11-10 Cheng Yang, Rui Xu, Ye Guo, Peixiang Huang, Yiru Chen, Wenkui Ding, Zhongyuan Wang, Hong Zhou