Related papers: From Spatial Relations to Spatial Configurations

200 papers

Natural language processing has made significant inroads into learning the semantics of words through distributional approaches; however, representations learnt via these methods fail to capture certain kinds of information implicit in the…

Computation and Language · Computer Science 2018-07-06 Tiago Ramalho , Tomáš Kočiský , Frederic Besse , S. M. Ali Eslami , Gábor Melis , Fabio Viola , Phil Blunsom , Karl Moritz Hermann

Spatial reasoning based on natural language expressions is essential for everyday human tasks. This reasoning ability is also crucial for machines to interact with their environment in a human-like manner. However, recent research shows…

Computation and Language · Computer Science 2025-09-23 Tanawan Premsri , Parisa Kordjamshidi

We develop a system that formally represents spatial semantics concepts within natural language descriptions of spatial arrangements. The system builds on a model of spatial semantics representation according to which words in a sentence…

Computation and Language · Computer Science 2021-11-30 Alexandros Haridis , Stella Rossikopoulou Pappa

Spatial Reasoning is an important component of human cognition and is an area in which the latest Vision-language models (VLMs) show signs of difficulty. The current analysis works use image captioning tasks and visual question answering.…

Computation and Language · Computer Science 2025-02-10 Akshar Tumu , Parisa Kordjamshidi

Recognizing spatial relations and reasoning about them is essential in multiple applications including navigation, direction giving and human-computer interaction in general. Spatial relations between objects can either be explicit --…

Computation and Language · Computer Science 2020-07-21 Soham Dan , Hangfeng He , Dan Roth

Spatial reasoning plays a vital role in both human cognition and machine intelligence, prompting new research into language models' (LMs) capabilities in this regard. However, existing benchmarks reveal shortcomings in evaluating…

Computation and Language · Computer Science 2024-05-27 Fangjun Li , David C. Hogg , Anthony G. Cohn

The success of scene graphs for visual scene understanding has brought attention to the benefits of abstracting a visual input (e.g., image) into a structured representation, where entities (people and objects) are nodes connected by edges…

Computer Vision and Pattern Recognition · Computer Science 2022-10-28 Mohamed Ashraf Abdelsalam , Zhan Shi , Federico Fancellu , Kalliopi Basioti , Dhaivat J. Bhatt , Vladimir Pavlovic , Afsaneh Fazly

Over the last two decades we have witnessed strong progress on modeling visual object classes, scenes and attributes that have significantly contributed to automated image understanding. On the other hand, surprisingly little progress has…

Computer Vision and Pattern Recognition · Computer Science 2015-05-06 Mateusz Malinowski , Mario Fritz

Spatial reasoning, which requires ability to perceive and manipulate spatial relationships in the 3D world, is a fundamental aspect of human intelligence, yet remains a persistent challenge for Multimodal large language models (MLLMs).…

Artificial Intelligence · Computer Science 2025-11-21 Weichen Liu , Qiyao Xue , Haoming Wang , Xiangyu Yin , Boyuan Yang , Wei Gao

Spatial relations are a basic part of human cognition. However, they are expressed in natural language in a variety of ways, and previous work has suggested that current vision-and-language models (VLMs) struggle to capture relational…

Computation and Language · Computer Science 2023-03-23 Fangyu Liu , Guy Emerson , Nigel Collier

Multimodal Large Language Models (MLLMs) have demonstrated impressive performance in general vision-language tasks. However, recent studies have exposed critical limitations in their spatial reasoning capabilities. This deficiency in…

Machine Learning · Computer Science 2025-06-04 Huanyu Zhang , Chengzu Li , Wenshan Wu , Shaoguang Mao , Yifan Zhang , Haochen Tian , Ivan Vulić , Zhang Zhang , Liang Wang , Tieniu Tan , Furu Wei

This thesis introduces "Embodied Spatial Intelligence" to address the challenge of creating robots that can perceive and act in the real world based on natural language instructions. To bridge the gap between Large Language Models (LLMs)…

Robotics · Computer Science 2025-09-03 Jiading Fang

Spatial Reasoning is an important component of human cognition and is an area in which the latest Vision-language models (VLMs) show signs of difficulty. The current analysis works use image captioning tasks and visual question answering.…

Computation and Language · Computer Science 2025-11-11 Akshar Tumu , Varad Shinde , Parisa Kordjamshidi

Spatial reasoning is a crucial component of both biological and artificial intelligence. In this work, we present a comprehensive study of the capability of current state-of-the-art large language models (LLMs) on spatial reasoning. To…

Computation and Language · Computer Science 2024-06-10 Md Imbesat Hassan Rizvi , Xiaodan Zhu , Iryna Gurevych

Reasoning about spatial relationships between objects is essential for many real-world robotic tasks, such as fetch-and-delivery, object rearrangement, and object search. The ability to detect and disambiguate different objects and identify…

Computer Vision and Pattern Recognition · Computer Science 2024-10-11 Negar Nejatishahidin , Madhukar Reddy Vongala , Jana Kosecka

Knowledge about space and time is necessary to solve problems in the physical world: An AI agent situated in the physical world and interacting with objects often needs to reason about positions of and relations between objects; and as soon…

Artificial Intelligence · Computer Science 2023-01-16 Jae Hee Lee , Michael Sioutis , Kyra Ahrens , Marjan Alirezaie , Matthias Kerzel , Stefan Wermter

Despite impressive advancements in Visual-Language Models (VLMs) for multi-modal tasks, their reliance on RGB inputs limits precise spatial understanding. Existing methods for integrating spatial cues, such as point clouds or depth, either…

Computer Vision and Pattern Recognition · Computer Science 2025-10-27 Yang Liu , Ming Ma , Xiaomin Yu , Pengxiang Ding , Han Zhao , Mingyang Sun , Siteng Huang , Donglin Wang

Abstract Meaning Representation (AMR) is a semantic representation for natural language that embeds annotations related to traditional tasks such as named entity recognition, semantic role labeling, word sense disambiguation and co-reference…

Computation and Language · Computer Science 2017-04-11 Marco Damonte , Shay B. Cohen , Giorgio Satta

Genuine spatial reasoning relies on the capacity to construct and manipulate coherent internal spatial representations, often conceptualized as mental models, rather than merely processing surface linguistic associations. While large…

Artificial Intelligence · Computer Science 2026-03-04 Peiyao Jiang , Zequn Qin , Xi Li

Visual commonsense reasoning (VCR) is a challenging multi-modal task, which requires high-level cognition and commonsense reasoning ability about the real world. In recent years, large-scale pre-training approaches have been developed and…

Computer Vision and Pattern Recognition · Computer Science 2023-11-10 Cheng Yang , Rui Xu , Ye Guo , Peixiang Huang , Yiru Chen , Wenkui Ding , Zhongyuan Wang , Hong Zhou