SpatialLLM: From Multi-modality Data to Urban Spatial Intelligence

Jiabin Chen; Haiping Wang; Jinpeng Li; Yuan Liu; Zhen Dong; Bisheng Yang

Computer Vision and Pattern Recognition; 2025-05-28; v1; Artificial Intelligence

Abstract

We propose SpatialLLM, a novel approach to spatial intelligence tasks in complex urban scenes. Unlike previous methods, which require geographic analysis tools or domain expertise, SpatialLLM is a unified language model that directly addresses a variety of spatial intelligence tasks without any training, fine-tuning, or expert intervention. The core of SpatialLLM is the construction of detailed, structured scene descriptions from raw spatial data, which are then used to prompt pre-trained LLMs for scene-based analysis. Extensive experiments show that, with our designs, pre-trained LLMs can accurately perceive spatial distribution information and perform advanced spatial intelligence tasks zero-shot, including urban planning, ecological analysis, and traffic management. We argue that multi-field knowledge, context length, and reasoning ability are the key factors influencing LLM performance in urban analysis. We hope that SpatialLLM will provide a novel and viable perspective for intelligent urban analysis and management. The code and dataset are available at https://github.com/WHU-USI3DV/SpatialLLM.
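As a rough illustration of the idea of prompting a pre-trained LLM with a structured scene description, the sketch below turns a few located objects into text and attaches a question. It is a minimal sketch, not the authors' pipeline: the `SceneObject` record, the functions `describe_scene` and `build_prompt`, the coordinate convention, and the prompt wording are all hypothetical, and no LLM is actually called.

```python
from dataclasses import dataclass
from itertools import combinations
import math


@dataclass
class SceneObject:
    # Hypothetical record; the paper's actual scene schema may differ.
    name: str
    category: str
    x: float  # metres east of an arbitrary local origin
    y: float  # metres north of the same origin


def describe_scene(objects):
    """Render objects and their pairwise distances as plain text lines."""
    lines = ["Objects:"]
    for obj in objects:
        lines.append(
            f"- {obj.name} ({obj.category}) at x={obj.x:.1f} m, y={obj.y:.1f} m"
        )
    lines.append("Pairwise distances:")
    for a, b in combinations(objects, 2):
        dist = math.hypot(a.x - b.x, a.y - b.y)
        lines.append(f"- {a.name} to {b.name}: {dist:.1f} m")
    return "\n".join(lines)


def build_prompt(objects, question):
    """Combine the scene description with a task question (illustrative wording)."""
    return (
        "You are given a structured description of an urban scene.\n"
        f"{describe_scene(objects)}\n\n"
        f"Question: {question}"
    )
```

The resulting string could be sent to any pre-trained LLM as-is; because the description is plain text, no training or fine-tuning step is involved, which matches the zero-shot setting described above.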

Keywords

vision-language understanding; robot learning; urban computing

Cite

@article{arxiv.2505.12703,
  title  = {SpatialLLM: From Multi-modality Data to Urban Spatial Intelligence},
  author = {Jiabin Chen and Haiping Wang and Jinpeng Li and Yuan Liu and Zhen Dong and Bisheng Yang},
  journal= {arXiv preprint arXiv:2505.12703},
  year   = {2025}
}
