Spatial Language Representation with Multi-Level Geocoding

Computation and Language 2020-08-24 v1

Abstract

We present a multi-level geocoding model (MLG) that learns to associate texts with geographic locations. The Earth's surface is represented using space-filling curves that decompose the sphere into a hierarchy of similarly sized, non-overlapping cells. MLG balances generalization and accuracy by combining losses across multiple levels and predicting cells at each level simultaneously. Without using any dataset-specific tuning, we show that MLG obtains state-of-the-art results for toponym resolution on three English datasets. Furthermore, it obtains large gains without any knowledge base metadata, demonstrating that it can effectively learn the connection between text spans and coordinates, and thus can be extended to toponyms not present in knowledge bases.
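
To make the idea of combining losses across levels concrete, the following is a minimal sketch and not the authors' implementation. It assumes a plain latitude/longitude grid in place of the space-filling-curve decomposition, so its cells are not similarly sized on the sphere, and it assumes equal weights per level. The names `cell_id`, `cross_entropy`, and `multi_level_loss` are hypothetical and chosen only for illustration.

```python
import math


def cell_id(lat, lng, level):
    """Index of the grid cell containing (lat, lng) at the given level.

    Level k splits the lat/lng rectangle into 2**k x 2**k cells.
    Assumption: this flat grid is a simplified stand-in for the
    hierarchical sphere decomposition described in the abstract.
    """
    n = 2 ** level
    # Clamp so that lat = 90 and lng = 180 fall in the last row/column.
    row = min(int((lat + 90.0) / 180.0 * n), n - 1)
    col = min(int((lng + 180.0) / 360.0 * n), n - 1)
    return row * n + col


def cross_entropy(logits, target):
    """Negative log-softmax of the target index, computed stably."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def multi_level_loss(logits_by_level, lat, lng, weights=None):
    """Weighted sum of per-level cross-entropy losses for one example.

    logits_by_level maps a level to a list of scores, one per cell at
    that level. Equal weights are an assumption of this sketch.
    """
    total = 0.0
    for level, logits in logits_by_level.items():
        w = 1.0 if weights is None else weights[level]
        total += w * cross_entropy(logits, cell_id(lat, lng, level))
    return total
```

In this sketch, coarse levels have few cells and so pool many training examples per cell, while fine levels localize more precisely; summing the losses trains one model to predict a cell at every level at once.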

Cite

@article{arxiv.2008.09236,
  title  = {Spatial Language Representation with Multi-Level Geocoding},
  author = {Sayali Kulkarni and Shailee Jain and Mohammad Javad Hosseini and Jason Baldridge and Eugene Ie and Li Zhang},
  journal= {arXiv preprint arXiv:2008.09236},
  year   = {2020}
}