Related papers: GeoBlocks: A Query-Cache Accelerated Data Structur…
Modeling geospatial tabular data with deep learning has become a promising alternative to traditional statistical and machine learning approaches. However, existing deep learning models often face challenges related to scalability and…
Accurate forecasting of bus ridership (passenger numbers) is crucial for efficient management and optimization of public transport systems. Traditional forecasting models often fail to capture the unique and localized dynamics of different…
Web-based services often run randomized experiments to improve their products. A popular way to run these experiments is to use geographical regions as units of experimentation, since this does not require tracking of individual users or…
The inherent connectivity and dependency of graph-structured data, combined with its unique topology-driven access patterns, pose fundamental challenges to conventional data replication and request routing strategies in geo-distributed…
With the sharp increase in the number of vehicles, the issue of parking difficulties has emerged as an urgent challenge that many cities need to address promptly. In the task of predicting large-scale urban parking data, existing research…
Block diffusion enables efficient parallel refinement in diffusion language models, but its decoding behavior depends critically on block size. Existing block-sizing strategies rely on fixed rules or heuristic signals and do not account for…
We present GeoRocket, software for the management of very large geospatial datasets in the cloud. GeoRocket employs a novel way to handle arbitrarily large datasets by splitting them into chunks that are processed individually. The…
Data aggregation in Geographic Information Systems (GIS) is only marginally present in commercial systems nowadays, mostly through ad-hoc solutions. In this paper, we first present a formal model for representing spatial data. This model…
The availability of low cost sensors has led to an unprecedented growth in the volume of spatial data. However, the time required to evaluate even simple spatial queries over large data sets greatly hampers our ability to interactively…
We study the problem of aggregating polygons by covering them with disjoint representative regions, thereby inducing a clustering of the polygons. Our objective is to minimize a weighted sum of the total area and the total perimeter of the…
Cities play a pivotal role in human development and sustainability, yet studying them presents significant challenges due to the vast scale and complexity of spatial-temporal data. One such challenge is the need to uncover universal urban…
Current point cloud segmentation architectures suffer from limited long-range feature modeling, as they mostly rely on aggregating information with local neighborhoods. Furthermore, in order to learn point features at multiple scales, most…
This study presents a novel small-area estimation framework to enhance urban transportation planning through detailed characterization of travel behavior. Our approach improves on the four-step travel model by employing publicly available…
Explainable numerical representations or latent information of otherwise complex datasets are more convenient to analyze and study. These representations assist in identifying clusters and outliers, assessing similar data points, and exploring…
Optimization tasks over relational data, such as clustering, often suffer from the prohibitive cost of join operations, which are necessary to access the full dataset. While geometric data structures like BBD trees yield fast approximation…
We selected 48 European cities and gathered their public transport timetables in the GTFS format. We utilized Uber's H3 spatial index to divide each city into hexagonal micro-regions. Based on the timetable data, we created certain features…
Distributed data mining techniques, and distributed clustering in particular, have been widely used over the last decade because they deal with very large and heterogeneous datasets which cannot be gathered centrally. Current distributed clustering…
Many industries rely on visual insights to support decision-making processes in their businesses. In mining, the analysis of drills and geological shapes, represented as 3D geometries, is an important tool to assist geologists on the…
Pandemic measures such as social distancing and contact tracing can be enhanced by rapidly integrating dynamic location data and demographic data. Projecting billions of longitude and latitude locations onto hundreds of thousands of highly…
Airborne magnetic data are commonly used to produce preliminary geological maps. Machine learning has the potential to partly fulfill this task rapidly and objectively, as geological mapping is comparable to a semantic segmentation problem.…