Related papers: Multi-Source Spatial Entity Linkage
In this paper, we study the problem of spatial link discovery (LD), focusing primarily on topological and proximity relations between spatial objects. The problem is timely due to the large number of sources that generate spatial data,…
In this paper, for the first time, we introduce the concept of skyblocking, which aims to efficiently identify the "most preferred" blocking scheme in terms of a given set of selection criteria for entity resolution blocking. To capture all…
Urban environments are continuously mapped and modeled by various data collection platforms, including satellites, unmanned aerial vehicles and street cameras. The growing availability of 3D geospatial data from multiple modalities has…
Cross-view geo-localization aims at establishing location correspondences between different viewpoints. Existing approaches typically learn cross-view correlations through direct feature similarity matching, often overlooking semantic…
Entity resolution (ER; also known as record linkage or de-duplication) is the process of merging noisy databases, often in the absence of unique identifiers. A major advancement in ER methodology has been the application of Bayesian…
Knowledge bases (KBs) store rich yet heterogeneous entities and facts. Entity resolution (ER) aims to identify entities in KBs which refer to the same real-world object. Recent studies have shown significant benefits of involving humans in…
Spatial data are central to applications such as environmental monitoring and urban planning, but are often distributed across devices where privacy and communication constraints limit direct sharing. Federated modeling offers a practical…
Accurate and efficient entity resolution (ER) is a significant challenge in many data mining and analysis projects that require integrating and processing massive data collections. It is becoming increasingly important in real-world…
We are witnessing an enormous growth in the volume of data generated by various online services. An important portion of this data contains geographic references, since many of these services are location-enhanced and thus produce…
Entity Resolution suffers from quadratic time complexity. To increase its time efficiency, three kinds of filtering techniques are typically used for restricting its search space: (i) blocking workflows, which group together entity profiles…
Cross-lingual entity linking (XEL) grounds named entities in a source language to an English Knowledge Base (KB), such as Wikipedia. XEL is challenging for most languages because of limited availability of requisite resources. However, much…
Recently, MapReduce-based spatial query systems have emerged as a cost-effective and scalable solution to large-scale spatial data processing and analytics. MapReduce-based systems achieve massive scalability by partitioning the data and…
Trajectory-based spatiotemporal entity linking matches the same moving object across different datasets based on its movement traces. It is a fundamental step in supporting spatiotemporal data integration and analysis. In this paper, we…
Federated learning was proposed by Google to safeguard data privacy by training models locally on users' devices. However, as deep learning models grow in size to achieve better results, it becomes increasingly difficult to…
Cross-view UAV geolocalization is fundamentally a challenging large-scale image retrieval task, aiming to determine the geographic coordinates of Unmanned Aerial Vehicle (UAV) queries by matching them against an extensive geo-tagged…
Entity resolution (record linkage or deduplication) is the process of identifying and linking duplicate records in databases. In this paper, we propose a Bayesian graphical approach for entity resolution that links records to latent…
Merging datafiles containing information on overlapping sets of entities is a challenging task in the absence of unique identifiers, and is further complicated when some entities are duplicated in the datafiles. Most approaches to this…
Entity linking is an indispensable operation for populating knowledge repositories in information extraction. It aligns a textual entity mention with its corresponding disambiguated entry in a knowledge repository. In this paper,…
Mobility data science offers insights into the complex interconnections of spatial data of moving objects and their surroundings, often based on a combination of vector and raster data. For example, mobility traces are usually in vector…
Multi-source spatial point data prediction is crucial in fields like environmental monitoring and natural resource management, where integrating data from various sensors is the key to achieving a holistic environmental understanding.…