
Related papers: Generalized Supervised Meta-blocking (technical re…

200 papers

Entity Resolution, also called record linkage or deduplication, refers to the process of identifying and merging duplicate versions of the same entity into a unified representation. The standard practice is to use a Rule based or Machine…

Artificial Intelligence · Computer Science 2016-09-22 Janani Balaji, Faizan Javed, Mayank Kejriwal, Chris Min, Sam Sander, Ozgur Ozturk

Efficiency techniques have been an integral part of Entity Resolution since its infancy. In this survey, we organized the bulk of works in the field into Blocking, Filtering and hybrid techniques, facilitating their understanding and use. We…

Databases · Computer Science 2020-08-24 George Papadakis, Dimitrios Skoutas, Emmanouil Thanos, Themis Palpanas

Entity matching seeks to identify data records over one or multiple data sources that refer to the same real-world entity. Virtually every entity matching task on large datasets requires blocking, a step that reduces the number of record…

Databases · Computer Science 2019-12-10 Wei Zhang, Hao Wei, Bunyamin Sisman, Xin Luna Dong, Christos Faloutsos, David Page

Entity Matching (EM) is crucial for identifying equivalent data entities across different sources, a task that becomes increasingly challenging with the growth and heterogeneity of data. Blocking techniques, which reduce the computational…

Machine Learning · Computer Science 2024-09-26 Mohammad Hossein Moslemi, Harini Balamurugan, Mostafa Milani

An increasing number of entities are described by interlinked data rather than documents on the Web. Entity Resolution (ER) aims to identify descriptions of the same real-world entity within one or across knowledge bases in the Web of data.…

Databases · Computer Science 2020-05-20 Vasilis Efthymiou, Kostas Stefanidis, Vassilis Christophides

Accurate and efficient entity resolution is an open challenge of particular relevance to intelligence organisations that collect large datasets from disparate sources with differing levels of quality and standard. Starting from a…

Databases · Computer Science 2018-03-20 Yuhang Zhang, Kee Siong Ng, Michael Walker, Pauline Chou, Tania Churchill, Peter Christen

Blocking is a critical step in entity resolution, and the emergence of neural network-based representation models has led to the development of dense blocking as a promising approach for exploring deep semantics in blocking. However,…

Databases · Computer Science 2024-04-26 Tianshu Wang, Hongyu Lin, Xianpei Han, Xiaoyang Chen, Boxi Cao, Le Sun

Entity Resolution suffers from quadratic time complexity. To increase its time efficiency, three kinds of filtering techniques are typically used for restricting its search space: (i) blocking workflows, which group together entity profiles…

Entity Resolution (ER) is the task of finding entity profiles that correspond to the same real-world entity. Progressive ER aims to efficiently resolve large datasets when limited time and/or computational resources are available. In…

Databases · Computer Science 2019-05-17 Giovanni Simonini, George Papadakis, Themis Palpanas, Sonia Bergamaschi

In this paper, for the first time, we introduce the concept of skyblocking, which aims to efficiently identify the "most preferred" blocking scheme in terms of a given set of selection criteria for entity resolution blocking. To capture all…

Databases · Computer Science 2018-09-19 Jingyu Shao, Qing Wang, Yu Lin

Blocking, which aims to quickly prune out all non-matching record pairs, is a mechanism to improve the efficiency of Entity Resolution (ER). However, depending on the distributions of entity cluster sizes, existing techniques can be either…

Databases · Computer Science 2021-03-17 Sainyam Galhotra, Donatella Firmani, Barna Saha, Divesh Srivastava

Background: Classifications in meta-research enable researchers to cope with an increasing body of scientific knowledge. They provide a framework for, e.g., distinguishing methods, reports, reproducibility, and evaluation in a knowledge…

Software Engineering · Computer Science 2022-09-22 Angelika Kaplan, Thomas Kühn, Ralf Reussner

Many recent works on Entity Resolution (ER) leverage Deep Learning techniques involving language models to improve effectiveness. This is applied to both main steps of ER, i.e., blocking and matching. Several pre-trained embeddings have…

Databases · Computer Science 2023-04-26 Alexandros Zeakis, George Papadakis, Dimitrios Skoutas, Manolis Koubarakis

The problem of selecting an algorithm that appears most suitable for a specific instance of an algorithmic problem class, such as the Boolean satisfiability problem, is called instance-specific algorithm selection. Over the past decade, the…

Machine Learning · Computer Science 2021-07-21 Alexander Tornede, Lukas Gehring, Tanja Tornede, Marcel Wever, Eyke Hüllermeier

Entity Resolution (ER) is typically implemented as a batch task that processes all available data before identifying duplicate records. However, applications with time or computational constraints, e.g., those running in the cloud, require…

Databases · Computer Science 2025-03-12 Jakub Maciejewski, Konstantinos Nikoletos, George Papadakis, Yannis Velegrakis

Entity resolution (ER) is the process of identifying records that refer to the same entities within one or across multiple databases. Numerous techniques have been developed to tackle ER challenges over the years, with recent emphasis…

Databases · Computer Science 2023-11-14 George Papadakis, Nishadi Kirielle, Peter Christen, Themis Palpanas

In collaborative learning, learners coordinate to enhance each of their learning performances. From the perspective of any learner, a critical challenge is to filter out unqualified collaborators. We propose a framework named meta…

Machine Learning · Computer Science 2022-09-29 Chenglong Ye, Reza Ghanadan, Jie Ding

Entity alignment aims to find identical entities in different knowledge graphs. Although embedding-based entity alignment has recently achieved remarkable progress, training data insufficiency remains a critical challenge. Conventional…

Artificial Intelligence · Computer Science 2022-03-15 Kexuan Xin, Zequn Sun, Wen Hua, Bing Liu, Wei Hu, Jianfeng Qu, Xiaofang Zhou

One critical challenge for large language models (LLMs) in performing complex reasoning is their reliance on matching reasoning patterns from training data, instead of proactively selecting the most appropriate cognitive strategy to solve a…

Computation and Language · Computer Science 2025-03-18 Qin Liu, Wenxuan Zhou, Nan Xu, James Y. Huang, Fei Wang, Sheng Zhang, Hoifung Poon, Muhao Chen

Entity resolution (ER; also known as record linkage or de-duplication) is the process of merging noisy databases, often in the absence of unique identifiers. A major advancement in ER methodology has been the application of Bayesian…
