Related papers: Robust Group Linkage

200 papers

Record linkage is the process of identifying records that refer to the same entities from several databases. This process is challenging because commonly no unique entity identifiers are available. Linkage therefore has to rely on partially…

Databases · Computer Science 2016-12-14 Peter Christen

Record linkage is an essential part of nearly all real-world systems that consume structured and unstructured data coming from different sources. Typically no common key is available for connecting records. Massive data cleaning and data…

Databases · Computer Science 2019-09-30 Thomas Gschwind, Christoph Miksovic, Julian Minder, Katsiaryna Mirylenka, Paolo Scotton

Record linkage is aimed at the accurate and efficient identification of records that represent the same entity within or across disparate databases. It is a fundamental task in data integration and increasingly required for accurate…

Databases · Computer Science 2021-04-21 Thilina Ranbaduge, Peter Christen, Rainer Schnell

Accurate and efficient record linkage is an open challenge of particular relevance to Australian Government Agencies, who recognise that so-called wicked social problems are best tackled by forming partnerships founded on large-scale data…

Databases · Computer Science 2018-02-01 Yuhang Zhang, Tania Churchill, Kee Siong Ng

We propose an unsupervised approach for linking records across arbitrarily many files, while simultaneously detecting duplicate records within files. Our key innovation involves the representation of the pattern of links between records as…

Methodology · Statistics 2015-11-03 Rebecca C. Steorts, Rob Hall, Stephen E. Fienberg

Many systems can be described using graphs, or networks. Detecting communities in these networks can provide information about the underlying structure and functioning of the original systems. Yet this detection is a complex task and a…

Data Structures and Algorithms · Computer Science 2013-02-06 Erwan Le Martelot, Chris Hankin

To address the shortcomings of real-world datasets, robust learning algorithms have been designed to overcome arbitrary and indiscriminate data corruption. However, practical processes of gathering data may lead to patterns of data…

Machine Learning · Computer Science 2024-05-02 Lunjia Hu, Charlotte Peale, Judy Hanwen Shen

Standard agglomerative clustering suggests establishing a new reliable linkage at every step. However, in order to provide adaptive, density-consistent and flexible solutions, we study extracting all the reliable linkages at each step,…

Machine Learning · Computer Science 2023-01-02 Morteza Haghir Chehreghani

In record linkage (RL), or exact file matching, the goal is to identify the links between entities with information on two or more files. RL is an important activity in areas including counting the population, enhancing survey frames and…

Statistics Theory · Mathematics 2012-12-21 Michael D. Larsen

Consider the following problem: given a database of records indexed by names (e.g., name of companies, restaurants, businesses, or universities) and a new name, determine whether the new name is in the database, and if so, which record it…

Databases · Computer Science 2018-06-29 Bahare Fatemi, Seyed Mehran Kazemi, David Poole

Graph algorithms are central to large-scale applications such as navigation systems, social networks, and data analysis platforms. This thesis studies two important challenges in such systems: robustness to failures and fairness in…

Data Structures and Algorithms · Computer Science 2026-05-21 Kushagra Chatterjee

The link prediction problem has become increasingly prominent in many domains such as social network analyses, bioinformatics experiments, transportation networks, criminal investigations and so forth. A variety of techniques has been developed…

Artificial Intelligence · Computer Science 2023-05-18 Safiye Ghasemi, Amin Zarei

There has been substantial recent interest in record linkage, attempting to group the records pertaining to the same entities from a large database lacking unique identifiers. This can be viewed as a type of "microclustering," with few…

Statistics Theory · Mathematics 2017-03-16 James E. Johndrow, Kristian Lum, David B. Dunson

Entity alignment has always had significant uses within a multitude of diverse scientific fields. In particular, the concept of matching entities across networks has grown in significance in the world of social science as communicative…

Social and Information Networks · Computer Science 2020-04-21 James Flamino, Christopher Abriola, Ben Zimmerman, Zhongheng Li, Joel Douglas

Community detection, which focuses on clustering vertex interactions, plays a significant role in network analysis. However, it also faces numerous challenges like missing data and adversarial attack. How to further improve the performance…

Social and Information Networks · Computer Science 2021-07-02 Jiajun Zhou, Zhi Chen, Min Du, Lihong Chen, Shanqing Yu, Guanrong Chen, Qi Xuan

Clustering algorithms aim to organize data into groups or clusters based on the inherent patterns and similarities within the data. They play an important role in today's life, such as in marketing and e-commerce, healthcare, data…

Machine Learning · Computer Science 2024-01-17 Hui Yin, Amir Aryani, Stephen Petrie, Aishwarya Nambissan, Aland Astudillo, Shengyuan Cao

A significant problem in the analysis of complex networks is to reveal community structure, in which network nodes are tightly connected in the same communities, between which there are sparse connections. Previous algorithms for community…

Physics and Society · Physics 2018-04-25 Jingming Zhang, Jianjun Cheng, Xing Su, Xinhong Yin, Shiyan Zhao, Xiaoyun Chen

Understanding the similar properties of people involved in group search sessions has the potential to significantly improve collaborative search systems; such systems could be enhanced by information retrieval algorithms and user interface…

Information Retrieval · Computer Science 2009-08-06 Meredith Ringel Morris, Jaime Teevan

In this paper, we focus on exploiting the group structure for large-dimensional factor models, which captures the homogeneous effects of common factors on individuals within the same group. In view of the fact that datasets in…

Methodology · Statistics 2024-05-14 Yong He, Xiaoyang Ma, Xingheng Wang, Yalin Wang

Data Linkage is an important step that can provide valuable insights for evidence-based decision making, especially for crucial events. Performing sensible queries across heterogeneous databases containing millions of records is a complex…

Databases · Computer Science 2015-10-09 Mohammed Gollapalli