Related papers: Distribution-Agnostic Database De-Anonymization Un…
Database de-anonymization typically involves matching an anonymized database with correlated publicly available data. Existing research focuses either on practical aspects without requiring knowledge of the data distribution yet provides…
The re-identification or de-anonymization of users from anonymized data through matching with publicly-available correlated user data has raised privacy concerns, leading to the complementary measure of obfuscation in addition to…
De-anonymizing user identities by matching various forms of user data available on the internet raises privacy concerns. A fundamental understanding of the privacy leakage in such scenarios requires a careful study of conditions under which…
The re-identification or de-anonymization of users from anonymized data through matching with publicly available correlated user data has raised privacy concerns, leading to the complementary measure of obfuscation in addition to…
The de-anonymization of users from anonymized microdata through matching or aligning with publicly-available correlated databases has been of scientific interest recently. While most of the rigorous analyses of database matching have…
It is important to study the risks of publishing privacy-sensitive data. Even if sensitive identities (e.g., name, social security number) were removed and advanced data perturbation techniques were applied, several de-anonymization attacks…
We address the problem of social network de-anonymization when relationships between people are described by scale-free graphs. In particular, we propose a rigorous, asymptotic mathematical analysis of the network de-anonymization problem…
We consider a nonparametric sequential hypothesis testing problem when the distribution under the null hypothesis is fully known but the alternate hypothesis corresponds to some other unknown distribution with some loose constraints. We…
In this paper, matching of correlated high-dimensional databases is investigated. A stochastic database model is considered where the correlation among the database entries is governed by an arbitrary joint distribution. Concentration of…
Recently, graph matching algorithms have been successfully applied to the problem of network de-anonymization, in which nodes (users) participating in more than one social network are identified only by means of the structure of their links…
Background knowledge is an important factor in privacy-preserving data publishing. Distribution-based background knowledge is one of the well-studied forms of background knowledge. However, to the best of our knowledge, there is no existing work…
Consider the problem where a statistician in a two-node system receives rate-limited information from a transmitter about marginal observations of a memoryless process generated from two possible distributions. Using its own observations,…
We consider the problem of performing community detection on a network, while maintaining privacy, assuming that the adversary has access to an auxiliary correlated network. We ask the question "Does there exist a regime where the network…
We present a generic and automated approach to re-identifying nodes in anonymized social networks which enables novel anonymization techniques to be quickly evaluated. It uses machine learning (decision forests) to match pairs of nodes…
We propose a data-dependent denoising procedure to restore noisy images. Different from existing denoising algorithms which search for patches from either the noisy image or a generic database, the new algorithm finds patches from a…
This paper develops the sufficiency principle suitable for data reduction in decentralized inference systems. Both parallel and tandem networks are studied and we focus on the cases where observations at decentralized nodes are…
Privacy-preserving distributed processing has received considerable attention recently. The main purpose of these algorithms is to solve certain signal processing tasks over a network in a decentralised fashion without revealing…
Motivated by distributed machine learning settings such as Federated Learning, we consider the problem of fitting a statistical model across a distributed collection of heterogeneous data sets whose similarity structure is encoded by a…
Enormous amounts of data collected from social networks or other online platforms are being published for the sake of statistics, marketing, and research, among other objectives. The consequent privacy and data security concerns have…
Data sharing between different organizations is an essential process in today's connected world. However, many concerns about data sharing have been raised recently, as sharing sensitive information can jeopardize users' privacy. To preserve the…