Related papers: DPM: Clustering Sensitive Data through Separation

Privacy-Preserving Vertical K-Means Clustering

Clustering is a fundamental data processing task used for grouping records based on one or more features. In the vertically partitioned setting, data is distributed among entities, with each holding only a subset of those features. A key…

Cryptography and Security · Computer Science 2025-04-11 Federico Mazzone, Trevor Brown, Florian Kerschbaum, Kevin H. Wilson, Maarten Everts, Florian Hahn, Andreas Peter

Differentially Private Explanations for Clusters

The dire need to protect sensitive data has led to various flavors of privacy definitions. Among these, differential privacy (DP) is considered one of the most rigorous and secure notions of privacy, enabling data analysis while preserving…

Cryptography and Security · Computer Science 2025-06-09 Amir Gilad, Tova Milo, Kathy Razmadze, Ron Zadicario

Differentially-Private Clustering of Easy Instances

Clustering is a fundamental problem in data analysis. In differentially private clustering, the goal is to identify $k$ cluster centers without disclosing information on individual data points. Despite significant research progress, the…

Machine Learning · Computer Science 2021-12-30 Edith Cohen, Haim Kaplan, Yishay Mansour, Uri Stemmer, Eliad Tsfadia

Differentially Private k-Means Clustering with Guaranteed Convergence

Iterative clustering algorithms help us to learn the insights behind the data. Unfortunately, this may allow adversaries to infer the privacy of individuals with some background knowledge. In the worst case, the adversaries know the…

Cryptography and Security · Computer Science 2022-04-05 Zhigang Lu, Hong Shen

Differentially Private Clustering in Data Streams

Clustering problems (such as $k$-means and $k$-median) are fundamental unsupervised machine learning primitives, and streaming clustering algorithms have been extensively studied in the past. However, since data privacy becomes a central…

Data Structures and Algorithms · Computer Science 2025-10-03 Alessandro Epasto, Tamalika Mukherjee, Peilin Zhong

Differentially Private Federated $k$-Means Clustering with Server-Side Data

Clustering is a cornerstone of data analysis that is particularly suited to identifying coherent subgroups or substructures in unlabeled data, as are generated continuously in large amounts these days. However, in many cases traditional…

Cryptography and Security · Computer Science 2025-06-12 Jonathan Scott, Christoph H. Lampert, David Saulpic

Clustering Mixed Numeric and Categorical Data: A Cluster Ensemble Approach

Clustering is a widely used technique in data mining applications for discovering patterns in underlying data. Most traditional clustering algorithms are limited to handling datasets that contain either numeric or categorical attributes.…

Artificial Intelligence · Computer Science 2007-05-23 Zengyou He, Xiaofei Xu, Shengchun Deng

Privacy Preserving K-Means Clustering: A Secure Multi-Party Computation Approach

Knowledge discovery is one of the main goals of Artificial Intelligence. This knowledge is usually stored in databases spread across different environments, making it a tedious (or impossible) task to access and extract data from them. To this…

Machine Learning · Computer Science 2020-09-23 Daniel Hurtado Ramírez, J. M. Auñón

Differentially Private Clustering via Maximum Coverage

This paper studies the problem of clustering in metric spaces while preserving the privacy of individual data. Specifically, we examine differentially private variants of the k-medians and Euclidean k-means problems. We present polynomial…

Data Structures and Algorithms · Computer Science 2020-08-31 Matthew Jones, Huy Lê Nguyen, Thy Nguyen

Privacy Preserving Multi-Server k-means Computation over Horizontally Partitioned Data

The k-means clustering is one of the most popular clustering algorithms in data mining. Recently a lot of research has been concentrated on the algorithm when the dataset is divided into multiple parties or when the dataset is too large to…

Cryptography and Security · Computer Science 2019-07-02 Riddhi Ghosal, Sanjit Chatterjee

Privacy-Preserving Optimal Parameter Selection for Collaborative Clustering

This study investigates the optimal selection of parameters for collaborative clustering while ensuring data privacy. We focus on key clustering algorithms within a collaborative framework, where multiple data owners combine their data. A…

Machine Learning · Computer Science 2024-06-11 Maryam Ghasemian, Erman Ayday

k-Means SubClustering: A Differentially Private Algorithm with Improved Clustering Quality

In today's data-driven world, the sensitivity of information has been a significant concern. With this data and additional information on the person's background, one can easily infer an individual's private data. Many differentially…

Machine Learning · Computer Science 2023-01-10 Devvrat Joshi, Janvi Thakkar

Utility-efficient Differentially Private K-means Clustering based on Cluster Merging

Differential privacy is widely used in data analysis. State-of-the-art $k$-means clustering algorithms with differential privacy typically add an equal amount of noise to centroids for each iterative computation. In this paper, we propose a…

Cryptography and Security · Computer Science 2020-10-06 Tianjiao Ni, Minghao Qiao, Zhili Chen, Shun Zhang, Hong Zhong

On the Price of Differential Privacy for Hierarchical Clustering

Hierarchical clustering is a fundamental unsupervised machine learning task with the aim of organizing data into a hierarchy of clusters. Many applications of hierarchical clustering involve sensitive user information, therefore motivating…

Data Structures and Algorithms · Computer Science 2025-04-23 Chengyuan Deng, Jie Gao, Jalaj Upadhyay, Chen Wang, Samson Zhou

WaveCluster with Differential Privacy

WaveCluster is an important family of grid-based clustering algorithms that are capable of finding clusters of arbitrary shapes. In this paper, we investigate techniques to perform WaveCluster while ensuring differential privacy. Our goal…

Databases · Computer Science 2015-08-04 Ling Chen, Ting Yu, Rada Chirkova

Locating a Small Cluster Privately

We present a new algorithm for locating a small cluster of points with differential privacy [Dwork, McSherry, Nissim, and Smith, 2006]. Our algorithm has implications to private data exploration, clustering, and removal of outliers.…

Data Structures and Algorithms · Computer Science 2017-03-14 Kobbi Nissim, Uri Stemmer, Salil Vadhan

Improving the Variance of Differentially Private Randomized Experiments through Clustering

Estimating causal effects from randomized experiments is only possible if participants are willing to disclose their potentially sensitive responses. Differential privacy, a widely used framework for ensuring an algorithm's privacy…

Machine Learning · Statistics 2025-05-29 Adel Javanmard, Vahab Mirrokni, Jean Pouget-Abadie

Deep Continuous Clustering

Clustering high-dimensional datasets is hard because interpoint distances become less informative in high-dimensional spaces. We present a clustering algorithm that performs nonlinear dimensionality reduction and clustering jointly. The…

Machine Learning · Computer Science 2018-03-06 Sohil Atul Shah, Vladlen Koltun

Improving the Utility of Differentially Private Clustering through Dynamical Processing

This study aims to alleviate the trade-off between utility and privacy of differentially private clustering. Existing works focus on simple methods, which show poor performance for non-convex clusters. To fit complex cluster distributions,…

Machine Learning · Computer Science 2024-08-23 Junyoung Byun, Yujin Choi, Jaewook Lee

Achieving Data Utility-Privacy Tradeoff in Internet of Medical Things: A Machine Learning Approach

The emergence and rapid development of the Internet of Medical Things (IoMT), an application of the Internet of Things into the medical and healthcare systems, have brought many changes and challenges to modern medical and healthcare…

Cryptography and Security · Computer Science 2019-02-11 Zhitao Guan, Zefang Lv, Xiaojiang Du, Longfei Wu, Mohsen Guizani