Related papers: Distributed Clustering based on Distributional Ker…

Distribution-Based Trajectory Clustering

Trajectory clustering enables the discovery of common patterns in trajectory data. Current methods of trajectory clustering rely on a distance measure between two points in order to measure the dissimilarity between two trajectories. The…

Artificial Intelligence · Computer Science 2023-10-31 Zi Jing Wang, Ye Zhu, Kai Ming Ting

Distributed Clustering Algorithm for Spatial Data Mining

Distributed data mining techniques, and mainly distributed clustering, have been widely used in the last decade because they deal with very large and heterogeneous datasets which cannot be gathered centrally. Current distributed clustering…

Databases · Computer Science 2018-02-02 Malika Bendechache, M-Tahar Kechadi

Distributed k-Means and k-Median Clustering on General Topologies

This paper provides new algorithms for distributed clustering for two popular center-based objectives, k-median and k-means. These algorithms have provable guarantees and improve communication complexity over existing approaches. Following…

Machine Learning · Computer Science 2020-01-28 Maria Florina Balcan, Steven Ehrlich, Yingyu Liang

Estimating the Optimal Number of Clusters in Categorical Data Clustering by Silhouette Coefficient

The problem of estimating the number of clusters (say k) is one of the major challenges for the partitional clustering. This paper proposes an algorithm named k-SCC to estimate the optimal k in categorical data clustering. For the…

Machine Learning · Computer Science 2025-01-28 Duy-Tai Dinh, Tsutomu Fujinami, Van-Nam Huynh

Hashing-Based Distributed Clustering for Massive High-Dimensional Data

Clustering analysis is of substantial significance for data mining. The properties of big data raise higher demand for more efficient and economical distributed clustering methods. However, existing distributed clustering methods mainly…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-07-03 Yifeng Xiao, Jiang Xue, Deyu Meng

Distributional Clustering: A distribution-preserving clustering method

One key use of k-means clustering is to identify cluster prototypes which can serve as representative points for a dataset. However, a drawback of using k-means cluster centers as representative points is that such points distort the…

Machine Learning · Statistics 2019-11-15 Arvind Krishna, Simon Mak, Roshan Joseph

Scalable Kernel Clustering: Approximate Kernel k-means

Kernel-based clustering algorithms have the ability to capture the non-linear structure in real world data. Among various kernel-based clustering algorithms, kernel k-means has gained popularity due to its simple iterative nature and ease…

Computer Vision and Pattern Recognition · Computer Science 2014-02-18 Radha Chitta, Rong Jin, Timothy C. Havens, Anil K. Jain

Distributed Algorithms for Finding Local Clusters Using Heat Kernel Pagerank

A distributed algorithm performs local computations on pieces of input and communicates the results through given communication links. When processing a massive graph in a distributed algorithm, local outputs must be configured as a…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-12-06 Fan Chung, Olivia Simpson

On Heterogeneous Coded Distributed Computing

We consider the recently proposed Coded Distributed Computing (CDC) framework that leverages carefully designed redundant computations to enable coding opportunities that substantially reduce the communication load of distributed computing.…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-09-04 Mehrdad Kiamari, Chenwei Wang, A. Salman Avestimehr

Fast Distributed k-Means with a Small Number of Rounds

We propose a new algorithm for k-means clustering in a distributed setting, where the data is distributed across many machines, and a coordinator communicates with these machines to calculate the output clustering. Our algorithm guarantees…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-11-14 Tom Hess, Ron Visbord, Sivan Sabato

Quartile Clustering: A quartile based technique for Generating Meaningful Clusters

Clustering is one of the main tasks in exploratory data analysis and descriptive statistics where the main objective is partitioning observations in groups. Clustering has a broad range of application in varied domains like climate,…

Databases · Computer Science 2012-03-20 Saptarsi Goswami, Amlan Chakrabarti

On Distributed Algorithms for Cost-Efficient Data Center Placement in Cloud Computing

The increasing popularity of cloud computing has resulted in a proliferation of data centers. Effective placement of data centers improves network performance and minimizes clients' perceived latency. The problem of determining the optimal…

Networking and Internet Architecture · Computer Science 2018-02-06 Wuqiong Luo, Wee Peng Tay, Peng Sun, Yonggang Wen

Kernel Spectral Clustering and applications

In this chapter we review the main literature related to kernel spectral clustering (KSC), an approach to clustering cast within a kernel-based optimization setting. KSC represents a least-squares support vector machine based formulation of…

Machine Learning · Computer Science 2015-05-05 Rocco Langone, Raghvendra Mall, Carlos Alzate, Johan A. K. Suykens

Clustering evolving data using kernel-based methods

In this thesis, we propose several modelling strategies to tackle evolving data in different contexts. In the framework of static clustering, we start by introducing a soft kernel spectral clustering (SKSC) algorithm, which can better deal…

Social and Information Networks · Computer Science 2014-11-24 Rocco Langone

Quantile-based clustering

A new cluster analysis method, $K$-quantiles clustering, is introduced. $K$-quantiles clustering can be computed by a simple greedy algorithm in the style of the classical Lloyd's algorithm for $K$-means. It can be applied to large and…

Methodology · Statistics 2019-11-12 Christian Hennig, Cinzia Viroli, Laura Anderlucci

Robust Clustering using Hyperdimensional Computing

This paper addresses the clustering of data in the hyperdimensional computing (HDC) domain. In prior work, an HDC-based clustering framework, referred to as HDCluster, has been proposed. However, the performance of the existing HDCluster is…

Machine Learning · Computer Science 2024-04-19 Lulu Ge, Keshab K. Parhi

Distributed Kernel K-Means for Large Scale Clustering

Clustering samples according to an effective metric and/or vector space representation is a challenging unsupervised learning task with a wide spectrum of applications. Among several clustering algorithms, k-means and its kernelized version…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-10-10 Marco Jacopo Ferrarotti, Sergio Decherchi, Walter Rocchia

K-expectiles clustering

$K$-means clustering is one of the most widely-used partitioning algorithms in cluster analysis due to its simplicity and computational efficiency. However, $K$-means does not provide an appropriate clustering result when applying to data…

Methodology · Statistics 2021-03-18 Bingling Wang, Yinxing Li, Wolfgang Karl Härdle

Kernel K-means clustering of distributional data

We consider the problem of clustering a sample of probability distributions from a random distribution on $\mathbb R^p$. Our proposed partitioning method makes use of a symmetric, positive-definite kernel $k$ and its associated reproducing…

Machine Learning · Statistics 2025-09-23 Amparo Baíllo, Jose R. Berrendero, Martín Sánchez-Signorini

Efficient Large Scale Clustering based on Data Partitioning

Clustering techniques are very attractive for extracting and identifying patterns in datasets. However, their application to very large spatial datasets presents numerous challenges such as high-dimensionality data, heterogeneity, and high…

Databases · Computer Science 2018-02-27 Malika Bendechache, Nhien-An Le-Khac, M-Tahar Kechadi