Related papers: Fast k-means algorithm clustering

A sampling-based approach for efficient clustering in large datasets

We propose a simple and efficient clustering method for high-dimensional data with a large number of clusters. Our algorithm achieves high performance by evaluating the distances between datapoints and only a subset of the cluster centres. Our…

Machine Learning · Computer Science 2022-03-30 Georgios Exarchakis , Omar Oubari , Gregor Lenz

Fast Approximate $K$-Means via Cluster Closures

$K$-means, a simple and effective clustering algorithm, is one of the most widely used algorithms in the multimedia and computer vision communities. Traditional $k$-means is an iterative algorithm: in each iteration new cluster centers are…

Computer Vision and Pattern Recognition · Computer Science 2013-12-12 Jingdong Wang , Jing Wang , Qifa Ke , Gang Zeng , Shipeng Li

Fast k-means based on KNN Graph

In the era of big data, k-means clustering has been widely adopted as a basic processing tool in various contexts. However, its computational cost can be prohibitively high when the data size and the number of clusters are large. It is well…

Machine Learning · Computer Science 2017-05-05 Cheng-Hao Deng , Wan-Lei Zhao

Faster K-Means Cluster Estimation

There has been considerable work on improving the popular clustering algorithm K-means in terms of both mean squared error (MSE) and speed. However, most k-means variants tend to compute the distance of each data point to each cluster…

Machine Learning · Computer Science 2017-01-18 Siddhesh Khandelwal , Amit Awekar

Fast Distributed k-Means with a Small Number of Rounds

We propose a new algorithm for k-means clustering in a distributed setting, where the data is distributed across many machines, and a coordinator communicates with these machines to calculate the output clustering. Our algorithm guarantees…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-11-14 Tom Hess , Ron Visbord , Sivan Sabato

A Computational Approach to Improving Fairness in K-means Clustering

The popular K-means clustering algorithm potentially suffers from a major weakness for further analysis or interpretation. Some clusters may have disproportionately more (or fewer) points from one of the subpopulations in terms of some…

Machine Learning · Computer Science 2026-02-10 Guancheng Zhou , Haiping Xu , Hongkang Xu , Chenyu Li , Donghui Yan

An efficient K-means clustering algorithm for massive data

The analysis of continuously larger datasets is a task of major importance in a wide variety of scientific fields. In this sense, cluster analysis algorithms are a key element of exploratory data analysis, due to their easiness in the…

Machine Learning · Statistics 2018-01-10 Marco Capó , Aritz Pérez , Jose A. Lozano

An efficient K-means algorithm for Massive Data

Due to the progressive growth of the amount of data available in a wide variety of scientific fields, it has become more difficult to manipulate and analyze such information. Even though datasets have grown in size, the K-means algorithm…

Machine Learning · Statistics 2016-05-11 Marco Capó , Aritz Pérez , José Antonio Lozano

Seeding K-Means using Method of Moments

K-means is one of the most widely used algorithms for clustering in Data Mining applications, which attempts to minimize the sum of the squared Euclidean distances of the points in the clusters from the respective means of the…

Machine Learning · Computer Science 2016-11-01 Sayantan Dasgupta

K-Splits: Improved K-Means Clustering Algorithm to Automatically Detect the Number of Clusters

This paper introduces k-splits, an improved hierarchical algorithm based on k-means to cluster data without prior knowledge of the number of clusters. K-splits starts from a small number of clusters and uses the most significant data…

Computer Vision and Pattern Recognition · Computer Science 2022-05-25 Seyed Omid Mohammadi , Ahmad Kalhor , Hossein Bodaghi

Improved Performance of Unsupervised Method by Renovated K-Means

Clustering is the separation of data into groups of similar objects. Every group, called a cluster, consists of objects that are similar to one another and dissimilar to objects of other groups. In this paper, the K-Means algorithm is implemented…

Machine Learning · Computer Science 2013-04-03 P. Ashok , G. M Kadhar Nawaz , E. Elayaraja , V. Vadivel

k2-means for fast and accurate large scale clustering

We propose k^2-means, a new clustering method which efficiently copes with large numbers of clusters and achieves low energy solutions. k^2-means builds upon the standard k-means (Lloyd's algorithm) and combines a new strategy to accelerate…

Machine Learning · Computer Science 2016-05-31 Eirikur Agustsson , Radu Timofte , Luc Van Gool

Explainable $k$-Means and $k$-Medians Clustering

Clustering is a popular form of unsupervised learning for geometric data. Unfortunately, many clustering algorithms lead to cluster assignments that are hard to explain, partially because they depend on all the features of the data in a…

Machine Learning · Computer Science 2020-09-23 Sanjoy Dasgupta , Nave Frost , Michal Moshkovitz , Cyrus Rashtchian

Fast Clustering of Categorical Big Data

The K-Modes algorithm, developed for clustering categorical data, is of high algorithmic simplicity but suffers from unreliable performances in clustering quality and clustering efficiency, both heavily influenced by the choice of initial…

Machine Learning · Computer Science 2025-02-18 Bipana Thapaliya , Yu Zhuang

A novel k-means clustering approach using two distance measures for Gaussian data

Clustering algorithms have long been the topic of research, representing the more popular side of unsupervised learning. Since clustering analysis is one of the best ways to find some clarity and structure within raw data, this paper…

Machine Learning · Computer Science 2025-11-25 Naitik Gada

How to Use K-means for Big Data Clustering?

K-means plays a vital role in data mining and is the simplest and most widely used algorithm under the Euclidean Minimum Sum-of-Squares Clustering (MSSC) model. However, its performance drastically drops when applied to vast amounts of…

Machine Learning · Computer Science 2023-11-27 Rustam Mussabayev , Nenad Mladenovic , Bassem Jarboui , Ravil Mussabayev

A new distance measurement and its application in K-Means Algorithm

The K-Means clustering algorithm is one of the most commonly used clustering algorithms because of its simplicity and efficiency. The K-Means clustering algorithm based on Euclidean distance only pays attention to the linear distance between…

Machine Learning · Computer Science 2022-06-13 Yiqun Zhang , Houbiao Li

Quantum Clustering with k-Means: a Hybrid Approach

Quantum computing is a promising paradigm based on quantum theory for performing fast computations. Quantum algorithms are expected to surpass their classical counterparts in terms of computational complexity for certain tasks, including…

Quantum Physics · Physics 2024-02-23 Alessandro Poggiali , Alessandro Berti , Anna Bernasconi , Gianna M. Del Corso , Riccardo Guidotti

Active Distance-Based Clustering using K-medoids

The k-medoids algorithm is a partitional, centroid-based clustering algorithm which uses pairwise distances of data points and tries to directly decompose the dataset with $n$ points into a set of $k$ disjoint clusters. However, k-medoids…

Machine Learning · Computer Science 2015-12-15 Mehrdad Ghadiri , Amin Aghaee , Mahdieh Soleymani Baghshah

Parallelization of the K-Means Algorithm with Applications to Big Data Clustering

K-Means clustering using Lloyd's algorithm is an iterative approach to partition the given dataset into K different clusters. The algorithm assigns each point to a cluster based on the following objective function \[\ \min…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-05-21 Ashish Srivastava , Mohammed Nawfal