Related papers: Improving the K-means algorithm using improved dow…

Modified K-means Algorithm with Local Optimality Guarantees

The K-means algorithm is one of the most widely studied clustering algorithms in machine learning. While extensive research has focused on its ability to achieve a globally optimal solution, there is still a lack of rigorous analysis of its…

Machine Learning · Computer Science 2025-06-12 Mingyi Li, Michael R. Metel, Akiko Takeda

An enhanced method of initial cluster center selection for K-means algorithm

Clustering is one of the widely used techniques to find out patterns from a dataset that can be applied in different applications or analyses. K-means, the most popular and simple clustering algorithm, might get trapped into local minima if…

Machine Learning · Computer Science 2022-10-19 Zillur Rahman, Md. Sabir Hossain, Mohammad Hasan, Ahmed Imteaj

Accelerating k-Means Clustering with Cover Trees

The k-means clustering algorithm is a popular algorithm that partitions data into k clusters. There are many improvements to accelerate the standard algorithm. Most current research employs upper and lower bounds on point-to-cluster…

Machine Learning · Computer Science 2024-10-22 Andreas Lang, Erich Schubert

Performance Analysis of AIM-K-means & K-means in Quality Cluster Generation

Among all the partition based clustering algorithms K-means is the most popular and well known method. It generally shows impressive results even in considerably large data sets. The computational complexity of K-means does not suffer from…

Machine Learning · Computer Science 2009-12-22 Samarjeet Borah, Mrinal Kanti Ghose

Combining K-means type algorithms with Hill Climbing for Joint Stratification and Sample Allocation Designs

In this paper we combine the k-means and/or k-means type algorithms with a hill climbing algorithm in stages to solve the joint stratification and sample allocation problem. This is a combinatorial optimisation problem in which we search…

Machine Learning · Statistics 2021-08-19 Mervyn O'Luing, Steven Prestwich, S. Armagan Tarim

An efficient K-means clustering algorithm for massive data

The analysis of continuously larger datasets is a task of major importance in a wide variety of scientific fields. In this sense, cluster analysis algorithms are a key element of exploratory data analysis, due to their easiness in the…

Machine Learning · Statistics 2018-01-10 Marco Capó, Aritz Pérez, Jose A. Lozano

An Initial Seed Selection Algorithm for K-means Clustering of Georeferenced Data to Improve Replicability of Cluster Assignments for Mapping Application

K-means is one of the most widely used clustering algorithms in various disciplines, especially for large datasets. However the method is known to be highly sensitive to initial seed selection of cluster centers. K-means++ has been proposed…

Machine Learning · Computer Science 2016-04-19 Fouad Khan

HG-means: A scalable hybrid genetic algorithm for minimum sum-of-squares clustering

Minimum sum-of-squares clustering (MSSC) is a widely used clustering model, of which the popular K-means algorithm constitutes a local minimizer. It is well known that the solutions of K-means can be arbitrarily distant from the true MSSC…

Machine Learning · Computer Science 2018-12-21 Daniel Gribel, Thibaut Vidal

How to Use K-means for Big Data Clustering?

K-means plays a vital role in data mining and is the simplest and most widely used algorithm under the Euclidean Minimum Sum-of-Squares Clustering (MSSC) model. However, its performance drastically drops when applied to vast amounts of…

Machine Learning · Computer Science 2023-11-27 Rustam Mussabayev, Nenad Mladenovic, Bassem Jarboui, Ravil Mussabayev

A Computational Approach to Improving Fairness in K-means Clustering

The popular K-means clustering algorithm potentially suffers from a major weakness for further analysis or interpretation. Some clusters may have disproportionately more (or fewer) points from one of the subpopulations in terms of some…

Machine Learning · Computer Science 2026-02-10 Guancheng Zhou, Haiping Xu, Hongkang Xu, Chenyu Li, Donghui Yan

Global $k$-means$++$: an effective relaxation of the global $k$-means clustering algorithm

The $k$-means algorithm is a prevalent clustering method due to its simplicity, effectiveness, and speed. However, its main disadvantage is its high sensitivity to the initial positions of the cluster centers. The global $k$-means is a…

Machine Learning · Computer Science 2023-07-17 Georgios Vardakas, Aristidis Likas

A Comparative Study of Efficient Initialization Methods for the K-Means Clustering Algorithm

K-means is undoubtedly the most widely used partitional clustering algorithm. Unfortunately, due to its gradient descent nature, this algorithm is highly sensitive to the initial placement of the cluster centers. Numerous initialization…

Machine Learning · Computer Science 2012-09-11 M. Emre Celebi, Hassan A. Kingravi, Patricio A. Vela

Scalable Kernel Clustering: Approximate Kernel k-means

Kernel-based clustering algorithms have the ability to capture the non-linear structure in real world data. Among various kernel-based clustering algorithms, kernel k-means has gained popularity due to its simple iterative nature and ease…

Computer Vision and Pattern Recognition · Computer Science 2014-02-18 Radha Chitta, Rong Jin, Timothy C. Havens, Anil K. Jain

Seeding K-Means using Method of Moments

K-means is one of the most widely used algorithms for clustering in Data Mining applications, which attempts to minimize the sum of the square of the Euclidean distance of the points in the clusters from the respective means of the…

Machine Learning · Computer Science 2016-11-01 Sayantan Dasgupta

A sampling-based approach for efficient clustering in large datasets

We propose a simple and efficient clustering method for high-dimensional data with a large number of clusters. Our algorithm achieves high-performance by evaluating distances of datapoints with a subset of the cluster centres. Our…

Machine Learning · Computer Science 2022-03-30 Georgios Exarchakis, Omar Oubari, Gregor Lenz

K+ Means: An Enhancement Over K-Means Clustering Algorithm

K-means (MacQueen, 1967) [1] is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem. The procedure follows a simple and easy way to classify a given data set to a predefined, say K number of…

Machine Learning · Computer Science 2017-06-23 Srikanta Kolay, Kumar Sankar Ray, Abhoy Chand Mondal

Fast k-means algorithm clustering

k-means has recently been recognized as one of the best algorithms for clustering unsupervised data. Since k-means depends mainly on distance calculation between all data points and the centers, the time cost will be high when the size of…

Data Structures and Algorithms · Computer Science 2011-08-08 Raied Salman, Vojislav Kecman, Qi Li, Robert Strack, Erik Test

Multi-Prototypes Convex Merging Based K-Means Clustering Algorithm

K-Means algorithm is a popular clustering method. However, it has two limitations: 1) it gets stuck easily in spurious local minima, and 2) the number of clusters k has to be given a priori. To solve these two issues, a multi-prototypes…

Machine Learning · Computer Science 2023-02-15 Dong Li, Shuisheng Zhou, Tieyong Zeng, Raymond H. Chan

CavMerge: Merging K-means Based on Local Log-Concavity

K-means clustering, a classic and widely-used clustering technique, is known to exhibit suboptimal performance when applied to non-linearly separable data. Numerous adjustments and modifications have been proposed to address this issue,…

Methodology · Statistics 2026-04-07 Zhili Qiao, Wangqian Ju, Peng Liu

A hybrid clustering algorithm for data mining

Data clustering is a process of arranging similar data into groups. A clustering algorithm partitions a data set into several groups such that the similarity within a group is better than among groups. In this paper a hybrid clustering…

Databases · Computer Science 2012-05-25 Ravindra Jain