Related papers: Improving the K-means algorithm using improved dow…
The K-means algorithm is one of the most widely studied clustering algorithms in machine learning. While extensive research has focused on its ability to achieve a globally optimal solution, there still lacks a rigorous analysis of its…
Clustering is one of the widely used techniques to find out patterns from a dataset that can be applied in different applications or analyses. K-means, the most popular and simple clustering algorithm, might get trapped into local minima if…
The k-means clustering algorithm is a popular algorithm that partitions data into k clusters. There are many improvements to accelerate the standard algorithm. Most current research employs upper and lower bounds on point-to-cluster…
Among all the partition based clustering algorithms K-means is the most popular and well known method. It generally shows impressive results even in considerably large data sets. The computational complexity of K-means does not suffer from…
In this paper we combine the k-means and/or k-means type algorithms with a hill climbing algorithm in stages to solve the joint stratification and sample allocation problem. This is a combinatorial optimisation problem in which we search…
The analysis of continously larger datasets is a task of major importance in a wide variety of scientific fields. In this sense, cluster analysis algorithms are a key element of exploratory data analysis, due to their easiness in the…
K-means is one of the most widely used clustering algorithms in various disciplines, especially for large datasets. However the method is known to be highly sensitive to initial seed selection of cluster centers. K-means++ has been proposed…
Minimum sum-of-squares clustering (MSSC) is a widely used clustering model, of which the popular K-means algorithm constitutes a local minimizer. It is well known that the solutions of K-means can be arbitrarily distant from the true MSSC…
K-means plays a vital role in data mining and is the simplest and most widely used algorithm under the Euclidean Minimum Sum-of-Squares Clustering (MSSC) model. However, its performance drastically drops when applied to vast amounts of…
The popular K-means clustering algorithm potentially suffers from a major weakness for further analysis or interpretation. Some cluster may have disproportionately more (or fewer) points from one of the subpopulations in terms of some…
The $k$-means algorithm is a prevalent clustering method due to its simplicity, effectiveness, and speed. However, its main disadvantage is its high sensitivity to the initial positions of the cluster centers. The global $k$-means is a…
K-means is undoubtedly the most widely used partitional clustering algorithm. Unfortunately, due to its gradient descent nature, this algorithm is highly sensitive to the initial placement of the cluster centers. Numerous initialization…
Kernel-based clustering algorithms have the ability to capture the non-linear structure in real world data. Among various kernel-based clustering algorithms, kernel k-means has gained popularity due to its simple iterative nature and ease…
K-means is one of the most widely used algorithms for clustering in Data Mining applications, which attempts to minimize the sum of the square of the Euclidean distance of the points in the clusters from the respective means of the…
We propose a simple and efficient clustering method for high-dimensional data with a large number of clusters. Our algorithm achieves high-performance by evaluating distances of datapoints with a subset of the cluster centres. Our…
K-means (MacQueen, 1967) [1] is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem. The procedure follows a simple and easy way to classify a given data set to a predefined, say K number of…
k-means has recently been recognized as one of the best algorithms for clustering unsupervised data. Since k-means depends mainly on distance calculation between all data points and the centers, the time cost will be high when the size of…
K-Means algorithm is a popular clustering method. However, it has two limitations: 1) it gets stuck easily in spurious local minima, and 2) the number of clusters k has to be given a priori. To solve these two issues, a multi-prototypes…
K-means clustering, a classic and widely-used clustering technique, is known to exhibit suboptimal performance when applied to non-linearly separable data. Numerous adjustments and modifications have been proposed to address this issue,…
Data clustering is a process of arranging similar data into groups. A clustering algorithm partitions a data set into several groups such that the similarity within a group is better than among groups. In this paper a hybrid clustering…