English
Related papers

Related papers: Scalable Initialization Methods for Large-Scale Cl…

200 papers

K-means is undoubtedly the most widely used partitional clustering algorithm. Unfortunately, due to its gradient descent nature, this algorithm is highly sensitive to the initial placement of the cluster centers. Numerous initialization…

Machine Learning · Computer Science 2012-09-11 M. Emre Celebi , Hassan A. Kingravi , Patricio A. Vela

We propose k^2-means, a new clustering method which efficiently copes with large numbers of clusters and achieves low energy solutions. k^2-means builds upon the standard k-means (Lloyd's algorithm) and combines a new strategy to accelerate…

Machine Learning · Computer Science 2016-05-31 Eirikur Agustsson , Radu Timofte , Luc Van Gool

Over the past five decades, k-means has become the clustering algorithm of choice in many application domains primarily due to its simplicity, time/space efficiency, and invariance to the ordering of the data points. Unfortunately, the…

Machine Learning · Computer Science 2014-09-16 M. Emre Celebi , Hassan A. Kingravi

K-means is undoubtedly the most widely used partitional clustering algorithm. Unfortunately, due to its gradient descent nature, this algorithm is highly sensitive to the initial placement of the cluster centers. Numerous initialization…

Machine Learning · Computer Science 2013-04-30 M. Emre Celebi , Hassan A. Kingravi

The K-means algorithm is arguably the most popular data clustering method, commonly applied to processed datasets in some "feature spaces", as is in spectral clustering. Highly sensitive to initializations, however, K-means encounters a…

Machine Learning · Computer Science 2019-06-04 Feiyu Chen , Yuchen Yang , Liwei Xu , Taiping Zhang , Yin Zhang

K-means is one of the most widely used clustering algorithms in various disciplines, especially for large datasets. However the method is known to be highly sensitive to initial seed selection of cluster centers. K-means++ has been proposed…

Machine Learning · Computer Science 2016-04-19 Fouad Khan

This paper introduces a novel K-means clustering algorithm, an advancement on the conventional Big-means methodology. The proposed method efficiently integrates parallel processing, stochastic sampling, and competitive optimization to…

Machine Learning · Computer Science 2024-03-28 Rustam Mussabayev , Ravil Mussabayev

This paper presents a comparative analysis of different optimization techniques for the K-means algorithm in the context of big data. K-means is a widely used clustering algorithm, but it can suffer from scalability issues when dealing with…

Machine Learning · Computer Science 2024-05-21 Ravil Mussabayev , Rustam Mussabayev

K-Means is one of the most used algorithms for data clustering and the usual clustering method for benchmarking. Despite its wide application it is well-known that it suffers from a series of disadvantages; it is only able to find local…

Machine Learning · Computer Science 2021-03-02 Avgoustinos Vouros , Stephen Langdell , Mike Croucher , Eleni Vasilaki

Over half a century old and showing no signs of aging, k-means remains one of the most popular data processing algorithms. As is well-known, a proper initialization of k-means is crucial for obtaining a good final solution. The recently…

Databases · Computer Science 2012-03-30 Bahman Bahmani , Benjamin Moseley , Andrea Vattani , Ravi Kumar , Sergei Vassilvitskii

The K-means algorithm is a widely used clustering algorithm that offers simplicity and efficiency. However, the traditional K-means algorithm uses the random method to determine the initial cluster centers, which make clustering results…

Machine Learning · Computer Science 2019-11-28 Jie Yang , Yu-Kai Wang , Xin Yao , Chin-Teng Lin

We examine methods for clustering in high dimensions. In the first part of the paper, we perform an experimental comparison between three batch clustering algorithms: the Expectation-Maximization (EM) algorithm, a winner take all version of…

Machine Learning · Computer Science 2015-05-19 Marina Meila , David Heckerman

This paper presents two novel deterministic initialization procedures for K-means clustering based on a modified crowding distance. The procedures, named CKmeans and FCKmeans, use more crowded points as initial centroids. Experimental…

Machine Learning · Computer Science 2023-05-02 Abdesslem Layeb

Among all the partition based clustering algorithms K-means is the most popular and well known method. It generally shows impressive results even in considerably large data sets. The computational complexity of K-means does not suffer from…

Machine Learning · Computer Science 2009-12-22 Samarjeet Borah , Mrinal Kanti Ghose

K-means is an effective clustering technique used to separate similar data into groups based on initial centroids of clusters. In this paper, Normalization based K-means clustering algorithm(N-K means) is proposed. Proposed N-K means…

Machine Learning · Computer Science 2015-03-04 Deepali Virmani , Shweta Taneja , Geetika Malhotra

K-means plays a vital role in data mining and is the simplest and most widely used algorithm under the Euclidean Minimum Sum-of-Squares Clustering (MSSC) model. However, its performance drastically drops when applied to vast amounts of…

Machine Learning · Computer Science 2023-11-27 Rustam Mussabayev , Nenad Mladenovic , Bassem Jarboui , Ravil Mussabayev

One of the applications of center-based clustering algorithms such as K-Means is partitioning data points into K clusters. In some examples, the feature space relates to the underlying problem we are trying to solve, and sometimes we can…

Machine Learning · Computer Science 2020-09-23 Ali Hassani , Amir Iranmanesh , Mahdi Eftekhari , Abbas Salemi

The analysis of continously larger datasets is a task of major importance in a wide variety of scientific fields. In this sense, cluster analysis algorithms are a key element of exploratory data analysis, due to their easiness in the…

Machine Learning · Statistics 2018-01-10 Marco Capó , Aritz Pérez , Jose A. Lozano

Clustering is a key task in machine learning, with $k$-means being widely used for its simplicity and effectiveness. While 1D clustering is common, existing methods often fail to exploit the structure of 1D data, leading to inefficiencies.…

Data Structures and Algorithms · Computer Science 2024-12-25 Jake Hyun

The $k$-means algorithm is a prevalent clustering method due to its simplicity, effectiveness, and speed. However, its main disadvantage is its high sensitivity to the initial positions of the cluster centers. The global $k$-means is a…

Machine Learning · Computer Science 2023-07-17 Georgios Vardakas , Aristidis Likas
‹ Prev 1 2 3 10 Next ›