Related papers: Pseudo-Centroid Clustering

Balanced k-Means and Min-Cut Clustering

Clustering is an effective technique in data mining to generate groups that are the matter of interest. Among various clustering approaches, the family of k-means algorithms and min-cut algorithms gain most popularity due to their…

Machine Learning · Computer Science 2014-11-25 Xiaojun Chang, Feiping Nie, Zhigang Ma, Yi Yang

Determining Optimal Number of k-Clusters based on Predefined Level-of-Similarity

This paper proposes a centroid-based clustering algorithm which is capable of clustering data-points with n-features, without having to specify the number of clusters to be formed. The core logic behind the algorithm is a similarity…

Machine Learning · Computer Science 2020-10-08 Rabindra Lamsal, Shubham Katiyar

DISCERN: Diversity-based Selection of Centroids for k-Estimation and Rapid Non-stochastic Clustering

One of the applications of center-based clustering algorithms such as K-Means is partitioning data points into K clusters. In some examples, the feature space relates to the underlying problem we are trying to solve, and sometimes we can…

Machine Learning · Computer Science 2020-09-23 Ali Hassani, Amir Iranmanesh, Mahdi Eftekhari, Abbas Salemi

The K-modes algorithm for clustering

Many clustering algorithms exist that estimate a cluster centroid, such as K-means, K-medoids or mean-shift, but no algorithm seems to exist that clusters data by returning exactly K meaningful modes. We propose a natural definition of a…

Machine Learning · Computer Science 2013-04-25 Miguel Á. Carreira-Perpiñán, Weiran Wang

The Laplacian K-modes algorithm for clustering

In addition to finding meaningful clusters, centroid-based clustering algorithms such as K-means or mean-shift should ideally find centroids that are valid patterns in the input space, representative of data in their cluster. This is…

Machine Learning · Computer Science 2014-06-17 Weiran Wang, Miguel Á. Carreira-Perpiñán

Decentralized Clustering on Compressed Data without Prior Knowledge of the Number of Clusters

In sensor networks, it is not always practical to set up a fusion center. Therefore, there is a need for fully decentralized clustering algorithms. Decentralized clustering algorithms should minimize the amount of data exchanged between…

Machine Learning · Statistics 2018-07-13 Elsa Dupraz, Dominique Pastor, François-Xavier Socheleau

K-Splits: Improved K-Means Clustering Algorithm to Automatically Detect the Number of Clusters

This paper introduces k-splits, an improved hierarchical algorithm based on k-means to cluster data without prior knowledge of the number of clusters. K-splits starts from a small number of clusters and uses the most significant data…

Computer Vision and Pattern Recognition · Computer Science 2022-05-25 Seyed Omid Mohammadi, Ahmad Kalhor, Hossein Bodaghi

No More Than 6ft Apart: Robust K-Means via Radius Upper Bounds

Centroid-based clustering methods such as k-means, k-medoids and k-centers are heavily applied as a go-to tool in exploratory data analysis. In many cases, those methods are used to obtain representative centroids of the data manifold for…

Machine Learning · Computer Science 2022-06-16 Ahmed Imtiaz Humayun, Randall Balestriero, Anastasios Kyrillidis, Richard Baraniuk

Balancing clusters to reduce response time variability in large scale image search

Many algorithms for approximate nearest neighbor search in high-dimensional spaces partition the data into clusters. At query time, in order to avoid exhaustive search, an index selects the few (or a single) clusters nearest to the query…

Computer Vision and Pattern Recognition · Computer Science 2010-09-27 Romain Tavenard, Laurent Amsaleg, Hervé Jégou

Active Distance-Based Clustering using K-medoids

The k-medoids algorithm is a partitional, centroid-based clustering algorithm which uses pairwise distances of data points and tries to directly decompose the dataset with $n$ points into a set of $k$ disjoint clusters. However, k-medoids…

Machine Learning · Computer Science 2015-12-15 Mehrdad Ghadiri, Amin Aghaee, Mahdieh Soleymani Baghshah

Fuzzy K-Means Clustering without Cluster Centroids

Fuzzy K-Means clustering is a critical technique in unsupervised data analysis. Unlike traditional hard clustering algorithms such as K-Means, it allows data points to belong to multiple clusters with varying degrees of membership,…

Machine Learning · Computer Science 2024-11-08 Yichen Bao, Han Lu, Quanxue Gao

Normalization based K means Clustering Algorithm

K-means is an effective clustering technique used to separate similar data into groups based on initial centroids of clusters. In this paper, a Normalization based K-means clustering algorithm (N-K means) is proposed. Proposed N-K means…

Machine Learning · Computer Science 2015-03-04 Deepali Virmani, Shweta Taneja, Geetika Malhotra

Parallelization of the K-Means Algorithm with Applications to Big Data Clustering

The K-Means clustering using Lloyd's algorithm is an iterative approach to partition the given dataset into K different clusters. The algorithm assigns each point to the cluster based on the following objective function \[\ \min…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-05-21 Ashish Srivastava, Mohammed Nawfal

Cluster-level Group Representativity Fairness in $k$-means Clustering

There has been much interest recently in developing fair clustering algorithms that seek to do justice to the representation of groups defined along sensitive attributes such as race and gender. We observe that clustering algorithms could…

Machine Learning · Computer Science 2023-01-02 Stanley Simoes, Deepak P, Muiris MacCarthaigh

K-Tensors: Clustering Positive Semi-Definite Matrices

This paper presents a new clustering algorithm for symmetric positive semi-definite (SPSD) matrices, called K-Tensors. The method identifies structured subsets of the SPSD cone characterized by common principal component (CPC)…

Machine Learning · Computer Science 2025-09-03 Hanchao Zhang, Xiaomeng Ju, Baoyi Shi, Lingsong Meng, Thaddeus Tarpey

How to Use K-means for Big Data Clustering?

K-means plays a vital role in data mining and is the simplest and most widely used algorithm under the Euclidean Minimum Sum-of-Squares Clustering (MSSC) model. However, its performance drastically drops when applied to vast amounts of…

Machine Learning · Computer Science 2023-11-27 Rustam Mussabayev, Nenad Mladenovic, Bassem Jarboui, Ravil Mussabayev

A novel k-means clustering approach using two distance measures for Gaussian data

Clustering algorithms have long been the topic of research, representing the more popular side of unsupervised learning. Since clustering analysis is one of the best ways to find some clarity and structure within raw data, this paper…

Machine Learning · Computer Science 2025-11-25 Naitik Gada

Quantum Clustering with k-Means: a Hybrid Approach

Quantum computing is a promising paradigm based on quantum theory for performing fast computations. Quantum algorithms are expected to surpass their classical counterparts in terms of computational complexity for certain tasks, including…

Quantum Physics · Physics 2024-02-23 Alessandro Poggiali, Alessandro Berti, Anna Bernasconi, Gianna M. Del Corso, Riccardo Guidotti

Silhouette-Driven Instance-Weighted $k$-means

Clustering is a fundamental unsupervised learning task with applications across a wide range of domains. Popular algorithms such as $k$-means are efficient and widely used, but can be sensitive to outliers, ambiguous boundary points, and…

Machine Learning · Computer Science 2026-03-12 Aggelos Semoglou, Aristidis Likas, John Pavlopoulos

Faster Balanced Clusterings in High Dimension

The problem of constrained clustering has attracted significant attention in the past decades. In this paper, we study the balanced $k$-center, $k$-median, and $k$-means clustering problems where the size of each cluster is constrained by…

Computational Geometry · Computer Science 2018-09-11 Hu Ding