Related papers: Quantile-based clustering

An efficient K-means clustering algorithm for massive data

The analysis of continuously larger datasets is a task of major importance in a wide variety of scientific fields. In this sense, cluster analysis algorithms are a key element of exploratory data analysis, due to their easiness in the…

Machine Learning · Statistics 2018-01-10 Marco Capó, Aritz Pérez, Jose A. Lozano

Merging $K$-means with hierarchical clustering for identifying general-shaped groups

Clustering partitions a dataset such that observations placed together in a group are similar but different from those in other groups. Hierarchical and $K$-means clustering are two approaches but have different strengths and weaknesses.…

Machine Learning · Statistics 2017-12-27 Anna D. Peterson, Arka P. Ghosh, Ranjan Maitra

Quantum Clustering with k-Means: a Hybrid Approach

Quantum computing is a promising paradigm based on quantum theory for performing fast computations. Quantum algorithms are expected to surpass their classical counterparts in terms of computational complexity for certain tasks, including…

Quantum Physics · Physics 2024-02-23 Alessandro Poggiali, Alessandro Berti, Anna Bernasconi, Gianna M. Del Corso, Riccardo Guidotti

A sampling-based approach for efficient clustering in large datasets

We propose a simple and efficient clustering method for high-dimensional data with a large number of clusters. Our algorithm achieves high-performance by evaluating distances of datapoints with a subset of the cluster centres. Our…

Machine Learning · Computer Science 2022-03-30 Georgios Exarchakis, Omar Oubari, Gregor Lenz

Do you know what q-means?

Clustering is one of the most important tools for analysis of large datasets, and perhaps the most popular clustering algorithm is Lloyd's algorithm for $k$-means. This algorithm takes $n$ vectors $V=[v_1,\dots,v_n]\in\mathbb{R}^{d\times…

Quantum Physics · Physics 2025-07-18 Arjan Cornelissen, Joao F. Doriguello, Alessandro Luongo, Ewin Tang

Quartile Clustering: A quartile based technique for Generating Meaningful Clusters

Clustering is one of the main tasks in exploratory data analysis and descriptive statistics where the main objective is partitioning observations in groups. Clustering has a broad range of application in varied domains like climate,…

Databases · Computer Science 2012-03-20 Saptarsi Goswami, Amlan Chakrabarti

Revisiting k-means: New Algorithms via Bayesian Nonparametrics

Bayesian models offer great flexibility for clustering applications---Bayesian nonparametrics can be used for modeling infinite mixtures, and hierarchical Bayesian models can be utilized for sharing clusters across multiple data sets. For…

Machine Learning · Computer Science 2012-06-15 Brian Kulis, Michael I. Jordan

Scalable Kernel Clustering: Approximate Kernel k-means

Kernel-based clustering algorithms have the ability to capture the non-linear structure in real world data. Among various kernel-based clustering algorithms, kernel k-means has gained popularity due to its simple iterative nature and ease…

Computer Vision and Pattern Recognition · Computer Science 2014-02-18 Radha Chitta, Rong Jin, Timothy C. Havens, Anil K. Jain

Distributed Clustering Algorithm for Spatial Data Mining

Distributed data mining techniques and mainly distributed clustering are widely used in the last decade because they deal with very large and heterogeneous datasets which cannot be gathered centrally. Current distributed clustering…

Databases · Computer Science 2018-02-02 Malika Bendechache, M-Tahar Kechadi

Quantum Unsupervised and Supervised Learning on Superconducting Processors

Machine learning algorithms perform well on identifying patterns in many different datasets due to their versatility. However, as one increases the size of the dataset, the computation time for training and using these statistical models…

Quantum Physics · Physics 2024-09-19 Abhijat Sarma, Rupak Chatterjee, Kaitlin Gili, Ting Yu

Provably faster randomized and quantum algorithms for $k$-means clustering via uniform sampling

The $k$-means algorithm (Lloyd's algorithm) is a widely used method for clustering unlabeled data. A key bottleneck of the $k$-means algorithm is that each iteration requires time linear in the number of data points, which can be expensive…

Quantum Physics · Physics 2025-10-14 Tyler Chen, Archan Ray, Akshay Seshadri, Dylan Herman, Bao Bach, Pranav Deshpande, Abhishek Som, Niraj Kumar, Marco Pistoia

Semi-supervised clustering methods

Cluster analysis methods seek to partition a data set into homogeneous subgroups. It is useful in a wide variety of applications, including document processing and modern genetics. Conventional clustering methods are unsupervised, meaning…

Methodology · Statistics 2014-07-11 Eric Bair

K-expectiles clustering

$K$-means clustering is one of the most widely-used partitioning algorithms in cluster analysis due to its simplicity and computational efficiency. However, $K$-means does not provide an appropriate clustering result when applied to data…

Methodology · Statistics 2021-03-18 Bingling Wang, Yinxing Li, Wolfgang Karl Härdle

The Laplacian K-modes algorithm for clustering

In addition to finding meaningful clusters, centroid-based clustering algorithms such as K-means or mean-shift should ideally find centroids that are valid patterns in the input space, representative of data in their cluster. This is…

Machine Learning · Computer Science 2014-06-17 Weiran Wang, Miguel Á. Carreira-Perpiñán

Strong Consistency of Factorial K-means Clustering

Factorial k-means (FKM) clustering is a method for clustering objects in a low-dimensional subspace. The advantage of this method is that the partition of objects and the low-dimensional subspace reflecting the cluster structure are…

Statistics Theory · Mathematics 2014-02-14 Yoshikazu Terada

Comparison three methods of clustering: k-means, spectral clustering and hierarchical clustering

A comparison of three kinds of clustering, finding and calculating their cost and loss functions. The error rate of the clustering methods and how to calculate the error percentage are always among the important factors for evaluating the…

Machine Learning · Computer Science 2014-11-14 Kamran Kowsari

How to Use K-means for Big Data Clustering?

K-means plays a vital role in data mining and is the simplest and most widely used algorithm under the Euclidean Minimum Sum-of-Squares Clustering (MSSC) model. However, its performance drastically drops when applied to vast amounts of…

Machine Learning · Computer Science 2023-11-27 Rustam Mussabayev, Nenad Mladenovic, Bassem Jarboui, Ravil Mussabayev

Probabilistic Partitive Partitioning (PPP)

Clustering is an NP-hard problem. Thus, no optimal algorithm exists, and heuristics are applied to cluster the data. Heuristics can be very resource-intensive if not applied properly. For substantially large data sets computational efficiencies…

Databases · Computer Science 2020-03-11 Mujahid Sultan

Simple and Scalable Sparse k-means Clustering via Feature Ranking

Clustering, a fundamental activity in unsupervised learning, is notoriously difficult when the feature space is high-dimensional. Fortunately, in many realistic scenarios, only a handful of features are relevant in distinguishing clusters.…

Machine Learning · Statistics 2020-10-23 Zhiyue Zhang, Kenneth Lange, Jason Xu

Practical Quantum K-Means Clustering: Performance Analysis and Applications in Energy Grid Classification

In this work, we aim to solve a practical use-case of unsupervised clustering which has applications in predictive maintenance in the energy operations sector using quantum computers. Using only cloud access to quantum computers, we…

Quantum Physics · Physics 2022-09-13 Stephen DiAdamo, Corey O'Meara, Giorgio Cortiana, Juan Bernabé-Moreno