Related papers: Quantized Compressive K-Means

Compressive K-means

The Lloyd-Max algorithm is a classical approach to perform K-means clustering. Unfortunately, its cost becomes prohibitive as the training dataset grows large. We propose a compressive version of K-means (CKM), that estimates cluster…

Machine Learning · Computer Science 2017-02-13 Nicolas Keriven, Nicolas Tremblay, Yann Traonmilin, Rémi Gribonval

Compressive Statistical Learning with Random Feature Moments

We describe a general framework -- compressive statistical learning -- for resource-efficient large-scale learning: the training collection is compressed in one pass into a low-dimensional sketch (a vector of random empirical generalized…

Machine Learning · Statistics 2021-06-23 Rémi Gribonval, Gilles Blanchard, Nicolas Keriven, Yann Traonmilin

Mean Nyström Embeddings for Adaptive Compressive Learning

Compressive learning is an approach to efficient large scale learning based on sketching an entire dataset to a single mean embedding (the sketch), i.e. a vector of generalized moments. The learning task is then approximately solved as an…

Machine Learning · Statistics 2022-02-11 Antoine Chatalic, Luigi Carratino, Ernesto De Vito, Lorenzo Rosasco

ck-means, a novel unsupervised learning method that combines fuzzy and crispy clustering methods to extract intersecting data

Clustering data is a popular feature in the field of unsupervised machine learning. Most algorithms aim to find the best method to extract consistent clusters of data, but very few of them intend to cluster data that share the same…

Machine Learning · Computer Science 2022-06-22 Jean-Sébastien Dessureault, Daniel Massicotte

Quantization/clustering: when and why does k-means work?

Though mostly used as a clustering algorithm, k-means are originally designed as a quantization algorithm. Namely, it aims at providing a compression of a probability distribution with k points. Building upon [21, 33], we try to investigate…

Statistics Theory · Mathematics 2018-01-31 Clément Levrard

Kernel k-Means, By All Means: Algorithms and Strong Consistency

Kernel $k$-means clustering is a powerful tool for unsupervised learning of non-linearly separable data. Since the earliest attempts, researchers have noted that such algorithms often become trapped by local minima arising from…

Machine Learning · Statistics 2020-11-13 Debolina Paul, Saptarshi Chakraborty, Swagatam Das, Jason Xu

Statistical Learning Guarantees for Compressive Clustering and Compressive Mixture Modeling

We provide statistical learning guarantees for two unsupervised learning tasks in the context of compressive statistical learning, a general framework for resource-efficient large-scale learning that we introduced in a companion paper. The…

Machine Learning · Computer Science 2021-08-18 Rémi Gribonval, Gilles Blanchard, Nicolas Keriven, Yann Traonmilin

Sketching Datasets for Large-Scale Learning (long version)

This article considers "compressive learning," an approach to large-scale machine learning where datasets are massively compressed before learning (e.g., clustering, classification, or regression) is performed. In particular, a "sketch" is…

Machine Learning · Statistics 2021-06-28 Rémi Gribonval, Antoine Chatalic, Nicolas Keriven, Vincent Schellekens, Laurent Jacques, Philip Schniter

Asymmetric compressive learning guarantees with applications to quantized sketches

The compressive learning framework reduces the computational cost of training on large-scale datasets. In a sketching phase, the data is first compressed to a lightweight sketch vector, obtained by mapping the data samples through a…

Machine Learning · Statistics 2022-04-20 Vincent Schellekens, Laurent Jacques

Multi-Prototypes Convex Merging Based K-Means Clustering Algorithm

K-Means algorithm is a popular clustering method. However, it has two limitations: 1) it gets stuck easily in spurious local minima, and 2) the number of clusters k has to be given a priori. To solve these two issues, a multi-prototypes…

Machine Learning · Computer Science 2023-02-15 Dong Li, Shuisheng Zhou, Tieyong Zeng, Raymond H. Chan

Provably faster randomized and quantum algorithms for $k$-means clustering via uniform sampling

The $k$-means algorithm (Lloyd's algorithm) is a widely used method for clustering unlabeled data. A key bottleneck of the $k$-means algorithm is that each iteration requires time linear in the number of data points, which can be expensive…

Quantum Physics · Physics 2025-10-14 Tyler Chen, Archan Ray, Akshay Seshadri, Dylan Herman, Bao Bach, Pranav Deshpande, Abhishek Som, Niraj Kumar, Marco Pistoia

Scalable Kernel Clustering: Approximate Kernel k-means

Kernel-based clustering algorithms have the ability to capture the non-linear structure in real world data. Among various kernel-based clustering algorithms, kernel k-means has gained popularity due to its simple iterative nature and ease…

Computer Vision and Pattern Recognition · Computer Science 2014-02-18 Radha Chitta, Rong Jin, Timothy C. Havens, Anil K. Jain

K-means Algorithm over Compressed Binary Data

We consider a network of binary-valued sensors with a fusion center. The fusion center has to perform K-means clustering on the binary data transmitted by the sensors. In order to reduce the amount of data transmitted within the network,…

Information Theory · Computer Science 2018-01-18 Elsa Dupraz

The Informativeness of K-Means for Learning Mixture Models

The learning of mixture models can be viewed as a clustering problem. Indeed, given data samples independently generated from a mixture of distributions, we often would like to find the correct target clustering of the samples…

Machine Learning · Statistics 2022-08-26 Zhaoqiang Liu, Vincent Y. F. Tan

Spatial Transformer K-Means

K-means defines one of the most employed centroid-based clustering algorithms with performances tied to the data's embedding. Intricate data embeddings have been designed to push $K$-means performances at the cost of reduced theoretical…

Machine Learning · Computer Science 2022-02-17 Romain Cosentino, Randall Balestriero, Yanis Bahroun, Anirvan Sengupta, Richard Baraniuk, Behnaam Aazhang

Probabilistic K-means Clustering via Nonlinear Programming

K-means is a classical clustering algorithm with wide applications. However, soft K-means, or fuzzy c-means at m=1, remains unsolved since 1981. To address this challenging open problem, we propose a novel clustering model, i.e.…

Machine Learning · Computer Science 2020-11-23 Yujian Li, Bowen Liu, Zhaoying Liu, Ting Zhang

An efficient $k$-means-type algorithm for clustering datasets with incomplete records

The $k$-means algorithm is arguably the most popular nonparametric clustering method but cannot generally be applied to datasets with incomplete records. The usual practice then is to either impute missing values under an assumed…

Machine Learning · Statistics 2018-09-11 Andrew Lithio, Ranjan Maitra

Clustering Students and Inferring Skill Set Profiles with Skill Hierarchies

Cognitive diagnosis models (CDMs) are a popular tool for assessing students' mastery of sets of skills. Given a set of $K$ skills tested on an assessment, students are classified into one of $2^K$ latent skill set profiles that represent…

Applications · Statistics 2021-04-07 Alan Mishler, Rebecca Nugent

Quantum Unsupervised and Supervised Learning on Superconducting Processors

Machine learning algorithms perform well on identifying patterns in many different datasets due to their versatility. However, as one increases the size of the dataset, the computation time for training and using these statistical models…

Quantum Physics · Physics 2024-09-19 Abhijat Sarma, Rupak Chatterjee, Kaitlin Gili, Ting Yu

A Randomized Approach to Efficient Kernel Clustering

Kernel-based K-means clustering has gained popularity due to its simplicity and the power of its implicit non-linear representation of the data. A dominant concern is the memory requirement since memory scales as the square of the number of…

Machine Learning · Statistics 2016-12-05 Farhad Pourkamali-Anaraki, Stephen Becker