English
Related papers

Related papers: Quantized Compressive K-Means

200 papers

The Lloyd-Max algorithm is a classical approach to perform K-means clustering. Unfortunately, its cost becomes prohibitive as the training dataset grows large. We propose a compressive version of K-means (CKM), that estimates cluster…

Machine Learning · Computer Science 2017-02-13 Nicolas Keriven , Nicolas Tremblay , Yann Traonmilin , Rémi Gribonval

We describe a general framework -- compressive statistical learning -- for resource-efficient large-scale learning: the training collection is compressed in one pass into a low-dimensional sketch (a vector of random empirical generalized…

Machine Learning · Statistics 2021-06-23 Rémi Gribonval , Gilles Blanchard , Nicolas Keriven , Yann Traonmilin

Compressive learning is an approach to efficient large scale learning based on sketching an entire dataset to a single mean embedding (the sketch), i.e. a vector of generalized moments. The learning task is then approximately solved as an…

Machine Learning · Statistics 2022-02-11 Antoine Chatalic , Luigi Carratino , Ernesto De Vito , Lorenzo Rosasco

Clustering data is a popular feature in the field of unsupervised machine learning. Most algorithms aim to find the best method to extract consistent clusters of data, but very few of them intend to cluster data that share the same…

Machine Learning · Computer Science 2022-06-22 Jean-Sébastien Dessureault , Daniel Massicotte

Though mostly used as a clustering algorithm, k-means are originally designed as a quantization algorithm. Namely, it aims at providing a compression of a probability distribution with k points. Building upon [21, 33], we try to investigate…

Statistics Theory · Mathematics 2018-01-31 Clément Levrard

Kernel $k$-means clustering is a powerful tool for unsupervised learning of non-linearly separable data. Since the earliest attempts, researchers have noted that such algorithms often become trapped by local minima arising from…

Machine Learning · Statistics 2020-11-13 Debolina Paul , Saptarshi Chakraborty , Swagatam Das , Jason Xu

We provide statistical learning guarantees for two unsupervised learning tasks in the context of compressive statistical learning, a general framework for resource-efficient large-scale learning that we introduced in a companion paper.The…

Machine Learning · Computer Science 2021-08-18 Rémi Gribonval , Gilles Blanchard , Nicolas Keriven , Yann Traonmilin

This article considers "compressive learning," an approach to large-scale machine learning where datasets are massively compressed before learning (e.g., clustering, classification, or regression) is performed. In particular, a "sketch" is…

The compressive learning framework reduces the computational cost of training on large-scale datasets. In a sketching phase, the data is first compressed to a lightweight sketch vector, obtained by mapping the data samples through a…

Machine Learning · Statistics 2022-04-20 Vincent Schellekens , Laurent Jacques

K-Means algorithm is a popular clustering method. However, it has two limitations: 1) it gets stuck easily in spurious local minima, and 2) the number of clusters k has to be given a priori. To solve these two issues, a multi-prototypes…

Machine Learning · Computer Science 2023-02-15 Dong Li , Shuisheng Zhou , Tieyong Zeng , Raymond H. Chan

The $k$-means algorithm (Lloyd's algorithm) is a widely used method for clustering unlabeled data. A key bottleneck of the $k$-means algorithm is that each iteration requires time linear in the number of data points, which can be expensive…

Kernel-based clustering algorithms have the ability to capture the non-linear structure in real world data. Among various kernel-based clustering algorithms, kernel k-means has gained popularity due to its simple iterative nature and ease…

Computer Vision and Pattern Recognition · Computer Science 2014-02-18 Radha Chitta , Rong Jin , Timothy C. Havens , Anil K. Jain

We consider a network of binary-valued sensors with a fusion center. The fusion center has to perform K-means clustering on the binary data transmitted by the sensors. In order to reduce the amount of data transmitted within the network,…

Information Theory · Computer Science 2018-01-18 Elsa Dupraz

The learning of mixture models can be viewed as a clustering problem. Indeed, given data samples independently generated from a mixture of distributions, we often would like to find the {\it correct target clustering} of the samples…

Machine Learning · Statistics 2022-08-26 Zhaoqiang Liu , Vincent Y. F. Tan

K-means defines one of the most employed centroid-based clustering algorithms with performances tied to the data's embedding. Intricate data embeddings have been designed to push $K$-means performances at the cost of reduced theoretical…

Machine Learning · Computer Science 2022-02-17 Romain Cosentino , Randall Balestriero , Yanis Bahroun , Anirvan Sengupta , Richard Baraniuk , Behnaam Aazhang

K-means is a classical clustering algorithm with wide applications. However, soft K-means, or fuzzy c-means at m=1, remains unsolved since 1981. To address this challenging open problem, we propose a novel clustering model, i.e.…

Machine Learning · Computer Science 2020-11-23 Yujian Li , Bowen Liu , Zhaoying Liu , Ting Zhang

The $k$-means algorithm is arguably the most popular nonparametric clustering method but cannot generally be applied to datasets with incomplete records. The usual practice then is to either impute missing values under an assumed…

Machine Learning · Statistics 2018-09-11 Andrew Lithio , Ranjan Maitra

Cognitive diagnosis models (CDMs) are a popular tool for assessing students' mastery of sets of skills. Given a set of $K$ skills tested on an assessment, students are classified into one of $2^K$ latent skill set profiles that represent…

Applications · Statistics 2021-04-07 Alan Mishler , Rebecca Nugent

Machine learning algorithms perform well on identifying patterns in many different datasets due to their versatility. However, as one increases the size of the dataset, the computation time for training and using these statistical models…

Quantum Physics · Physics 2024-09-19 Abhijat Sarma , Rupak Chatterjee , Kaitlin Gili , Ting Yu

Kernel-based K-means clustering has gained popularity due to its simplicity and the power of its implicit non-linear representation of the data. A dominant concern is the memory requirement since memory scales as the square of the number of…

Machine Learning · Statistics 2016-12-05 Farhad Pourkamali-Anaraki , Stephen Becker
‹ Prev 1 2 3 10 Next ›