Related papers: Nested Mini-Batch K-Means

200 papers

Due to the progressive growth of the amount of data available in a wide variety of scientific fields, it has become more difficult to manipulate and analyze such information. Even though datasets have grown in size, the K-means algorithm…

Machine Learning · Statistics 2016-05-11 Marco Capó, Aritz Pérez, José Antonio Lozano

We present the first mini-batch kernel $k$-means algorithm, offering an order of magnitude improvement in running time compared to the full batch algorithm. A single iteration of our algorithm takes $\widetilde{O}(kb^2)$ time, significantly…

Machine Learning · Computer Science 2024-10-10 Ben Jourdan, Gregory Schwartzman

k-means has recently been recognized as one of the best algorithms for clustering unsupervised data. Since k-means depends mainly on distance calculation between all data points and the centers, the time cost will be high when the size of…

Data Structures and Algorithms · Computer Science 2011-08-08 Raied Salman, Vojislav Kecman, Qi Li, Robert Strack, Erik Test

The K-Means clustering using Lloyd's algorithm is an iterative approach to partition the given dataset into K different clusters. The algorithm assigns each point to the cluster based on the following objective function \[\ \min…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-05-21 Ashish Srivastava, Mohammed Nawfal

We analyze online \cite{BottouBengio} and mini-batch \cite{Sculley} $k$-means variants. Both scale up the widely used $k$-means algorithm via stochastic approximation, and have become popular for large-scale clustering and unsupervised…

Machine Learning · Computer Science 2016-11-17 Cheng Tang, Claire Monteleoni

$K$-means, a simple and effective clustering algorithm, is one of the most widely used algorithms in multimedia and computer vision community. Traditional $k$-means is an iterative algorithm---in each iteration new cluster centers are…

Computer Vision and Pattern Recognition · Computer Science 2013-12-12 Jingdong Wang, Jing Wang, Qifa Ke, Gang Zeng, Shipeng Li

In this paper, the decades-old clustering method k-means is revisited. The original distortion minimization model of k-means is addressed by a pure stochastic minimization procedure. In each step of the iteration, one sample is tentatively…

Machine Learning · Computer Science 2020-05-20 Wan-Lei Zhao, Run-Qing Chen, Hui Ye, Chong-Wah Ngo

K-means is one of the most widely used clustering algorithms in various disciplines, especially for large datasets. However the method is known to be highly sensitive to initial seed selection of cluster centers. K-means++ has been proposed…

Machine Learning · Computer Science 2016-04-19 Fouad Khan

This paper presents a novel accelerated exact k-means algorithm called the Ball k-means algorithm, which uses a ball to describe a cluster, focusing on reducing the point-centroid distance computation. The Ball k-means can accurately find…

Machine Learning · Computer Science 2020-05-05 Shuyin Xia, Daowan Peng, Deyu Meng, Changqing Zhang, Guoyin Wang, Zizhong Chen, Wei Wei

This thesis aims to invent new approaches for making inferences with the k-means algorithm. k-means is an iterative clustering algorithm that randomly assigns k centroids, then assigns data points to the nearest centroid, and updates…

Machine Learning · Computer Science 2024-10-24 Alfred K. Adzika, Prudence Djagba

Motivated by the increasing availability of low- and mixed-precision arithmetic on modern hardware, we develop mixed-precision variants of Lloyd's algorithm for k-means clustering. The main ingredient is a family of mixed-precision kernels…

Numerical Analysis · Mathematics 2026-05-26 Erin Carson, Xinye Chen, Xiaobo Liu

The problem of constrained clustering has attracted significant attention in the past decades. In this paper, we study the balanced $k$-center, $k$-median, and $k$-means clustering problems where the size of each cluster is constrained by…

Computational Geometry · Computer Science 2018-09-11 Hu Ding

We propose a novel method to accelerate Lloyd's algorithm for K-Means clustering. Unlike previous acceleration approaches that reduce computational cost per iterations or improve initialization, our approach is focused on reducing the…

Machine Learning · Computer Science 2018-05-29 Juyong Zhang, Yuxin Yao, Yue Peng, Hao Yu, Bailin Deng

Mini-batch algorithms have been proposed as a way to speed-up stochastic convex optimization problems. We study how such algorithms can be improved using accelerated gradient methods. We provide a novel analysis, which shows how standard…

Machine Learning · Computer Science 2011-06-24 Andrew Cotter, Ohad Shamir, Nathan Srebro, Karthik Sridharan

In the era of big data, k-means clustering has been widely adopted as a basic processing tool in various contexts. However, its computational cost could be prohibitively high as the data size and the cluster number are large. It is well…

Machine Learning · Computer Science 2017-05-05 Cheng-Hao Deng, Wan-Lei Zhao

This paper extends k-means algorithms from the Euclidean domain to the domain of graphs. To recompute the centroids, we apply subgradient methods for solving the optimization-based formulation of the sample mean of graphs. To accelerate the…

Artificial Intelligence · Computer Science 2009-12-24 Brijnesh J. Jain, Klaus Obermayer

Often, machine learning applications have to cope with dynamic environments where data are collected in the form of continuous data streams with potentially infinite length and transient behavior. Compared to traditional (batch) data…

Machine Learning · Computer Science 2021-12-21 Guilherme Cassales, Heitor Gomes, Albert Bifet, Bernhard Pfahringer, Hermes Senger

The $k$-Means clustering problem on $n$ points is NP-Hard for any dimension $d\ge 2$; however, for the 1D case there exist exact polynomial-time algorithms. Previous literature reported an $O(kn^2)$ time dynamic programming algorithm that…

Data Structures and Algorithms · Computer Science 2018-04-26 Allan Grønlund, Kasper Green Larsen, Alexander Mathiasen, Jesper Sindahl Nielsen, Stefan Schneider, Mingzhou Song

We answer the question: "Does local progress (on batches) imply global progress (on the entire dataset) for mini-batch $k$-means?". Specifically, we consider mini-batch $k$-means which terminates only when the improvement in the quality of…

Machine Learning · Computer Science 2023-04-04 Gregory Schwartzman

Clustering is an effective technique in data mining to generate groups that are the matter of interest. Among various clustering approaches, the family of k-means algorithms and min-cut algorithms gain most popularity due to their…

Machine Learning · Computer Science 2014-11-25 Xiaojun Chang, Feiping Nie, Zhigang Ma, Yi Yang