Related papers: Improved Approximation Algorithms for Relational C…

200 papers

This paper gives a k-means approximation algorithm that is efficient in the relational algorithms model. This is an algorithm that operates directly on a relational database without performing a join to convert it to a matrix whose rows…

Data Structures and Algorithms · Computer Science 2021-05-24 Benjamin Moseley , Kirk Pruhs , Alireza Samadian , Yuyan Wang

The analysis of continuously larger datasets is a task of major importance in a wide variety of scientific fields. In this sense, cluster analysis algorithms are a key element of exploratory data analysis, due to their easiness in the…

Machine Learning · Statistics 2018-01-10 Marco Capó , Aritz Pérez , Jose A. Lozano

Clustering is a popular form of unsupervised learning for geometric data. Unfortunately, many clustering algorithms lead to cluster assignments that are hard to explain, partially because they depend on all the features of the data in a…

Machine Learning · Computer Science 2020-09-23 Sanjoy Dasgupta , Nave Frost , Michal Moshkovitz , Cyrus Rashtchian

Conventional machine learning algorithms cannot be applied until a data matrix is available to process. When the data matrix needs to be obtained from a relational database via a feature extraction query, the computation cost can be…

Machine Learning · Computer Science 2019-10-14 Ryan Curtin , Ben Moseley , Hung Q. Ngo , XuanLong Nguyen , Dan Olteanu , Maximilian Schleich

Kernel-based clustering algorithms have the ability to capture the non-linear structure in real world data. Among various kernel-based clustering algorithms, kernel k-means has gained popularity due to its simple iterative nature and ease…

Computer Vision and Pattern Recognition · Computer Science 2014-02-18 Radha Chitta , Rong Jin , Timothy C. Havens , Anil K. Jain

$k$-Clustering in $\mathbb{R}^d$ (e.g., $k$-median and $k$-means) is a fundamental machine learning problem. While near-linear time approximation algorithms were known in the classical setting for a dataset with cardinality $n$, it remains…

Quantum Physics · Physics 2023-06-06 Yecheng Xue , Xiaoyu Chen , Tongyang Li , Shaofeng H. -C. Jiang

The problem of constrained clustering has attracted significant attention in the past decades. In this paper, we study the balanced $k$-center, $k$-median, and $k$-means clustering problems where the size of each cluster is constrained by…

Computational Geometry · Computer Science 2018-09-11 Hu Ding

Clustering is a fundamental problem in unsupervised machine learning with many applications in data analysis. Popular clustering algorithms such as Lloyd's algorithm and $k$-means++ can take $\Omega(ndk)$ time when clustering $n$ points in…

Machine Learning · Computer Science 2023-10-26 Moses Charikar , Monika Henzinger , Lunjia Hu , Maxmilian Vötsch , Erik Waingarten

We propose a simple and efficient clustering method for high-dimensional data with a large number of clusters. Our algorithm achieves high performance by evaluating distances of datapoints with a subset of the cluster centres. Our…

Machine Learning · Computer Science 2022-03-30 Georgios Exarchakis , Omar Oubari , Gregor Lenz

$K$-means, a simple and effective clustering algorithm, is one of the most widely used algorithms in the multimedia and computer vision communities. Traditional $k$-means is an iterative algorithm---in each iteration new cluster centers are…

Computer Vision and Pattern Recognition · Computer Science 2013-12-12 Jingdong Wang , Jing Wang , Qifa Ke , Gang Zeng , Shipeng Li

Clustering is a key task in machine learning, with $k$-means being widely used for its simplicity and effectiveness. While 1D clustering is common, existing methods often fail to exploit the structure of 1D data, leading to inefficiencies.…

Data Structures and Algorithms · Computer Science 2024-12-25 Jake Hyun

Due to the progressive growth of the amount of data available in a wide variety of scientific fields, it has become more difficult to manipulate and analyze such information. Even though datasets have grown in size, the K-means algorithm…

Machine Learning · Statistics 2016-05-11 Marco Capó , Aritz Pérez , José Antonio Lozano

The clustering problem, in its many variants, has numerous applications in operations research and computer science (e.g., in applications in bioinformatics, image processing, social network analysis, etc.). As sizes of data sets have grown…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-10-24 Sayan Bandyapadhyay , Tanmay Inamdar , Shreyas Pai , Sriram V. Pemmaraju

Connected clustering denotes a family of constrained clustering problems in which we are given a distance metric and an undirected connectivity graph $G$ that can be completely unrelated to the metric. The aim is to partition the $n$…

Data Structures and Algorithms · Computer Science 2025-11-25 Jan Eube , Heiko Röglin

The k-means clustering algorithm is a popular algorithm that partitions data into k clusters. There are many improvements to accelerate the standard algorithm. Most current research employs upper and lower bounds on point-to-cluster…

Machine Learning · Computer Science 2024-10-22 Andreas Lang , Erich Schubert

Clustering problems have numerous applications and are becoming more challenging as the size of the data increases. In this paper, we consider designing clustering algorithms that can be used in MapReduce, the most popular programming…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-09-09 Alina Ene , Sungjin Im , Benjamin Moseley

Bateni et al. have recently introduced the weak-strong distance oracle model to study clustering problems in settings with limited distance information. Given query access to the strong-oracle and weak-oracle in the weak-strong oracle model,…

Data Structures and Algorithms · Computer Science 2026-02-23 Pinki Pradhan , Anup Bhattacharya , Ragesh Jaiswal

In this paper, we study clustering with respect to the k-modes objective function, a natural formulation of clustering for categorical data. One of the main contributions of this paper is to establish the connection between k-modes and…

Artificial Intelligence · Computer Science 2007-05-23 Zengyou He

Clustering is a long-standing research problem and a fundamental tool in AI and data analysis. The traditional k-center problem, a fundamental theoretical challenge in clustering, has a best possible approximation ratio of 2, and any…

Machine Learning · Computer Science 2026-04-28 Chaoqi Jia , Longkun Guo , Kewen Liao , Zhigang Lu , Chao Chen , Jason Xue

Many algorithms for approximate nearest neighbor search in high-dimensional spaces partition the data into clusters. At query time, in order to avoid exhaustive search, an index selects the few (or a single) clusters nearest to the query…

Computer Vision and Pattern Recognition · Computer Science 2010-09-27 Romain Tavenard , Laurent Amsaleg , Hervé Jégou