Related papers: Approximate kernel clustering

Sharp kernel clustering algorithms and their associated Grothendieck inequalities

In the kernel clustering problem we are given a (large) $n\times n$ symmetric positive semidefinite matrix $A=(a_{ij})$ with $\sum_{i=1}^n\sum_{j=1}^n a_{ij}=0$ and a (small) $k\times k$ symmetric positive semidefinite matrix $B=(b_{ij})$.…

Data Structures and Algorithms · Computer Science 2009-06-29 Subhash Khot, Assaf Naor

New Algorithms and Hardness Results for Connected Clustering

Connected clustering denotes a family of constrained clustering problems in which we are given a distance metric and an undirected connectivity graph $G$ that can be completely unrelated to the metric. The aim is to partition the $n$…

Data Structures and Algorithms · Computer Science 2025-11-25 Jan Eube, Heiko Röglin

A Subquadratic Time Approximation Algorithm for Individually Fair k-Center

We study the $k$-center problem in the context of individual fairness. Let $P$ be a set of $n$ points in a metric space and $r_x$ be the distance between $x \in P$ and its $\lceil n/k \rceil$-th nearest neighbor. The problem asks to…

Data Structures and Algorithms · Computer Science 2025-03-26 Matthijs Ebbens, Nicole Funk, Jan Höckendorff, Christian Sohler, Vera Weil

Diversity-aware clustering: Computational Complexity and Approximation Algorithms

In this work, we study diversity-aware clustering problems where the data points are associated with multiple attributes resulting in intersecting groups. A clustering solution needs to ensure that the number of chosen cluster centers from…

Data Structures and Algorithms · Computer Science 2025-05-21 Suhas Thejaswi, Ameet Gadekar, Bruno Ordozgoiti, Aristides Gionis

Scalable Kernel Clustering: Approximate Kernel k-means

Kernel-based clustering algorithms have the ability to capture the non-linear structure in real-world data. Among various kernel-based clustering algorithms, kernel k-means has gained popularity due to its simple iterative nature and ease…

Computer Vision and Pattern Recognition · Computer Science 2014-02-18 Radha Chitta, Rong Jin, Timothy C. Havens, Anil K. Jain

Approximation Algorithms for Bregman Co-clustering and Tensor Clustering

In the past few years powerful generalizations to the Euclidean k-means problem have been made, such as Bregman clustering [7], co-clustering (i.e., simultaneous clustering of rows and columns of an input matrix) [9,18], and tensor…

Data Structures and Algorithms · Computer Science 2009-11-09 Stefanie Jegelka, Suvrit Sra, Arindam Banerjee

Clustering in Varying Metrics

We introduce the aggregated clustering problem, where one is given $T$ instances of a center-based clustering task over the same $n$ points, but under different metrics. The goal is to open $k$ centers to minimize an aggregate of the…

Data Structures and Algorithms · Computer Science 2025-10-10 Deeparnab Chakrabarty, Jonathan Conroy, Ankita Sarkar

Guessing Efficiently for Constrained Subspace Approximation

In this paper we study the constrained subspace approximation problem. Given a set of $n$ points $\{a_1,\ldots,a_n\}$ in $\mathbb{R}^d$, the goal of the subspace approximation problem is to find a $k$-dimensional subspace that best…

Data Structures and Algorithms · Computer Science 2025-04-30 Aditya Bhaskara, Sepideh Mahabadi, Madhusudhan Reddy Pittu, Ali Vakilian, David P. Woodruff

Solution of the propeller conjecture in $\mathbb{R}^3$

It is shown that every measurable partition $\{A_1,\ldots,A_k\}$ of $\mathbb{R}^3$ satisfies $$\sum_{i=1}^k\left\|\int_{A_i} xe^{-\frac12\|x\|_2^2}\,dx\right\|_2^2\le 9\pi^2.\qquad(*)$$ Let $\{P_1,P_2,P_3\}$ be the partition of $\mathbb{R}^2$ into $120^\circ$…

Computational Complexity · Computer Science 2014-04-08 Steven Heilman, Aukosh Jagannath, Assaf Naor

Near-Optimal Clustering in the $k$-machine model

The clustering problem, in its many variants, has numerous applications in operations research and computer science (e.g., in bioinformatics, image processing, and social network analysis). As sizes of data sets have grown…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-10-24 Sayan Bandyapadhyay, Tanmay Inamdar, Shreyas Pai, Sriram V. Pemmaraju

Balanced $k$-Center Clustering When $k$ Is A Constant

The problem of constrained $k$-center clustering has attracted significant attention in the past decades. In this paper, we study balanced $k$-center clustering, where the size of each cluster is constrained by given lower and upper bounds.…

Computational Geometry · Computer Science 2017-04-11 Hu Ding

Constant factor approximations for Lower and Upper bounded Clusterings

Clustering is one of the most fundamental problems in machine learning. Researchers in the field often require a lower bound on the size of the clusters to maintain anonymity and an upper bound for ease of analysis. Specifying an optimal…

Data Structures and Algorithms · Computer Science 2022-03-29 Neelima Gupta, Sapna Grover, Rajni Dabas

Universal NP-Hardness of Clustering under General Utilities

Clustering is a central primitive in unsupervised learning, yet practice is dominated by heuristics whose outputs can be unstable and highly sensitive to representations, hyperparameters, and initialisation. Existing theoretical results are…

Computational Complexity · Computer Science 2026-03-03 Angshul Majumdar

Faster Kernel Matrix Algebra via Density Estimation

We study fast algorithms for computing fundamental properties of a positive semidefinite kernel matrix $K \in \mathbb{R}^{n \times n}$ corresponding to $n$ points $x_1,\ldots,x_n \in \mathbb{R}^d$. In particular, we consider estimating the…

Data Structures and Algorithms · Computer Science 2021-06-21 Arturs Backurs, Piotr Indyk, Cameron Musco, Tal Wagner

A bi-criteria approximation algorithm for $k$ Means

We consider the classical $k$-means clustering problem in the setting of bi-criteria approximation, in which an algorithm is allowed to output $\beta k > k$ clusters, and must produce a clustering with cost at most $\alpha$ times the…

Data Structures and Algorithms · Computer Science 2015-08-04 Konstantin Makarychev, Yury Makarychev, Maxim Sviridenko, Justin Ward

A Polynomial-Time Approximation for Pairwise Fair $k$-Median Clustering

In this work, we study pairwise fair clustering with $\ell \ge 2$ groups, where for every cluster $C$ and every group $i \in [\ell]$, the number of points in $C$ from group $i$ must be at most $t$ times the number of points in $C$ from any…

Data Structures and Algorithms · Computer Science 2025-02-28 Sayan Bandyapadhyay, Eden Chlamtáč, Zachary Friggstad, Mahya Jamshidian, Yury Makarychev, Ali Vakilian

Clustering with fair-center representation: parameterized approximation algorithms and heuristics

We study a variant of classical clustering formulations in the context of algorithmic fairness, known as diversity-aware clustering. In this variant we are given a collection of facility subsets, and a solution must contain at least a…

Data Structures and Algorithms · Computer Science 2022-10-25 Suhas Thejaswi, Ameet Gadekar, Bruno Ordozgoiti, Michal Osadnik

Near-Optimal Quantum Coreset Construction Algorithms for Clustering

$k$-Clustering in $\mathbb{R}^d$ (e.g., $k$-median and $k$-means) is a fundamental machine learning problem. While near-linear time approximation algorithms were known in the classical setting for a dataset with cardinality $n$, it remains…

Quantum Physics · Physics 2023-06-06 Yecheng Xue, Xiaoyu Chen, Tongyang Li, Shaofeng H.-C. Jiang

Feature-based Individual Fairness in k-Clustering

Ensuring fairness in machine learning algorithms is a challenging and essential task. We consider the problem of clustering a set of points while satisfying fairness constraints. While there have been several attempts to capture group…

Machine Learning · Computer Science 2023-02-07 Debajyoti Kar, Mert Kosan, Debmalya Mandal, Sourav Medya, Arlei Silva, Palash Dey, Swagato Sanyal

On Generalization Bounds for Projective Clustering

Given a set of points, clustering consists of finding a partition of a point set into $k$ clusters such that the center to which a point is assigned is as close as possible. Most commonly, centers are points themselves, which leads to the…

Machine Learning · Computer Science 2023-10-16 Maria Sofia Bucarelli, Matilde Fjeldsø Larsen, Chris Schwiegelshohn, Mads Bech Toftrup