Related papers: Guessing Efficiently for Constrained Subspace Appr…

Algorithms and Hardness for Subspace Approximation

The subspace approximation problem Subspace($k$,$p$) asks for a $k$-dimensional linear subspace that fits a given set of points optimally, where the error for fitting is a generalization of the least squares fit and uses the $\ell_{p}$ norm…

Data Structures and Algorithms · Computer Science 2011-01-04 Amit Deshpande, Kasturi Varadarajan, Madhur Tulsiani, Nisheeth K. Vishnoi

Almost-Optimal Upper and Lower Bounds for Clustering in Low Dimensional Euclidean Spaces

The $k$-median and $k$-means clustering objectives are classic objectives for modeling clustering in a metric space. Given a set of points in a metric space, the goal of the $k$-median (resp. $k$-means) problem is to find $k$ representative…

Computational Geometry · Computer Science 2026-03-11 Vincent Cohen-Addad, Karthik C. S., David Saulpic, Chris Schwiegelshohn

Faster Projective Clustering Approximation of Big Data

In projective clustering we are given a set of $n$ points in $R^d$ and wish to cluster them to a set $S$ of $k$ linear subspaces in $R^d$ according to some given distance function. An $\epsilon$-coreset for this problem is a weighted (scaled)…

Data Structures and Algorithms · Computer Science 2020-11-30 Adiel Statman, Liat Rozenberg, Dan Feldman

Fast approximate $\ell$-center clustering in high dimensional spaces

We study the design of efficient approximation algorithms for the $\ell$-center clustering and minimum-diameter $\ell$-clustering problems in high dimensional Euclidean and Hamming spaces. Our main tool is randomized dimension reduction.…

Data Structures and Algorithms · Computer Science 2025-12-04 Mirosław Kowaluk, Andrzej Lingas, Mia Persson

New Algorithms and Hardness Results for Connected Clustering

Connected clustering denotes a family of constrained clustering problems in which we are given a distance metric and an undirected connectivity graph $G$ that can be completely unrelated to the metric. The aim is to partition the $n$…

Data Structures and Algorithms · Computer Science 2025-11-25 Jan Eube, Heiko Röglin

Clustering What Matters in Constrained Settings

Constrained clustering problems generalize classical clustering formulations, e.g., $k$-median, $k$-means, by imposing additional constraints on the feasibility of clustering. There has been significant recent progress in obtaining…

Data Structures and Algorithms · Computer Science 2025-04-22 Ragesh Jaiswal, Amit Kumar

Near-Optimal Quantum Coreset Construction Algorithms for Clustering

$k$-Clustering in $\mathbb{R}^d$ (e.g., $k$-median and $k$-means) is a fundamental machine learning problem. While near-linear time approximation algorithms were known in the classical setting for a dataset with cardinality $n$, it remains…

Quantum Physics · Physics 2023-06-06 Yecheng Xue, Xiaoyu Chen, Tongyang Li, Shaofeng H.-C. Jiang

Greedy Subspace Clustering

We consider the problem of subspace clustering: given points that lie on or near the union of many low-dimensional linear subspaces, recover the subspaces. To this end, one first identifies sets of points close to the same subspace and uses…

Machine Learning · Statistics 2014-11-03 Dohyung Park, Constantine Caramanis, Sujay Sanghavi

Leveraging Union of Subspace Structure to Improve Constrained Clustering

Many clustering problems in computer vision and other contexts are also classification problems, where each cluster shares a meaningful label. Subspace clustering algorithms in particular are often applied to problems that fit this…

Machine Learning · Computer Science 2017-09-15 John Lipor, Laura Balzano

Inapproximability of Maximum Diameter Clustering for Few Clusters

In the Max-$k$-diameter problem, we are given a set of points in a metric space, and the goal is to partition the input points into $k$ parts such that the maximum pairwise distance between points in the same part of the partition is minimized.…

Computational Geometry · Computer Science 2024-04-08 Henry Fleischmann, Kyrylo Karlov, Karthik C. S., Ashwin Padaki, Stepan Zharkov

Faster Balanced Clusterings in High Dimension

The problem of constrained clustering has attracted significant attention in the past decades. In this paper, we study the balanced $k$-center, $k$-median, and $k$-means clustering problems where the size of each cluster is constrained by…

Computational Geometry · Computer Science 2018-09-11 Hu Ding

Faster Approximation Algorithms for k-Center via Data Reduction

We study efficient algorithms for the Euclidean $k$-Center problem, focusing on the regime of large $k$. We take the approach of data reduction by considering $\alpha$-coreset, which is a small subset $S$ of the dataset $P$ such that any…

Data Structures and Algorithms · Computer Science 2025-02-11 Arnold Filtser, Shaofeng H.-C. Jiang, Yi Li, Anurag Murty Naredla, Ioannis Psarros, Qiaoyuan Yang, Qin Zhang

Approximate Counting in Local Lemma Regimes

We establish efficient approximate counting algorithms for several natural problems in local lemma regimes. In particular, we consider the probability of intersection of events and the dimension of intersection of subspaces. Our approach is…

Data Structures and Algorithms · Computer Science 2025-12-12 Ryan L. Mann, Gabriel Waite

A Unified Framework for Approximating and Clustering Data

Given a set $F$ of $n$ positive functions over a ground set $X$, we consider the problem of computing $x^*$ that minimizes the expression $\sum_{f\in F}f(x)$, over $x\in X$. A typical application is shape fitting, where we wish to…

Machine Learning · Computer Science 2016-05-31 Dan Feldman, Michael Langberg

Parameterized Approximation Schemes for Clustering with General Norm Objectives

This paper considers the well-studied algorithmic regime of designing a $(1+\epsilon)$-approximation algorithm for a $k$-clustering problem that runs in time $f(k,\epsilon)\,\mathrm{poly}(n)$ (sometimes called an efficient parameterized approximation…

Data Structures and Algorithms · Computer Science 2023-04-07 Fateme Abbasi, Sandip Banerjee, Jarosław Byrka, Parinya Chalermsook, Ameet Gadekar, Kamyar Khodamoradi, Dániel Marx, Roohani Sharma, Joachim Spoerhase

Subspace Clustering using Ensembles of $K$-Subspaces

Subspace clustering is the unsupervised grouping of points lying near a union of low-dimensional linear subspaces. Algorithms based directly on geometric properties of such data tend to either provide poor empirical performance, lack…

Computer Vision and Pattern Recognition · Computer Science 2021-01-08 John Lipor, David Hong, Yan Shuo Tan, Laura Balzano

Approximation Algorithms for Bregman Co-clustering and Tensor Clustering

In the past few years powerful generalizations to the Euclidean k-means problem have been made, such as Bregman clustering [7], co-clustering (i.e., simultaneous clustering of rows and columns of an input matrix) [9,18], and tensor…

Data Structures and Algorithms · Computer Science 2009-11-09 Stefanie Jegelka, Suvrit Sra, Arindam Banerjee

Accurate MapReduce Algorithms for $k$-median and $k$-means in General Metric Spaces

Center-based clustering is a fundamental primitive for data analysis and becomes very challenging for large datasets. In this paper, we focus on the popular $k$-median and $k$-means variants which, given a set $P$ of points from a metric…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-10-01 Alessio Mazzetto, Andrea Pietracaprina, Geppino Pucci

On Generalization Bounds for Projective Clustering

Given a set of points, clustering consists of finding a partition of a point set into $k$ clusters such that the center to which a point is assigned is as close as possible. Most commonly, centers are points themselves, which leads to the…

Machine Learning · Computer Science 2023-10-16 Maria Sofia Bucarelli, Matilde Fjeldsø Larsen, Chris Schwiegelshohn, Mads Bech Toftrup

Subspace approximation with outliers

The subspace approximation problem with outliers, for given $n$ points in $d$ dimensions $x_{1},\ldots, x_{n} \in R^{d}$, an integer $1 \leq k \leq d$, and an outlier parameter $0 \leq \alpha \leq 1$, is to find a $k$-dimensional linear…

Computational Geometry · Computer Science 2020-07-01 Amit Deshpande, Rameshwar Pratap