Related papers: Sublinear Time Quantum Sensitivity Sampling

Sensitivity Sampling for $k$-Means: Worst Case and Stability Optimal Coreset Bounds

Coresets are arguably the most popular compression paradigm for center-based clustering objectives such as $k$-means. Given a point set $P$, a coreset $\Omega$ is a small, weighted summary that preserves the cost of all candidate solutions…

Data Structures and Algorithms · Computer Science 2024-05-03 Nikhil Bansal, Vincent Cohen-Addad, Milind Prabhu, David Saulpic, Chris Schwiegelshohn

Near-Optimal Quantum Coreset Construction Algorithms for Clustering

$k$-Clustering in $\mathbb{R}^d$ (e.g., $k$-median and $k$-means) is a fundamental machine learning problem. While near-linear time approximation algorithms were known in the classical setting for a dataset with cardinality $n$, it remains…

Quantum Physics · Physics 2023-06-06 Yecheng Xue, Xiaoyu Chen, Tongyang Li, Shaofeng H.-C. Jiang

New Frameworks for Offline and Streaming Coreset Constructions

A coreset for a set of points is a small subset of weighted points that approximately preserves important properties of the original set. Specifically, if $P$ is a set of points, $Q$ is a set of queries, and $f:P\times Q\to\mathbb{R}$ is a…

Data Structures and Algorithms · Computer Science 2022-09-20 Vladimir Braverman, Dan Feldman, Harry Lang, Adiel Statman, Samson Zhou

Faster Sublinear Algorithms using Conditional Sampling

A conditional sampling oracle for a probability distribution D returns samples from the conditional distribution of D restricted to a specified subset of the domain. A recent line of work (Chakraborty et al. 2013 and Cannone et al. 2014)…

Data Structures and Algorithms · Computer Science 2016-08-18 Themistoklis Gouleakis, Christos Tzamos, Manolis Zampetakis

Quantum Unsupervised and Supervised Learning on Superconducting Processors

Machine learning algorithms perform well on identifying patterns in many different datasets due to their versatility. However, as one increases the size of the dataset, the computation time for training and using these statistical models…

Quantum Physics · Physics 2024-09-19 Abhijat Sarma, Rupak Chatterjee, Kaitlin Gili, Ting Yu

Sublinear quantum algorithms for training linear and kernel-based classifiers

We investigate quantum algorithms for classification, a fundamental problem in machine learning, with provable guarantees. Given $n$ $d$-dimensional data points, the state-of-the-art (and optimal) classical algorithm for training…

Quantum Physics · Physics 2019-05-28 Tongyang Li, Shouvanik Chakrabarti, Xiaodi Wu

Near Input Sparsity Time Kernel Embeddings via Adaptive Sampling

To accelerate kernel methods, we propose a near input sparsity time algorithm for sampling the high-dimensional feature space implicitly defined by a kernel transformation. Our main contribution is an importance sampling method for…

Data Structures and Algorithms · Computer Science 2020-07-15 David P. Woodruff, Amir Zandieh

Provably faster randomized and quantum algorithms for $k$-means clustering via uniform sampling

The $k$-means algorithm (Lloyd's algorithm) is a widely used method for clustering unlabeled data. A key bottleneck of the $k$-means algorithm is that each iteration requires time linear in the number of data points, which can be expensive…

Quantum Physics · Physics 2025-10-14 Tyler Chen, Archan Ray, Akshay Seshadri, Dylan Herman, Bao Bach, Pranav Deshpande, Abhishek Som, Niraj Kumar, Marco Pistoia

Settling Time vs. Accuracy Tradeoffs for Clustering Big Data

We study the theoretical and practical runtime limits of k-means and k-median clustering on large datasets. Since effectively all clustering methods are slower than the time it takes to read the dataset, the fastest approach is to quickly…

Machine Learning · Computer Science 2024-04-03 Andrew Draganov, David Saulpic, Chris Schwiegelshohn

q-means: A quantum algorithm for unsupervised machine learning

Quantum machine learning is one of the most promising applications of a full-scale quantum computer. Over the past few years, many quantum machine learning algorithms have been proposed that can potentially offer considerable speedups over…

Quantum Physics · Physics 2021-06-14 Iordanis Kerenidis, Jonas Landman, Alessandro Luongo, Anupam Prakash

Towards Efficient Quantum Anomaly Detection: One-Class SVMs using Variable Subsampling and Randomized Measurements

Quantum computing, with its potential to enhance various machine learning tasks, allows significant advancements in kernel calculation and model precision. Utilizing the one-class Support Vector Machine alongside a quantum kernel, known for…

Quantum Physics · Physics 2023-12-15 Michael Kölle, Afrae Ahouzi, Pascal Debus, Robert Müller, Danielle Schuman, Claudia Linnhoff-Popien

Efficient Parameter Optimisation for Quantum Kernel Alignment: A Sub-sampling Approach in Variational Training

Quantum machine learning with quantum kernels for classification problems is a growing area of research. Recently, quantum kernel alignment techniques that parameterise the kernel have been developed, allowing the kernel to be trained and…

Quantum Physics · Physics 2024-10-23 M. Emre Sahin, Benjamin C. B. Symons, Pushpak Pati, Fayyaz Minhas, Declan Millar, Maria Gabrani, Stefano Mensa, Jan Lukas Robertus

The Power of Uniform Sampling for Coresets

Motivated by practical generalizations of the classic $k$-median and $k$-means objectives, such as clustering with size constraints, fair clustering, and Wasserstein barycenter, we introduce a meta-theorem for designing coresets for…

Data Structures and Algorithms · Computer Science 2022-09-20 Vladimir Braverman, Vincent Cohen-Addad, Shaofeng H.-C. Jiang, Robert Krauthgamer, Chris Schwiegelshohn, Mads Bech Toftrup, Xuan Wu

A Quantum Approximation Scheme for k-Means

We give a quantum approximation scheme (i.e., $(1 + \varepsilon)$-approximation for every $\varepsilon > 0$) for the classical $k$-means clustering problem in the QRAM model with a running time that has only polylogarithmic dependence on…

Quantum Physics · Physics 2025-05-27 Ragesh Jaiswal

Tight Sensitivity Bounds For Smaller Coresets

An $\varepsilon$-coreset for Least-Mean-Squares (LMS) of a matrix $A\in{\mathbb{R}}^{n\times d}$ is a small weighted subset of its rows that approximates the sum of squared distances from its rows to every affine $k$-dimensional subspace of…

Machine Learning · Computer Science 2019-07-03 Alaa Maalouf, Adiel Statman, Dan Feldman

A simple analysis of a quantum-inspired algorithm for solving low-rank linear systems

We describe and analyze a simple algorithm for sampling from the solution $\mathbf{x}^* := \mathbf{A}^+\mathbf{b}$ to a linear system $\mathbf{A}\mathbf{x} = \mathbf{b}$. We assume access to a sampler which allows us to draw indices…

Data Structures and Algorithms · Computer Science 2025-08-19 Tyler Chen, Junhyung Lyle Kim, Archan Ray, Shouvanik Chakrabarti, Dylan Herman, Niraj Kumar

Data-Efficient Learning via Clustering-Based Sensitivity Sampling: Foundation Models and Beyond

We study the data selection problem, whose aim is to select a small representative subset of data that can be used to efficiently train a machine learning model. We present a new data selection approach based on $k$-means clustering and…

Machine Learning · Computer Science 2024-02-28 Kyriakos Axiotis, Vincent Cohen-Addad, Monika Henzinger, Sammy Jerome, Vahab Mirrokni, David Saulpic, David Woodruff, Michael Wunder

Linear time small coresets for k-mean clustering of segments with applications

We study the $k$-means problem for a set $\mathcal{S} \subseteq \mathbb{R}^d$ of $n$ segments, aiming to find $k$ centers $X \subseteq \mathbb{R}^d$ that minimize $D(\mathcal{S},X) := \sum_{S \in \mathcal{S}} \min_{x \in X} D(S,x)$, where…

Machine Learning · Computer Science 2025-11-21 David Denisov, Shlomi Dolev, Dan Feldman, Michael Segal

Coresets for Clustering in Euclidean Spaces: Importance Sampling is Nearly Optimal

Given a collection of $n$ points in $\mathbb{R}^d$, the goal of the $(k,z)$-clustering problem is to find a subset of $k$ "centers" that minimizes the sum of the $z$-th powers of the Euclidean distance of each point to the closest center.…

Computational Geometry · Computer Science 2020-05-15 Lingxiao Huang, Nisheeth K. Vishnoi

Stable coresets: Unleashing the power of uniform sampling

Uniform sampling is a highly efficient method for data summarization. However, its effectiveness in producing coresets for clustering problems is not yet well understood, primarily because it generally does not yield a strong coreset, which…

Data Structures and Algorithms · Computer Science 2026-02-19 Amir Carmel, Robert Krauthgamer