Related papers: Sublinear Time Quantum Sensitivity Sampling

200 papers

Coresets are arguably the most popular compression paradigm for center-based clustering objectives such as $k$-means. Given a point set $P$, a coreset $\Omega$ is a small, weighted summary that preserves the cost of all candidate solutions…

Data Structures and Algorithms · Computer Science 2024-05-03 Nikhil Bansal, Vincent Cohen-Addad, Milind Prabhu, David Saulpic, Chris Schwiegelshohn

$k$-Clustering in $\mathbb{R}^d$ (e.g., $k$-median and $k$-means) is a fundamental machine learning problem. While near-linear time approximation algorithms were known in the classical setting for a dataset with cardinality $n$, it remains…

Quantum Physics · Physics 2023-06-06 Yecheng Xue, Xiaoyu Chen, Tongyang Li, Shaofeng H.-C. Jiang

A coreset for a set of points is a small subset of weighted points that approximately preserves important properties of the original set. Specifically, if $P$ is a set of points, $Q$ is a set of queries, and $f:P\times Q\to\mathbb{R}$ is a…

Data Structures and Algorithms · Computer Science 2022-09-20 Vladimir Braverman, Dan Feldman, Harry Lang, Adiel Statman, Samson Zhou

A conditional sampling oracle for a probability distribution D returns samples from the conditional distribution of D restricted to a specified subset of the domain. A recent line of work (Chakraborty et al. 2013 and Cannone et al. 2014)…

Data Structures and Algorithms · Computer Science 2016-08-18 Themistoklis Gouleakis, Christos Tzamos, Manolis Zampetakis

Machine learning algorithms perform well on identifying patterns in many different datasets due to their versatility. However, as one increases the size of the dataset, the computation time for training and using these statistical models…

Quantum Physics · Physics 2024-09-19 Abhijat Sarma, Rupak Chatterjee, Kaitlin Gili, Ting Yu

We investigate quantum algorithms for classification, a fundamental problem in machine learning, with provable guarantees. Given $n$ $d$-dimensional data points, the state-of-the-art (and optimal) classical algorithm for training…

Quantum Physics · Physics 2019-05-28 Tongyang Li, Shouvanik Chakrabarti, Xiaodi Wu

To accelerate kernel methods, we propose a near input sparsity time algorithm for sampling the high-dimensional feature space implicitly defined by a kernel transformation. Our main contribution is an importance sampling method for…

Data Structures and Algorithms · Computer Science 2020-07-15 David P. Woodruff, Amir Zandieh

The $k$-means algorithm (Lloyd's algorithm) is a widely used method for clustering unlabeled data. A key bottleneck of the $k$-means algorithm is that each iteration requires time linear in the number of data points, which can be expensive…

We study the theoretical and practical runtime limits of k-means and k-median clustering on large datasets. Since effectively all clustering methods are slower than the time it takes to read the dataset, the fastest approach is to quickly…

Machine Learning · Computer Science 2024-04-03 Andrew Draganov, David Saulpic, Chris Schwiegelshohn

Quantum machine learning is one of the most promising applications of a full-scale quantum computer. Over the past few years, many quantum machine learning algorithms have been proposed that can potentially offer considerable speedups over…

Quantum Physics · Physics 2021-06-14 Iordanis Kerenidis, Jonas Landman, Alessandro Luongo, Anupam Prakash

Quantum computing, with its potential to enhance various machine learning tasks, allows significant advancements in kernel calculation and model precision. Utilizing the one-class Support Vector Machine alongside a quantum kernel, known for…

Quantum machine learning with quantum kernels for classification problems is a growing area of research. Recently, quantum kernel alignment techniques that parameterise the kernel have been developed, allowing the kernel to be trained and…

Motivated by practical generalizations of the classic $k$-median and $k$-means objectives, such as clustering with size constraints, fair clustering, and Wasserstein barycenter, we introduce a meta-theorem for designing coresets for…

Data Structures and Algorithms · Computer Science 2022-09-20 Vladimir Braverman, Vincent Cohen-Addad, Shaofeng H.-C. Jiang, Robert Krauthgamer, Chris Schwiegelshohn, Mads Bech Toftrup, Xuan Wu

We give a quantum approximation scheme (i.e., $(1 + \varepsilon)$-approximation for every $\varepsilon > 0$) for the classical $k$-means clustering problem in the QRAM model with a running time that has only polylogarithmic dependence on…

Quantum Physics · Physics 2025-05-27 Ragesh Jaiswal

An $\varepsilon$-coreset for Least-Mean-Squares (LMS) of a matrix $A\in{\mathbb{R}}^{n\times d}$ is a small weighted subset of its rows that approximates the sum of squared distances from its rows to every affine $k$-dimensional subspace of…

Machine Learning · Computer Science 2019-07-03 Alaa Maalouf, Adiel Statman, Dan Feldman

We describe and analyze a simple algorithm for sampling from the solution $\mathbf{x}^* := \mathbf{A}^+\mathbf{b}$ to a linear system $\mathbf{A}\mathbf{x} = \mathbf{b}$. We assume access to a sampler which allows us to draw indices…

Data Structures and Algorithms · Computer Science 2025-08-19 Tyler Chen, Junhyung Lyle Kim, Archan Ray, Shouvanik Chakrabarti, Dylan Herman, Niraj Kumar

We study the data selection problem, whose aim is to select a small representative subset of data that can be used to efficiently train a machine learning model. We present a new data selection approach based on $k$-means clustering and…

We study the $k$-means problem for a set $\mathcal{S} \subseteq \mathbb{R}^d$ of $n$ segments, aiming to find $k$ centers $X \subseteq \mathbb{R}^d$ that minimize $D(\mathcal{S},X) := \sum_{S \in \mathcal{S}} \min_{x \in X} D(S,x)$, where…

Machine Learning · Computer Science 2025-11-21 David Denisov, Shlomi Dolev, Dan Feldman, Michael Segal

Given a collection of $n$ points in $\mathbb{R}^d$, the goal of the $(k,z)$-clustering problem is to find a subset of $k$ "centers" that minimizes the sum of the $z$-th powers of the Euclidean distance of each point to the closest center.…

Computational Geometry · Computer Science 2020-05-15 Lingxiao Huang, Nisheeth K. Vishnoi

Uniform sampling is a highly efficient method for data summarization. However, its effectiveness in producing coresets for clustering problems is not yet well understood, primarily because it generally does not yield a strong coreset, which…

Data Structures and Algorithms · Computer Science 2026-02-19 Amir Carmel, Robert Krauthgamer