Related papers: Coreset Clustering on Small Quantum Computers
Current quantum hardware prohibits any direct use of large classical datasets. Coresets allow for a succinct description of these large datasets and their solution in a computational task is competitive with the solution on the original…
Recent work has proposed solving the k-means clustering problem on quantum computers via the Quantum Approximate Optimization Algorithm (QAOA) and coreset techniques. Although the current method demonstrates the possibility of quantum…
Quantum computing is a promising paradigm based on quantum theory for performing fast computations. Quantum algorithms are expected to surpass their classical counterparts in terms of computational complexity for certain tasks, including…
We devise coresets for kernel $k$-Means with a general kernel, and use them to obtain new, more efficient, algorithms. Kernel $k$-Means has superior clustering capability compared to classical $k$-Means, particularly when clusters are…
$k$-Clustering in $\mathbb{R}^d$ (e.g., $k$-median and $k$-means) is a fundamental machine learning problem. While near-linear time approximation algorithms were known in the classical setting for a dataset with cardinality $n$, it remains…
Quantum computing is anticipated to offer immense computational capabilities which could provide efficient solutions to many data science problems. However, the current generation of quantum devices are small and noisy, which makes it…
Coresets are compact representations of data sets such that models trained on a coreset are provably competitive with models trained on the full data set. As such, they have been successfully used to scale up clustering models to massive…
We introduce hybrid classical-quantum algorithms for problems involving a large classical data set X and a space of models Y such that a quantum computer has superposition access to Y but not X. These algorithms use data reduction…
Machine learning algorithms perform well on identifying patterns in many different datasets due to their versatility. However, as one increases the size of the dataset, the computation time for training and using these statistical models…
Clustering is a powerful machine learning technique that groups "similar" data points based on their characteristics. Many clustering algorithms work by approximating the minimization of an objective function, namely the sum of…
In the NISQ-era of quantum computing, we should not expect to see quantum devices that provide an exponential improvement in runtime for practical problems, due to the lack of error correction and small number of qubits available.…
In a recent work, [19] studied the following "fair" variants of classical clustering problems such as $k$-means and $k$-median: given a set of $n$ data points in $\mathbb{R}^d$ and a binary type associated to each data point, the goal is to…
We design coresets for Ordered k-Median, a generalization of classical clustering problems such as k-Median and k-Center, that offers a more flexible data analysis, like easily combining multiple objectives (e.g., to increase fairness or…
We present methods for k-means clustering on a stream with a focus on providing fast responses to clustering queries. Compared to the current state-of-the-art, our methods provide substantial improvement in the query time for cluster…
$\renewcommand{\Re}{{\rm I\!\hspace{-0.025em} R}} \newcommand{\eps}{{\varepsilon}} \newcommand{\Coreset}{{\mathcal{S}}} $ In this paper, we show the existence of small coresets for the problems of computing $k$-median and $k$-means…
Quantum algorithms can be used to perform unsupervised machine learning tasks like data clustering by mapping the distance between data points to a graph optimization problem (i.e. MAXCUT) and finding optimal solution through energy…
Hybrid quantum-classical algorithms such as the quantum approximate optimization algorithm (QAOA) are considered one of the most promising approaches for leveraging near-term quantum computers for practical applications. Such algorithms are…
Quantum machine learning is one of the most promising applications of a full-scale quantum computer. Over the past few years, many quantum machine learning algorithms have been proposed that can potentially offer considerable speedups over…
Clustering is one of the most important tools for analysis of large datasets, and perhaps the most popular clustering algorithm is Lloyd's algorithm for $k$-means. This algorithm takes $n$ vectors $V=[v_1,\dots,v_n]\in\mathbb{R}^{d\times…
Coresets are among the most popular paradigms for summarizing data. In particular, there exist many high performance coresets for clustering problems such as $k$-means in both theory and practice. Curiously, there exists no work on…