Related papers: A Statistical Perspective on Coreset Density Estim…

200 papers

Coresets are among the most popular paradigms for summarizing data. In particular, there exist many high performance coresets for clustering problems such as $k$-means in both theory and practice. Curiously, there exists no work on…

Data Structures and Algorithms · Computer Science 2022-07-05 Chris Schwiegelshohn, Omar Ali Sheikh-Omar

Kernel regression is an essential and ubiquitous tool for non-parametric data analysis, particularly popular among time series and spatial data. However, the central operation which is performed many times, evaluating a kernel on the data…

Machine Learning · Computer Science 2017-06-01 Yan Zheng, Jeff M. Phillips

Coreset selection is powerful in reducing computational costs and accelerating data processing for deep learning algorithms. It strives to identify a small subset from large-scale data, so that training only on the subset practically…

Machine Learning · Computer Science 2024-03-01 Xiaobo Xia, Jiale Liu, Shaokun Zhang, Qingyun Wu, Hongxin Wei, Tongliang Liu

The size of large, geo-located datasets has reached scales where visualization of all data points is inefficient. Random sampling is a method to reduce the size of a dataset, yet it can introduce unwanted errors. We describe a method for…

Human-Computer Interaction · Computer Science 2017-09-14 Yan Zheng, Yi Ou, Alexander Lex, Jeff M. Phillips

Coreset is usually a small weighted subset of $n$ input points in $\mathbb{R}^d$, that provably approximates their loss function for a given set of queries (models, classifiers, etc.). Coresets become increasingly common in machine learning…

Machine Learning · Computer Science 2020-06-22 Murad Tukan, Alaa Maalouf, Dan Feldman

Bayesian coresets have emerged as a promising approach for implementing scalable Bayesian inference. The Bayesian coreset problem involves selecting a (weighted) subset of the data samples, such that the posterior inference using the…

Machine Learning · Statistics 2021-03-01 Jacky Y. Zhang, Rajiv Khanna, Anastasios Kyrillidis, Oluwasanmi Koyejo

The coresets approach, also called subsampling or subset selection, aims to select a subsample as a surrogate for the observed sample and has found extensive applications in large-scale data analysis. Existing coresets methods construct the…

Computation · Statistics 2024-09-17 Mengyu Li, Jun Yu, Tao Li, Cheng Meng

Specific data compression techniques, formalized by the concept of coresets, proved to be powerful for many optimization problems. In fact, while tightly controlling the approximation error, coresets may lead to significant speed up of the…

Optimization and Control · Mathematics 2022-04-05 Maximilian Fiedler, Peter Gritzmann, Fabian Klemm

In optimization or machine learning problems we are given a set of items, usually points in some metric space, and the goal is to minimize or maximize an objective function over some space of candidate solutions. For example, in clustering…

Machine Learning · Computer Science 2020-11-19 Dan Feldman

A coreset (or core-set) of a dataset is its semantic compression with respect to a set of queries, such that querying the (small) coreset provably yields an approximate answer to querying the original (full) dataset. In the last decade,…

Robotics · Computer Science 2017-12-19 Soliman Nasser, Ibrahim Jubran, Dan Feldman

A coreset for a set of points is a small subset of weighted points that approximately preserves important properties of the original set. Specifically, if $P$ is a set of points, $Q$ is a set of queries, and $f:P\times Q\to\mathbb{R}$ is a…

Data Structures and Algorithms · Computer Science 2022-09-20 Vladimir Braverman, Dan Feldman, Harry Lang, Adiel Statman, Samson Zhou

A coreset is a point set containing information about geometric properties of a larger point set. A series of previous works show that in many machine learning problems, especially in clustering problems, coreset could be very useful to…

Data Structures and Algorithms · Computer Science 2022-10-18 Yichuan Deng, Zhao Song, Yitan Wang, Yuanyuan Yang

A coreset is a subset of the training set, using which a machine learning algorithm obtains performances similar to what it would deliver if trained over the whole original data. Coreset discovery is an active and open line of research as…

Machine Learning · Computer Science 2020-02-21 Pietro Barbiero, Giovanni Squillero, Alberto Tonda

Coresets are small data summaries that are sufficient for model training. They can be maintained online, enabling efficient handling of large data streams under resource constraints. However, existing constructions are limited to simple…

Machine Learning · Computer Science 2020-10-23 Zalán Borsos, Mojmír Mutný, Andreas Krause

A coreset (or core-set) of an input set is its small summation, such that solving a problem on the coreset as its input, provably yields the same result as solving the same problem on the original (full) set, for a given family of problems…

Machine Learning · Computer Science 2019-10-22 Ibrahim Jubran, Alaa Maalouf, Dan Feldman

The increasing availability of massive data sets poses a series of challenges for machine learning. Prominent among these is the need to learn models under hardware or human resource constraints. In such resource-constrained settings, a…

Machine Learning · Computer Science 2021-09-28 Zalán Borsos, Mojmír Mutný, Marco Tagliasacchi, Andreas Krause

Coresets are one of the central methods to facilitate the analysis of large data sets. We continue a recent line of research applying the theory of coresets to logistic regression. First, we show a negative result, namely, that no strongly…

Data Structures and Algorithms · Computer Science 2021-03-09 Alexander Munteanu, Chris Schwiegelshohn, Christian Sohler, David P. Woodruff

A \emph{strong coreset} for the mean queries of a set $P$ in ${\mathbb{R}}^d$ is a small weighted subset $C\subseteq P$, which provably approximates its sum of squared distances to any center (point) $x\in {\mathbb{R}}^d$. A \emph{weak…

Machine Learning · Computer Science 2021-11-05 Alaa Maalouf, Ibrahim Jubran, Dan Feldman

We refine and generalize what is known about coresets for classification problems via the sensitivity sampling framework. Such coresets seek the smallest possible subsets of input data, so one can optimize a loss function on the coreset and…

Machine Learning · Computer Science 2024-07-24 Meysam Alishahi, Jeff M. Phillips

Coresets are compact representations of data sets such that models trained on a coreset are provably competitive with models trained on the full data set. As such, they have been successfully used to scale up clustering models to massive…

Machine Learning · Statistics 2018-06-08 Olivier Bachem, Mario Lucic, Andreas Krause