English
Related papers

Related papers: Joint Coreset Construction and Quantization for Di…

200 papers

Coreset, which is a summary of the original dataset in the form of a small weighted set in the same sample space, provides a promising approach to enable machine learning over distributed data. Although viewed as a proxy of the original…

Machine Learning · Computer Science 2020-06-24 Hanlin Lu , Ming-Ju Li , Ting He , Shiqiang Wang , Vijaykrishnan Narayanan , Kevin S Chan

We investigate coresets - succinct, small summaries of large data sets - so that solutions found on the summary are provably competitive with solution found on the full data set. We provide an overview over the state-of-the-art in coreset…

Machine Learning · Statistics 2017-06-06 Olivier Bachem , Mario Lucic , Andreas Krause

Coreset of a given dataset and loss function is usually a small weighed set that approximates this loss for every query from a given set of queries. Coresets have shown to be very useful in many applications. However, coresets construction…

Machine Learning · Computer Science 2021-11-05 Alaa Maalouf , Gilad Eini , Ben Mussay , Dan Feldman , Margarita Osadchy

In optimization or machine learning problems we are given a set of items, usually points in some metric space, and the goal is to minimize or maximize an objective function over some space of candidate solutions. For example, in clustering…

Machine Learning · Computer Science 2020-11-19 Dan Feldman

A coreset for a set of points is a small subset of weighted points that approximately preserves important properties of the original set. Specifically, if $P$ is a set of points, $Q$ is a set of queries, and $f:P\times Q\to\mathbb{R}$ is a…

Data Structures and Algorithms · Computer Science 2022-09-20 Vladimir Braverman , Dan Feldman , Harry Lang , Adiel Statman , Samson Zhou

We study the problem of constructing coresets for $(k, z)$-clustering when the input dataset is corrupted by stochastic noise drawn from a known distribution. In this setting, evaluating the quality of a coreset is inherently challenging,…

Machine Learning · Computer Science 2025-10-28 Lingxiao Huang , Zhize Li , Nisheeth K. Vishnoi , Runkai Yang , Haoyu Zhao

A coreset is a point set containing information about geometric properties of a larger point set. A series of previous works show that in many machine learning problems, especially in clustering problems, coreset could be very useful to…

Data Structures and Algorithms · Computer Science 2022-10-18 Yichuan Deng , Zhao Song , Yitan Wang , Yuanyuan Yang

The increasing availability of massive data sets poses a series of challenges for machine learning. Prominent among these is the need to learn models under hardware or human resource constraints. In such resource-constrained settings, a…

Machine Learning · Computer Science 2021-09-28 Zalán Borsos , Mojmír Mutný , Marco Tagliasacchi , Andreas Krause

Efficient and scalable non-parametric or semi-parametric regression analysis and density estimation are of crucial importance to the fields of statistics and machine learning. However, available methods are limited in their ability to…

Machine Learning · Computer Science 2026-03-23 Zeyu Ding , Katja Ickstadt , Nadja Klein , Alexander Munteanu , Simon Omlor

A coreset is a subset of the training set, using which a machine learning algorithm obtains performances similar to what it would deliver if trained over the whole original data. Coreset discovery is an active and open line of research as…

Machine Learning · Computer Science 2020-02-21 Pietro Barbiero , Giovanni Squillero , Alberto Tonda

Quantum neural networks (QNNs) and quantum kernels stand as prominent figures in the realm of quantum machine learning, poised to leverage the nascent capabilities of near-term quantum computers to surmount classical machine learning…

Quantum Physics · Physics 2023-12-14 Yiming Huang , Huiyuan Wang , Yuxuan Du , Xiao Yuan

Coreset (or core-set) is a small weighted \emph{subset} $Q$ of an input set $P$ with respect to a given \emph{monotonic} function $f:\mathbb{R}\to\mathbb{R}$ that \emph{provably} approximates its fitting loss $\sum_{p\in P}f(p\cdot x)$ to…

Machine Learning · Computer Science 2021-12-24 Elad Tolochinsky , Ibrahim Jubran , Dan Feldman

Specific data compression techniques, formalized by the concept of coresets, proved to be powerful for many optimization problems. In fact, while tightly controlling the approximation error, coresets may lead to significant speed up of the…

Optimization and Control · Mathematics 2022-04-05 Maximilian Fiedler , Peter Gritzmann , Fabian Klemm

We study the problem of distributed mean estimation and optimization under communication constraints. We propose a correlated quantization protocol whose leading term in the error guarantee depends on the mean deviation of data points…

Machine Learning · Computer Science 2022-07-12 Ananda Theertha Suresh , Ziteng Sun , Jae Hun Ro , Felix Yu

Coreset selection is powerful in reducing computational costs and accelerating data processing for deep learning algorithms. It strives to identify a small subset from large-scale data, so that training only on the subset practically…

Machine Learning · Computer Science 2024-03-01 Xiaobo Xia , Jiale Liu , Shaokun Zhang , Qingyun Wu , Hongxin Wei , Tongliang Liu

A coreset is a tiny weighted subset of an input set, that closely resembles the loss function, with respect to a certain set of queries. Coresets became prevalent in machine learning as they have shown to be advantageous for many…

Machine Learning · Computer Science 2023-05-23 Alaa Maalouf , Murad Tukan , Vladimir Braverman , Daniela Rus

Coresets are compact representations of data sets such that models trained on a coreset are provably competitive with models trained on the full data set. As such, they have been successfully used to scale up clustering models to massive…

Machine Learning · Statistics 2018-06-08 Olivier Bachem , Mario Lucic , Andreas Krause

Recent trend towards increasing large machine learning models require both training and inference tasks to be distributed. Considering the huge cost of training these models, it is imperative to unlock optimizations in computation and…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-03-29 Abhinav Jangda , Jun Huang , Guodong Liu , Amir Hossein Nodehi Sabet , Saeed Maleki , Youshan Miao , Madanlal Musuvathi , Todd Mytkowicz , Olli Sarikivi

Coresets are small data summaries that are sufficient for model training. They can be maintained online, enabling efficient handling of large data streams under resource constraints. However, existing constructions are limited to simple…

Machine Learning · Computer Science 2020-10-23 Zalán Borsos , Mojmír Mutný , Andreas Krause

In the rapidly evolving research on artificial intelligence (AI) the demand for fast, computationally efficient, and scalable solutions has increased in recent years. The problem of optimizing the computing resources for distributed machine…

Machine Learning · Computer Science 2025-10-30 Mohammadreza Doostmohammadian , Zulfiya R. Gabidullina , Hamid R. Rabiee
‹ Prev 1 2 3 10 Next ›