Related papers: A Novel Sequential Coreset Method for Gradient Des…

Stable Coresets via Posterior Sampling: Aligning Induced and Full Loss Landscapes

As deep learning models continue to scale, the growing computational demands have amplified the need for effective coreset selection techniques. Coreset selection aims to accelerate training by identifying small, representative subsets of…

Machine Learning · Computer Science 2025-11-24 Wei-Kai Chang, Rajiv Khanna

Dataset Condensation with Gradient Matching

As the state-of-the-art machine learning methods in many fields rely on larger datasets, storing datasets and training models on them become significantly more expensive. This paper proposes a training set synthesis technique for…

Computer Vision and Pattern Recognition · Computer Science 2021-03-09 Bo Zhao, Konda Reddy Mopuri, Hakan Bilen

Refined Coreset Selection: Towards Minimal Coreset Size under Model Performance Constraints

Coreset selection is powerful in reducing computational costs and accelerating data processing for deep learning algorithms. It strives to identify a small subset from large-scale data, so that training only on the subset practically…

Machine Learning · Computer Science 2024-03-01 Xiaobo Xia, Jiale Liu, Shaokun Zhang, Qingyun Wu, Hongxin Wei, Tongliang Liu

Data-Dependent Coresets for Compressing Neural Networks with Applications to Generalization Bounds

We present an efficient coresets-based neural network compression algorithm that sparsifies the parameters of a trained fully-connected neural network in a manner that provably approximates the network's output. Our approach is based on an…

Machine Learning · Computer Science 2019-05-21 Cenk Baykal, Lucas Liebenwein, Igor Gilitschenski, Dan Feldman, Daniela Rus

A Unified Approach to Coreset Learning

A coreset of a given dataset and loss function is usually a small weighted set that approximates this loss for every query from a given set of queries. Coresets have been shown to be very useful in many applications. However, coresets construction…

Machine Learning · Computer Science 2021-11-05 Alaa Maalouf, Gilad Eini, Ben Mussay, Dan Feldman, Margarita Osadchy

Layered Sampling for Robust Optimization Problems

In the real world, datasets often contain outliers, and these outliers can seriously affect the final machine learning result. Most existing algorithms for handling outliers have high time complexity (e.g. quadratic or cubic…

Computational Geometry · Computer Science 2020-02-28 Hu Ding, Zixiu Wang

Data-Independent Neural Pruning via Coresets

Previous work showed empirically that large neural networks can be significantly reduced in size while preserving their accuracy. Model compression became a central research topic, as it is crucial for deployment of neural networks on…

Machine Learning · Computer Science 2020-01-06 Ben Mussay, Margarita Osadchy, Vladimir Braverman, Samson Zhou, Dan Feldman

Uncovering Coresets for Classification With Multi-Objective Evolutionary Algorithms

A coreset is a subset of the training set on which a machine learning algorithm achieves performance similar to what it would deliver if trained on the whole original data. Coreset discovery is an active and open line of research as…

Machine Learning · Computer Science 2020-02-21 Pietro Barbiero, Giovanni Squillero, Alberto Tonda

Delving into Effective Gradient Matching for Dataset Condensation

As deep learning models and datasets rapidly scale up, network training is extremely time-consuming and resource-costly. Instead of training on the entire dataset, learning with a small synthetic dataset becomes an efficient solution.…

Machine Learning · Computer Science 2022-08-02 Zixuan Jiang, Jiaqi Gu, Mingjie Liu, David Z. Pan

Gradient-matching coresets for continual learning

We devise a coreset selection method based on the idea of gradient matching: The gradients induced by the coreset should match, as closely as possible, those induced by the original training dataset. We evaluate the method in the context of…

Machine Learning · Computer Science 2021-12-10 Lukas Balles, Giovanni Zappella, Cédric Archambeau

Bayesian Coresets: Revisiting the Nonconvex Optimization Perspective

Bayesian coresets have emerged as a promising approach for implementing scalable Bayesian inference. The Bayesian coreset problem involves selecting a (weighted) subset of the data samples, such that the posterior inference using the…

Machine Learning · Statistics 2021-03-01 Jacky Y. Zhang, Rajiv Khanna, Anastasios Kyrillidis, Oluwasanmi Koyejo

Data Summarization via Bilevel Optimization

The increasing availability of massive data sets poses a series of challenges for machine learning. Prominent among these is the need to learn models under hardware or human resource constraints. In such resource-constrained settings, a…

Machine Learning · Computer Science 2021-09-28 Zalán Borsos, Mojmír Mutný, Marco Tagliasacchi, Andreas Krause

COMET: A Novel Memory-Efficient Deep Learning Training Framework by Using Error-Bounded Lossy Compression

Training wide and deep neural networks (DNNs) requires large amounts of storage resources such as memory because the intermediate activation data must be saved in memory during forward propagation and then restored for backward…

Artificial Intelligence · Computer Science 2021-11-19 Sian Jin, Chengming Zhang, Xintong Jiang, Yunhe Feng, Hui Guan, Guanpeng Li, Shuaiwen Leon Song, Dingwen Tao

A Compressed Gradient Tracking Method for Decentralized Optimization with Linear Convergence

Communication compression techniques are of growing interest for solving the decentralized optimization problem under limited communication, where the global objective is to minimize the average of local cost functions over a multi-agent…

Optimization and Control · Mathematics 2022-05-26 Yiwei Liao, Zhuorui Li, Kun Huang, Shi Pu

Improved Distribution Matching for Dataset Condensation

Dataset Condensation aims to condense a large dataset into a smaller one while maintaining its ability to train a well-performing model, thus reducing the storage cost and training effort in deep learning applications. However, conventional…

Machine Learning · Computer Science 2023-07-20 Ganlong Zhao, Guanbin Li, Yipeng Qin, Yizhou Yu

Coordinate Descent Algorithms

Coordinate descent algorithms solve optimization problems by successively performing approximate minimization along coordinate directions or coordinate hyperplanes. They have been used in applications for many years, and their popularity…

Optimization and Control · Mathematics 2015-02-18 Stephen J. Wright

A Block Decomposition Algorithm for Sparse Optimization

Sparse optimization is a central problem in machine learning and computer vision. However, this problem is inherently NP-hard and thus difficult to solve in general. Combinatorial search methods find the global optimal solution but are…

Optimization and Control · Mathematics 2020-06-30 Ganzhao Yuan, Li Shen, Wei-Shi Zheng

Beyond Discrete Selection: Continuous Embedding Space Optimization for Generative Feature Selection

The goal of Feature Selection - comprising filter, wrapper, and embedded approaches - is to find the optimal feature subset for designated downstream tasks. Nevertheless, current feature selection methods are limited by: 1) the selection…

Machine Learning · Computer Science 2023-09-18 Meng Xiao, Dongjie Wang, Min Wu, Pengfei Wang, Yuanchun Zhou, Yanjie Fu

A Coreset Learning Reality Check

Subsampling algorithms are a natural approach to reduce data size before fitting models on massive datasets. In recent years, several works have proposed methods for subsampling rows from a data matrix while maintaining relevant information…

Machine Learning · Computer Science 2023-01-18 Fred Lu, Edward Raff, James Holt

A Coreset Selection of Coreset Selection Literature: Introduction and Recent Advances

Coreset selection targets the challenge of finding a small, representative subset of a large dataset that preserves essential patterns for effective machine learning. Although several surveys have examined data reduction strategies before,…

Machine Learning · Computer Science 2026-01-30 Brian B. Moser, Arundhati S. Shanbhag, Stanislav Frolov, Federico Raue, Joachim Folz, Andreas Dengel