Related papers: Efficient Dataset Distillation Using Random Featur…

200 papers

One of the most fundamental aspects of any machine learning algorithm is the training data used by the algorithm. We introduce the novel concept of $\epsilon$-approximation of datasets, obtaining datasets which are much smaller than or are…

Machine Learning · Computer Science 2021-03-24 Timothy Nguyen, Zhourong Chen, Jaehoon Lee

Dataset Distillation is the task of synthesizing small datasets from large ones while still retaining comparable predictive accuracy to the original uncompressed dataset. Despite significant empirical progress in recent years, there is…

Machine Learning · Computer Science 2023-05-24 Alaa Maalouf, Murad Tukan, Noel Loo, Ramin Hasani, Mathias Lechner, Daniela Rus

Dataset distillation aims to learn a small synthetic dataset that preserves most of the information from the original dataset. Dataset distillation can be formulated as a bi-level meta-learning problem where the outer loop optimizes the…

Machine Learning · Computer Science 2022-10-25 Yongchao Zhou, Ehsan Nezhadarya, Jimmy Ba

Dataset distillation has emerged as a powerful approach for reducing data requirements in deep learning. Among various methods, distribution matching-based approaches stand out for their balance of computational efficiency and strong…

Computer Vision and Pattern Recognition · Computer Science 2025-03-03 Shaobo Wang, Yicun Yang, Zhiyuan Liu, Chenghao Sun, Xuming Hu, Conghui He, Linfeng Zhang

Contemporary machine learning requires training large neural networks on massive datasets and thus faces the challenges of high computational demands. Dataset distillation, as a recent emerging strategy, aims to compress real-world datasets…

Computer Vision and Pattern Recognition · Computer Science 2024-03-20 Peng Sun, Bei Shi, Daiwei Yu, Tao Lin

The effectiveness of machine learning algorithms arises from being able to extract useful features from large amounts of data. As model and dataset sizes increase, dataset distillation methods that compress large datasets into significantly…

Machine Learning · Computer Science 2022-01-19 Timothy Nguyen, Roman Novak, Lechao Xiao, Jaehoon Lee

Dataset distillation provides an effective approach to reduce memory and computational costs by optimizing a compact dataset that achieves performance comparable to the full original. However, for large-scale datasets and complex deep…

Computer Vision and Pattern Recognition · Computer Science 2025-11-14 Xinhao Zhong, Shuoyang Sun, Xulin Gu, Zhaoyang Xu, Yaowei Wang, Min Zhang, Bin Chen

Dataset distillation or condensation aims to generate a smaller but representative subset from a large dataset, which allows a model to be trained more efficiently, meanwhile evaluating on the original testing data distribution to achieve…

Computer Vision and Pattern Recognition · Computer Science 2024-11-26 Zeyuan Yin, Zhiqiang Shen

Dataset distillation is the technique of synthesizing smaller condensed datasets from large original datasets while retaining necessary information to persist the effect. In this paper, we approach the dataset distillation problem from a…

Computer Vision and Pattern Recognition · Computer Science 2023-12-15 Mingyang Chen, Bo Huang, Junda Lu, Bing Li, Yi Wang, Minhao Cheng, Wei Wang

We propose a new dataset distillation algorithm using reparameterization and convexification of implicit gradients (RCIG), that substantially improves the state-of-the-art. To this end, we first formulate dataset distillation as a bi-level…

Machine Learning · Computer Science 2023-11-13 Noel Loo, Ramin Hasani, Mathias Lechner, Daniela Rus

Dataset distillation aims to find a synthetic training set such that training on the synthetic data achieves similar performance to training on real data, with orders of magnitude less computational requirements. Existing methods can be…

Machine Learning · Computer Science 2026-02-09 Hong Ye Tan, Emma Slade

We propose Duality Gap KIP (DGKIP), an extension of the Kernel Inducing Points (KIP) method for dataset distillation. While existing dataset distillation methods often rely on bi-level optimization, DGKIP eliminates the need for such…

Dataset distillation has demonstrated strong performance on simple datasets like CIFAR, MNIST, and TinyImageNet but struggles to achieve similar results in more complex scenarios. In this paper, we propose EDF (emphasizes the discriminative…

Computer Vision and Pattern Recognition · Computer Science 2025-04-01 Kai Wang, Zekai Li, Zhi-Qi Cheng, Samir Khaki, Ahmad Sajedi, Ramakrishna Vedantam, Konstantinos N Plataniotis, Alexander Hauptmann, Yang You

Data distillation aims to generate a small data set that closely mimics the performance of a given learning algorithm on the original data set. The distilled dataset is hence useful to simplify the training process thanks to its small data…

Machine Learning · Computer Science 2024-04-23 Margarita Vinaroz, Mi Jung Park

Dataset distillation seeks to synthesize a highly compact dataset that achieves performance comparable to the original dataset on downstream tasks. For classification tasks that use pre-trained self-supervised models as backbones,…

Computer Vision and Pattern Recognition · Computer Science 2026-05-12 Qianxin Xia, Jiawei Du, Xin Zhang, Yuhan Zhang, Jielei Wang, Guoming Lu

Dataset distillation has become a popular method for compressing large datasets into smaller, more efficient representations while preserving critical information for model training. Data features are broadly categorized into two types:…

Computer Vision and Pattern Recognition · Computer Science 2025-03-25 Minh-Tuan Tran, Trung Le, Xuan-May Le, Thanh-Toan Do, Dinh Phung

Dataset distillation aims to synthesize small datasets with little information loss from original large-scale ones for reducing storage and training costs. Recent state-of-the-art methods mainly constrain the sample synthesis process by…

Computer Vision and Pattern Recognition · Computer Science 2023-08-31 Yanqing Liu, Jianyang Gu, Kai Wang, Zheng Zhu, Wei Jiang, Yang You

Dataset Distillation (DD) compresses large datasets into compact synthetic ones that maintain training performance. However, current methods mainly target sample reduction, with limited consideration of data precision and its impact on…

Computer Vision and Pattern Recognition · Computer Science 2026-03-04 My H. Dinh, Aditya Sant, Akshay Malhotra, Keya Patani, Shahab Hamidi-Rad

Dataset distillation enables the training of deep neural networks with comparable performance in significantly reduced time by compressing large datasets into small and representative ones. Although the introduction of generative models has…

Machine Learning · Computer Science 2025-05-27 Mingzhuo Li, Guang Li, Jiafeng Mao, Takahiro Ogawa, Miki Haseyama

Gaussian processes (GPs) are flexible models that can capture complex structure in large-scale datasets due to their non-parametric nature. However, the usage of GPs in real-world applications is limited due to their high computational cost…

Machine Learning · Statistics 2018-11-06 Congzheng Song, Yiming Sun