Related papers: Dataset Distillation Using Parameter Pruning

Distill the Best, Ignore the Rest: Improving Dataset Distillation with Loss-Value-Based Pruning

Dataset distillation has gained significant interest in recent years, yet existing approaches typically distill from the entire dataset, potentially including non-beneficial samples. We introduce a novel "Prune First, Distill After"…

Computer Vision and Pattern Recognition · Computer Science 2026-01-06 Brian B. Moser , Federico Raue , Tobias C. Nauen , Stanislav Frolov , Andreas Dengel

Hierarchical Features Matter: A Deep Exploration of Progressive Parameterization Method for Dataset Distillation

Dataset distillation is an emerging dataset reduction method, which condenses large-scale datasets while maintaining task accuracy. Current parameterization methods achieve enhanced performance under extremely high compression ratio by…

Computer Vision and Pattern Recognition · Computer Science 2025-03-20 Xinhao Zhong , Hao Fang , Bin Chen , Xulin Gu , Meikang Qiu , Shuhan Qi , Shu-Tao Xia

Dataset Distillation by Matching Training Trajectories

Dataset distillation is the task of synthesizing a small dataset such that a model trained on the synthetic set will match the test accuracy of the model trained on the full dataset. In this paper, we propose a new formulation that…

Computer Vision and Pattern Recognition · Computer Science 2022-03-23 George Cazenavette , Tongzhou Wang , Antonio Torralba , Alexei A. Efros , Jun-Yan Zhu

Dataset Distillation via Adversarial Prediction Matching

Dataset distillation is the technique of synthesizing smaller condensed datasets from large original datasets while retaining necessary information to persist the effect. In this paper, we approach the dataset distillation problem from a…

Computer Vision and Pattern Recognition · Computer Science 2023-12-15 Mingyang Chen , Bo Huang , Junda Lu , Bing Li , Yi Wang , Minhao Cheng , Wei Wang

Deep Neural Compression Via Concurrent Pruning and Self-Distillation

Pruning aims to reduce the number of parameters while maintaining performance close to the original network. This work proposes a novel self-distillation based pruning strategy, whereby the representational similarity between the…

Machine Learning · Computer Science 2021-10-01 James O'Neill , Sourav Dutta , Haytham Assem

Improve Cross-Architecture Generalization on Dataset Distillation

Dataset distillation, a pragmatic approach in machine learning, aims to create a smaller synthetic dataset from a larger existing dataset. However, existing distillation methods primarily adopt a model-based paradigm, where the synthetic…

Machine Learning · Computer Science 2024-02-21 Binglin Zhou , Linhao Zhong , Wentao Chen

Parameterizing Dataset Distillation via Gaussian Splatting

Dataset distillation aims to compress training data while preserving training-aware knowledge, alleviating the reliance on large-scale datasets in modern model training. Dataset parameterization provides a more efficient storage structure…

Computer Vision and Pattern Recognition · Computer Science 2026-03-19 Chenyang Jiang , Zhengcen Li , Hang Zhao , Qiben Shan , Shaocong Wu , Jingyong Su

A Comprehensive Survey of Dataset Distillation

Deep learning technology has developed unprecedentedly in the last decade and has become the primary choice in many application domains. This progress is mainly attributed to a systematic collaboration in which rapidly growing computing…

Machine Learning · Computer Science 2023-12-27 Shiye Lei , Dacheng Tao

Generative Dataset Distillation: Balancing Global Structure and Local Details

In this paper, we propose a new dataset distillation method that considers balancing global structure and local details when distilling the information from a large dataset into a generative model. Dataset distillation has been proposed to…

Computer Vision and Pattern Recognition · Computer Science 2024-04-30 Longzhen Li , Guang Li , Ren Togo , Keisuke Maeda , Takahiro Ogawa , Miki Haseyama

New Properties of the Data Distillation Method When Working With Tabular Data

Data distillation is the problem of reducing the volume of training data while keeping only the necessary information. In this paper, we explore more deeply the new data distillation algorithm, previously designed for image data. Our experiments…

Machine Learning · Computer Science 2020-10-21 Dmitry Medvedev , Alexander D'yakonov

Dataset Distillation Meets Provable Subset Selection

Deep learning has grown tremendously over recent years, yielding state-of-the-art results in various fields. However, training such models requires huge amounts of data, increasing the computational time and cost. To address this, dataset…

Machine Learning · Computer Science 2023-07-18 Murad Tukan , Alaa Maalouf , Margarita Osadchy

A Comprehensive Study on Dataset Distillation: Performance, Privacy, Robustness and Fairness

The aim of dataset distillation is to encode the rich features of an original dataset into a tiny dataset. It is a promising approach to accelerate neural network training and related studies. Different approaches have been proposed to…

Machine Learning · Computer Science 2023-05-30 Zongxiong Chen , Jiahui Geng , Derui Zhu , Herbert Woisetschlaeger , Qing Li , Sonja Schimmler , Ruben Mayer , Chunming Rong

Data Distillation: A Survey

The popularity of deep learning has led to the curation of a vast number of massive and multifarious datasets. Despite having close-to-human performance on individual tasks, training parameter-hungry models on large datasets poses…

Machine Learning · Computer Science 2023-09-27 Noveen Sachdeva , Julian McAuley

Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality

Dataset distillation aims to minimize the time and memory needed for training deep networks on large datasets, by creating a small set of synthetic images that has a similar generalization performance to that of the full dataset. However,…

Computer Vision and Pattern Recognition · Computer Science 2023-10-12 Xuxi Chen , Yu Yang , Zhangyang Wang , Baharan Mirzasoleiman

Boost Self-Supervised Dataset Distillation via Parameterization, Predefined Augmentation, and Approximation

Although larger datasets are crucial for training large deep models, the rapid growth of dataset size has brought a significant challenge in terms of considerable training costs, which even results in prohibitive computational expenses.…

Computer Vision and Pattern Recognition · Computer Science 2025-08-06 Sheng-Feng Yu , Jia-Jiun Yao , Wei-Chen Chiu

Generalizing Dataset Distillation via Deep Generative Prior

Dataset Distillation aims to distill an entire dataset's knowledge into a few synthetic images. The idea is to synthesize a small number of synthetic data points that, when given to a learning algorithm as training data, result in a model…

Computer Vision and Pattern Recognition · Computer Science 2023-05-05 George Cazenavette , Tongzhou Wang , Antonio Torralba , Alexei A. Efros , Jun-Yan Zhu

Towards Consistent and Efficient Dataset Distillation via Diffusion-Driven Selection

Dataset distillation provides an effective approach to reduce memory and computational costs by optimizing a compact dataset that achieves performance comparable to the full original. However, for large-scale datasets and complex deep…

Computer Vision and Pattern Recognition · Computer Science 2025-11-14 Xinhao Zhong , Shuoyang Sun , Xulin Gu , Zhaoyang Xu , Yaowei Wang , Min Zhang , Bin Chen

Efficient Dataset Distillation for Pre-Trained Self-Supervised Models via Statistical Flow Matching

Dataset distillation seeks to synthesize a highly compact dataset that achieves performance comparable to the original dataset on downstream tasks. For classification tasks that use pre-trained self-supervised models as backbones,…

Computer Vision and Pattern Recognition · Computer Science 2026-05-12 Qianxin Xia , Jiawei Du , Xin Zhang , Yuhan Zhang , Jielei Wang , Guoming Lu

Dataset Distillation for Pre-Trained Self-Supervised Vision Models

The task of dataset distillation aims to find a small set of synthetic images such that training a model on them reproduces the performance of the same model trained on a much larger dataset of real samples. Existing distillation methods…

Computer Vision and Pattern Recognition · Computer Science 2025-11-21 George Cazenavette , Antonio Torralba , Vincent Sitzmann

Prioritize Alignment in Dataset Distillation

Dataset Distillation aims to compress a large dataset into a significantly more compact, synthetic one without compromising the performance of the trained models. To achieve this, existing methods use the agent model to extract information…

Machine Learning · Computer Science 2024-10-15 Zekai Li , Ziyao Guo , Wangbo Zhao , Tianle Zhang , Zhi-Qi Cheng , Samir Khaki , Kaipeng Zhang , Ahmad Sajedi , Konstantinos N Plataniotis , Kai Wang , Yang You