Dataset distillation compresses large datasets into smaller synthetic coresets that retain performance, with the aim of reducing the storage and computational burden of processing the entire dataset. Today's best-performing algorithm, \textit{Kernel Inducing Points} (KIP), which exploits the correspondence between infinite-width neural networks and kernel ridge regression, is prohibitively slow because it computes the neural tangent kernel matrix exactly, which scales as $O(|S|^2)$, where $|S|$ is the coreset size. To improve on this, we propose a novel algorithm that uses a random feature approximation (RFA) of the Neural Network Gaussian Process (NNGP) kernel, which reduces the kernel matrix computation to $O(|S|)$. Our algorithm provides at least a 100-fold speedup over KIP and can run on a single GPU. Our new method, termed RFA Distillation (RFAD), performs competitively with KIP and other dataset condensation algorithms in accuracy over a range of large-scale datasets, both in kernel regression and finite-width network training. We demonstrate the effectiveness of our approach on tasks involving model interpretability and privacy preservation.
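To make the random feature idea concrete, the following is a minimal sketch and not the RFAD implementation. It assumes a hypothetical one-hidden-layer ReLU network with standard Gaussian weights, for which the expected feature inner product has a known closed form (half of the degree-1 arc-cosine kernel). The function names, the feature count, and the toy points are illustrative assumptions, and only the Python standard library is used so that the example stays self-contained. Each point is featurized independently, so the featurization cost grows linearly with the number of points, and the kernel entries are then approximated by inner products of the features.

```python
import math
import random


def relu_random_features(points, num_features, seed=0):
    """Map each point to num_features random ReLU features.

    Every point is featurized independently with the same random weights,
    so the cost of this step is linear in len(points).
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Assumption: one hidden layer with i.i.d. N(0, 1) weights and no bias.
    weights = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
               for _ in range(num_features)]
    scale = 1.0 / math.sqrt(num_features)
    return [
        [scale * max(0.0, sum(w_i * x_i for w_i, x_i in zip(w, x)))
         for w in weights]
        for x in points
    ]


def exact_relu_kernel(x, y):
    """Closed form of E_w[relu(w.x) * relu(w.y)] for w ~ N(0, I)."""
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    cos_theta = sum(a * b for a, b in zip(x, y)) / (norm_x * norm_y)
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))  # clamp for safety
    angular = math.sin(theta) + (math.pi - theta) * math.cos(theta)
    return norm_x * norm_y * angular / (2.0 * math.pi)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Toy "coreset" of three unit vectors (illustrative values only).
points = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.0, 1.0]]
features = relu_random_features(points, num_features=20000)

# Approximate kernel from feature inner products vs. the closed form.
approx = [[dot(f_i, f_j) for f_j in features] for f_i in features]
exact = [[exact_relu_kernel(x_i, x_j) for x_j in points] for x_i in points]
max_error = max(abs(approx[i][j] - exact[i][j])
                for i in range(3) for j in range(3))
print(f"max abs error: {max_error:.4f}")
```

The approximation error shrinks roughly as the inverse square root of the number of random features, so the feature count trades accuracy against compute. Deeper or convolutional feature maps, as would be needed for image datasets, are outside the scope of this sketch.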
@article{arxiv.2210.12067,
title = {Efficient Dataset Distillation Using Random Feature Approximation},
author = {Noel Loo and Ramin Hasani and Alexander Amini and Daniela Rus},
journal = {arXiv preprint arXiv:2210.12067},
year = {2022}
}
Comments: Accepted to the Conference on the Advances in Neural Information Processing Systems (NeurIPS) 2022