Datamodels: Predicting Predictions from Training Data
Abstract
We present a conceptual framework, datamodeling, for analyzing the behavior of a model class in terms of the training data. For any fixed "target" example $x$, training set $S$, and learning algorithm, a datamodel is a parameterized function $2^S \to \mathbb{R}$ that, for any subset $S' \subset S$, predicts the outcome of training a model on $S'$ and evaluating on $x$, using only information about which examples of $S$ are contained in $S'$. Despite the potential complexity of the underlying process being approximated (e.g., end-to-end training and evaluation of deep neural networks), we show that even simple linear datamodels can successfully predict model outputs. We then demonstrate that datamodels give rise to a variety of applications, such as accurately predicting the effect of dataset counterfactuals, identifying brittle predictions, finding semantically similar examples, quantifying train-test leakage, and embedding data into a well-behaved and feature-rich representation space. Data for this paper (including pre-computed datamodels as well as raw predictions from four million trained deep neural networks) is available at https://github.com/MadryLab/datamodels-data .
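To make the idea of a linear datamodel concrete, the following is a minimal sketch and not the authors' implementation. It assumes a synthetic stand-in for "train on a subset and evaluate on the target example" (the hypothetical function `simulated_output`), samples random subsets as 0/1 inclusion masks, and fits the linear map with ordinary least squares. The estimator, the subset sampling rate `alpha`, and all sizes are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_train = 50      # size of the full training set S (illustrative)
n_subsets = 2000  # number of random subsets S' used to fit the datamodel
alpha = 0.5       # assumed probability that each example is included in S'

# Hypothetical stand-in for "train a model on S' and evaluate on x".
# Here the outcome is a noisy linear function of the inclusion mask;
# in practice this step would be an actual training run.
true_weights = rng.normal(size=n_train)

def simulated_output(mask: np.ndarray) -> float:
    return float(mask @ true_weights + 0.1 * rng.normal())

# Each row is the indicator vector of one subset S' of S.
masks = (rng.random((n_subsets, n_train)) < alpha).astype(float)
outputs = np.array([simulated_output(m) for m in masks])

# Fit a linear datamodel: output ~ theta . mask + bias (ordinary least squares).
design = np.hstack([masks, np.ones((n_subsets, 1))])
coef, *_ = np.linalg.lstsq(design, outputs, rcond=None)
theta, bias = coef[:-1], coef[-1]

def datamodel_predict(mask: np.ndarray) -> float:
    """Predict the outcome for a subset using only its inclusion mask."""
    return float(mask @ theta + bias)

# Check predictions on fresh subsets that were not used for fitting.
test_masks = (rng.random((200, n_train)) < alpha).astype(float)
test_outputs = np.array([simulated_output(m) for m in test_masks])
test_preds = test_masks @ theta + bias
corr = np.corrcoef(test_preds, test_outputs)[0, 1]
print(f"held-out correlation: {corr:.3f}")
```

In this sketch each entry of `theta` plays the role of the estimated effect of including one training example on the target outcome. Because the simulated outcome is itself linear, the fit is nearly exact here; real training outcomes need not behave this way.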
Cite
@article{arxiv.2202.00622,
title = {Datamodels: Predicting Predictions from Training Data},
author = {Andrew Ilyas and Sung Min Park and Logan Engstrom and Guillaume Leclerc and Aleksander Madry},
journal = {arXiv preprint arXiv:2202.00622},
year = {2022}
}