Generalized massive optimal data compression
Abstract
Data compression has become one of the cornerstones of modern astronomical data analysis, with the vast majority of analyses compressing large raw datasets down to a manageable number of informative summaries. In this paper we provide a general procedure for optimally compressing $N$ data down to $n$ summary statistics, where $n$ is equal to the number of parameters of interest. We show that compression to the score function (the gradient of the log-likelihood with respect to the parameters) yields compressed statistics that are optimal in the sense that they preserve the Fisher information content of the data. Our method generalizes earlier work on linear Karhunen-Loève compression for Gaussian data, and recovers both lossless linear compression and quadratic estimation as special cases when they are optimal. We give a unified treatment that also includes the general non-Gaussian case, provided mild regularity conditions are satisfied, producing optimal non-linear summary statistics when appropriate. As a worked example, we explicitly derive the optimal compressed statistics for Gaussian data in the general case where both the mean and covariance depend on the parameters.
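To make the Gaussian worked example concrete, here is a minimal sketch of score compression, assuming the standard expression for the gradient of a Gaussian log-likelihood evaluated at a fiducial parameter point. The function name `gaussian_score` and its argument names are hypothetical and are not taken from the paper; the derivatives of the mean and covariance are assumed to be supplied by the user.

```python
import numpy as np

def gaussian_score(d, mu, C, dmu, dC):
    """Score of a Gaussian log-likelihood at a fiducial point (illustrative sketch).

    d   : (N,)      data vector
    mu  : (N,)      fiducial mean
    C   : (N, N)    fiducial covariance
    dmu : (n, N)    derivatives of the mean w.r.t. each of the n parameters
    dC  : (n, N, N) derivatives of the covariance w.r.t. each parameter
    Returns an (n,) vector: one compressed summary per parameter.
    """
    # v = C^{-1} (d - mu), computed with a solve rather than an explicit inverse
    v = np.linalg.solve(C, d - mu)
    # Term linear in the data: dmu^T C^{-1} (d - mu)
    linear = dmu @ v
    # Term quadratic in the data: 1/2 (d - mu)^T C^{-1} dC C^{-1} (d - mu)
    quadratic = 0.5 * np.einsum("i,aij,j->a", v, dC, v)
    # Data-independent term from the log-determinant: 1/2 tr(C^{-1} dC)
    trace = 0.5 * np.array([np.trace(np.linalg.solve(C, dC_a)) for dC_a in dC])
    return linear + quadratic - trace
```

When the covariance does not depend on the parameters (`dC` is zero), only the linear term survives and the compression is linear in the data. The trace term does not depend on the data, so omitting it would shift each summary by a constant without changing its information content.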
Cite
@article{arxiv.1712.00012,
title = {Generalized massive optimal data compression},
author = {Justin Alsing and Benjamin Wandelt},
journal = {arXiv preprint arXiv:1712.00012},
year = {2018}
}
Comments
5 pages; updated to MNRAS Letters accepted version (3 Apr 2018)