Learning from data with structured missingness
Machine Learning
2023-04-05 v1
Abstract
Missing data are an unavoidable complication in many machine learning tasks. When data are 'missing at random', a range of tools and techniques exists to deal with the issue. However, as machine learning studies become more ambitious and seek to learn from ever-larger volumes of heterogeneous data, they increasingly encounter a problem in which missing values exhibit an association or structure, either explicitly or implicitly. Such 'structured missingness' raises a range of challenges that have not yet been systematically addressed, and presents a fundamental hindrance to machine learning at scale. Here, we outline the current literature and propose a set of grand challenges in learning from data with structured missingness.
Cite
@article{arxiv.2304.01429,
title = {Learning from data with structured missingness},
author = {Robin Mitra and Sarah F. McGough and Tapabrata Chakraborti and Chris Holmes and Ryan Copping and Niels Hagenbuch and Stefanie Biedermann and Jack Noonan and Brieuc Lehmann and Aditi Shenvi and Xuan Vinh Doan and David Leslie and Ginestra Bianconi and Ruben Sanchez-Garcia and Alisha Davies and Maxine Mackintosh and Eleni-Rosalina Andrinopoulou and Anahid Basiri and Chris Harbron and Ben D. MacArthur},
journal = {arXiv preprint arXiv:2304.01429},
year = {2023}
}