Learning from data with structured missingness
Abstract
Missing data are an unavoidable complication in many machine learning tasks. When data are `missing at random' there exist a range of tools and techniques to deal with the issue. However, as machine learning studies become more ambitious, and seek to learn from ever-larger volumes of heterogeneous data, an increasingly encountered problem arises in which missing values exhibit an association or structure, either explicitly or implicitly. Such `structured missingness' raises a range of challenges that have not yet been systematically addressed, and presents a fundamental hindrance to machine learning at scale. Here, we outline the current literature and propose a set of grand challenges in learning from data with structured missingness.
- Publication:
-
arXiv e-prints
- Pub Date:
- April 2023
- DOI:
- 10.48550/arXiv.2304.01429
- arXiv:
- arXiv:2304.01429
- Bibcode:
- 2023arXiv230401429M
- Keywords:
-
- Statistics - Machine Learning;
- Computer Science - Machine Learning