Related papers: Missing at random: a stochastic process perspectiv…
The concept of missing at random is central in the literature on statistical analysis with missing data. In general, inference using incomplete data should be based not only on observed data values but should also take account of the…
Practical problems with missing data are common, and statistical methods have been developed concerning the validity and/or efficiency of statistical procedures. A central and longstanding interest has been the mechanism…
Models for analyzing multivariate data sets with missing values require strong, often unassessable, assumptions. The most common of these is that the mechanism that created the missing data is ignorable - a twofold assumption dependent on…
This paper provides further insight into the key concept of missing at random (MAR) in incomplete data analysis. Following the usual selection modelling approach we envisage two models with separable parameters: a model for the response of…
We consider studies where multiple measures on an outcome variable are collected over time, but some subjects drop out before the end of follow up. Analyses of such data often proceed under either a 'last observation carried forward' or…
The regularization approach for variable selection was well developed for a completely observed data set in the past two decades. In the presence of missing values, this approach needs to be tailored to different missing data mechanisms. In…
We are concerned with clustering continuous data sets subject to non-ignorable missingness. We perform clustering with a specific semi-parametric mixture, under the assumption of conditional independence given the component. The mixture model…
Many real-world networks are known to exhibit facts that counter our knowledge prescribed by the theories on network creation and communication patterns. A common prerequisite in network analysis is that information on nodes and links will…
Missing data are an unavoidable complication in many machine learning tasks. When data are 'missing at random' there exist a range of tools and techniques to deal with the issue. However, as machine learning studies become more ambitious,…
When data are missing due to at most one cause from some time to next time, we can make sampling distribution inferences about the parameter of the data by modeling the missing-data mechanism correctly. Proverbially, in case its mechanism…
Today, data analysts largely rely on intuition to determine whether missing or withheld rows of a dataset significantly affect their analyses. We propose a framework that can produce automatic contingency analysis, i.e., the range of values…
When data are incomplete, a random vector Y for the data process together with a binary random vector R for the process that causes missing data, are modelled jointly. We review conditions under which R can be ignored for drawing likelihood…
The existence of the typical set is key for data compression strategies and for the emergence of robust statistical observables in macroscopic physical systems. Standard approaches derive its existence from a restricted set of…
The current literature regarding generation of complex, realistic synthetic tabular data, particularly for randomized controlled trials (RCTs), often ignores missing data. However, missing data are common in RCT data and often are not…
We develop a study of ignorability and conditions thereof for likelihood inference in the framework of stochastic processes. We define a coarsening model for processes which includes discrete-time observations as well as censored…
Conditions ensuring optimal parameter estimation in the presence of missing data are well established in inference, typically relying on the Missing-at-Random (MAR) assumption. In prediction, similar principles are often assumed to apply.…
During the past few decades, missing-data problems have been studied extensively, with a focus on the ignorable missing case, where the missing probability depends only on observable quantities. By contrast, research into non-ignorable…
Missing data can lead to inefficiencies and biases in analyses, in particular when data are missing not at random (MNAR). It is thus vital to understand and correctly identify the missing data mechanism. Recovering missing values through a…
An approach to amputation, the process of introducing missing values to a complete dataset, is presented. It allows one to construct missingness indicators in a flexible and principled way via copulas and Bernoulli margins and to incorporate…
When using ecological momentary assessment (EMA) data, missing data is pervasive, as participant attrition is a common issue. Thus, any EMA study must have a missing data plan. In this paper, we discuss missingness in time series analysis…