Related papers: Modern Multiple Imputation with Functional Data
This work presents a new approach, called MISFIT, for fitting generalized functional linear regression models with sparsely and irregularly sampled data. Current methods do not allow for consistent estimation unless one assumes that the…
Electronic health records (EHR) consist of longitudinal clinical observations portrayed with sparsity, irregularity, and high-dimensionality, which become major obstacles in drawing reliable downstream clinical outcomes. Although there…
The problem of missing values in multivariable time series is a key challenge in many applications such as clinical data mining. Although many imputation methods show their effectiveness in many applications, few of them are designed to…
Modern data acquisition based on high-throughput technology is often facing the problem of missing data. Algorithms commonly used in the analysis of such large-scale data often depend on a complete set. Missing value imputation offers a…
Multiple imputation is a highly recommended technique to deal with missing data, but the application to longitudinal datasets can be done in multiple ways. When a new wave of longitudinal data arrives, we can treat the combined data of…
Imputing data is a critical issue for machine learning practitioners, including in the life sciences domain, where missing clinical data is a typical situation and the reliability of the imputation is of great importance. Currently, there…
Electronic Medical Records (EHR) are extremely sparse. Only a small proportion of events (symptoms, diagnoses, and treatments) are observed in the lifetime of an individual. The high degree of missingness of EHR can be attributed to a large…
Sparsity is a common issue in many trajectory datasets, including human mobility data. This issue frequently brings more difficulty to relevant learning tasks, such as trajectory imputation and prediction. Nowadays, little existing work…
International comparisons of hierarchical time series data sets based on survey data, such as annual country-level estimates of school enrollment rates, can suffer from large amounts of missing data due to differing coverage of surveys…
Electronic health records (EHR) are characterized as non-stationary, heterogeneous, noisy, and sparse data; therefore, it is challenging to learn the regularities or patterns inherent within them. In particular, sparseness caused mostly by…
We present a comprehensive analysis of deep learning approaches for Electronic Health Record (EHR) time-series imputation, examining how architectural and framework biases combine to influence model performance. Our investigation reveals…
We consider inference for misaligned multivariate functional data that represents the same underlying curve, but where the functional samples have systematic differences in shape. In this paper we introduce a new class of generally…
From a model-building perspective, we propose a paradigm shift for fitting over-parameterized models. Philosophically, the mindset is to fit models to future observations rather than to the observed sample. Technically, given an imputation…
A reduced-rank mixed effects model is developed for robust modeling of sparsely observed paired functional data. In this model, the curves for each functional variable are summarized using a few functional principal components, and the…
Longitudinal studies are frequently used in medical research and involve collecting repeated measures on individuals over time. Observations from the same individual are invariably correlated and thus an analytic approach that accounts for…
We present a framework for generating multiple imputations for continuous data when the missing data mechanism is unknown. Imputations are generated from more than one imputation model in order to incorporate uncertainty regarding the…
Missing values in electronic health record (EHR) data pose a significant challenge for epidemiologic research. Traditional methods for handling missing data, like mean imputation, may introduce bias. Multiple imputation (MI) offers a…
Multiple imputation is a straightforward method for handling missing data in a principled fashion. This paper presents an overview of multiple imputation, including important theoretical results and their practical implications for…
The authors derive likelihood-based exact inference methods for the multivariate regression model, for singly imputed synthetic data generated via Posterior Predictive Sampling (PPS) and for multiply imputed synthetic data generated via a…
This article develops a novel data assimilation methodology, addressing challenges that are common in real-world settings, such as severe sparsity of observations, lack of reliable models, and non-stationarity of the system dynamics. These…