Related papers: Imputation for High-Dimensional Linear Regression

200 papers

Advancements in data collection techniques and the heterogeneity of data resources can yield high percentages of missing observations on variables, such as block-wise missing data. Under missing-data scenarios, traditional methods such as…

Methodology · Statistics 2022-05-17 Wei Lan, Xuerong Chen, Tao Zou, Chih-Ling Tsai

This paper is concerned with inference on the regression function of a high-dimensional linear model when outcomes are missing at random. We propose an estimator which combines a Lasso pilot estimate of the regression function with a bias…

Methodology · Statistics 2024-12-11 Yikun Zhang, Alexander Giessing, Yen-Chi Chen

Although a majority of the theoretical literature in high-dimensional statistics has focused on settings which involve fully-observed data, settings with missing values and corruptions are common in practice. We consider the problems of…

Machine Learning · Statistics 2017-11-06 Yining Wang, Jialei Wang, Sivaraman Balakrishnan, Aarti Singh

We study regression discontinuity designs in which many predetermined covariates, possibly much more than the number of observations, can be used to increase the precision of treatment effect estimates. We consider a two-step estimator…

Econometrics · Economics 2022-05-06 Alexander Kreiß, Christoph Rothe

In this paper we recast the problem of missing values in the covariates of a regression model as a latent Gaussian Markov random field (GMRF) model in a fully Bayesian framework. Our proposed approach is based on the definition of the…

Computation · Statistics 2019-12-24 Virgilio Gómez-Rubio, Michela Cameletti, Marta Blangiardo

Sparse regression such as the Lasso has achieved great success in handling high-dimensional data. However, one of the biggest practical problems is that high-dimensional data often contain large amounts of missing values. Convex Conditioned…

Machine Learning · Statistics 2019-06-20 Masaaki Takada, Hironori Fujisawa, Takeichiro Nishikawa

This paper studies the inference of the regression coefficient matrix under multivariate response linear regressions in the presence of hidden variables. A novel procedure for constructing confidence intervals of entries of the coefficient…

Methodology · Statistics 2022-01-21 Xin Bing, Wei Cheng, Huijie Feng, Yang Ning

This paper studies inference in the high-dimensional linear regression model with outliers. Sparsity constraints are imposed on the vector of coefficients of the covariates. The number of outliers can grow with the sample size while their…

Statistics Theory · Mathematics 2021-02-08 Jad Beyhum

We propose a residual randomization procedure designed for robust Lasso-based inference in the high-dimensional setting. Compared to earlier work that focuses on sub-Gaussian errors, the proposed procedure is designed to work robustly in…

Methodology · Statistics 2021-08-20 Y. Samuel Wang, Si Kai Lee, Panos Toulis, Mladen Kolar

Missing data are frequently encountered in high-dimensional problems, but they are usually difficult to deal with using standard algorithms, such as the expectation-maximization (EM) algorithm and its variants. To tackle this difficulty,…

Methodology · Statistics 2018-02-08 Faming Liang, Bochao Jia, Jingnan Xue, Qizhai Li, Ye Luo

For statistical inference on regression models with a diverging number of covariates, the existing literature typically makes sparsity assumptions on the inverse of the Fisher information matrix. Such assumptions, however, are often…

Methodology · Statistics 2021-06-08 Lu Xia, Bin Nan, Yi Li

Missing covariates in regression or classification problems can prohibit the direct use of advanced tools for further analysis. Recent research has realized an increasing trend towards the usage of modern Machine Learning algorithms for…

Machine Learning · Statistics 2022-03-23 Burim Ramosaj, Justus Tulowietzki, Markus Pauly

This research deals with the estimation and imputation of missing data in longitudinal models with a Poisson response variable inflated with zeros. A methodology is proposed that is based on the use of maximum likelihood, assuming that data…

Methodology · Statistics 2024-09-18 D. S. Martinez-Lobo, O. O. Melo, N. A. Cruz

Missing data theory deals with statistical methods for handling missing data. Missing data occur when some values are not stored or observed for variables of interest. However, most of the statistical theory assumes that data…

A basic principle in the design of observational studies is to approximate the randomized experiment that would have been conducted under controlled circumstances. Now, linear regression models are commonly used to analyze observational…

Methodology · Statistics 2022-07-08 Ambarish Chattopadhyay, Jose R. Zubizarreta

This paper develops a new framework, called modular regression, to utilize auxiliary information -- such as variables other than the original features or additional data sets -- in the training process of linear models. At a high level, our…

Methodology · Statistics 2023-11-27 Ying Jin, Dominik Rothenhäusler

Regression models with both high-dimensional responses and covariates have attracted growing attention. Standard multivariate regression models become inadequate when the response variables depend not only on observed covariates but also on…

Methodology · Statistics 2026-05-01 Jing Ouyang, Chengyu Cui, Yunxiao Chen, Kean Ming Tan, Gongjun Xu

Inferring causal relationships or related associations from observational data can be invalidated by the existence of hidden confounding. We focus on a high-dimensional linear regression setting, where the measured covariates are affected…

Methodology · Statistics 2021-07-22 Zijian Guo, Domagoj Ćevid, Peter Bühlmann

Missing data arise when certain values are not recorded or observed for variables of interest. However, most statistical theory assumes complete data availability. To address incomplete databases, one approach is to fill the gaps…

Beta regression is commonly employed when the outcome variable is a proportion. Since its conception, the approach has been widely used in applications spanning various scientific fields. A series of extensions have been proposed over time,…

Methodology · Statistics 2025-07-29 Niloofar Ramezani, Martin Slawski