Related papers: Distributed Semi-Supervised Sparse Statistical Inf…

Dependable Exploitation of High-Dimensional Unlabeled Data in an Assumption-Lean Framework

Semi-supervised learning has attracted significant attention due to the proliferation of applications featuring limited labeled data but abundant unlabeled data. In this paper, we examine the statistical inference problem in an…

Methodology · Statistics 2026-03-31 Chao Ying, Siyi Deng, Yang Ning, Jiwei Zhao, Heping Zhang

Debiased distributed learning for sparse partial linear models in high dimensions

Although various distributed machine learning schemes have been proposed recently for pure linear models and fully nonparametric models, little attention has been paid to distributed optimization for semi-parametric models with…

Machine Learning · Statistics 2019-11-05 Shaogao Lv, Heng Lian

Semi-supervised linear regression: enhancing efficiency and robustness in high dimensions

In semi-supervised learning, the prevailing understanding suggests that observing additional unlabeled samples improves estimation accuracy for linear parameters only in the case of model misspecification. In this work, we challenge such a…

Methodology · Statistics 2025-09-03 Kai Chen, Yuqian Zhang

Debiased Prediction Inference with Non-sparse Loadings in Misspecified High-dimensional Regression Models

High-dimensional regression models with regularized sparse estimation are widely applied. For statistical inferences, debiased methods are available about single coefficients or predictions with sparse new covariate vectors (also called…

Statistics Theory · Mathematics 2025-07-16 Libin Liang, Zhiqiang Tan

Semi-supervised Inference for Explained Variance in High-dimensional Linear Regression and Its Applications

This paper considers statistical inference for the explained variance $\beta^{\intercal}\Sigma \beta$ under the high-dimensional linear model $Y=X\beta+\epsilon$ in the semi-supervised setting, where $\beta$ is the regression vector and…

Methodology · Statistics 2020-12-01 T. Tony Cai, Zijian Guo

Bayesian Semi-supervised Inference via a Debiased Modeling Approach

Inference in semi-supervised (SS) settings has gained substantial attention in recent years due to increased relevance in modern big-data problems. In a typical SS setting, there is a much larger-sized unlabeled data, containing only…

Methodology · Statistics 2025-09-23 Gözde Sert, Abhishek Chakrabortty, Anirban Bhattacharya

A Unified Framework for Semiparametrically Efficient Semi-Supervised Learning

We consider statistical inference under a semi-supervised setting where we have access to both a labeled dataset consisting of pairs $\{X_i, Y_i \}_{i=1}^n$ and an unlabeled dataset $\{ X_i \}_{i=n+1}^{n+N}$. We ask the question: under what…

Statistics Theory · Mathematics 2025-03-20 Zichun Xu, Daniela Witten, Ali Shojaie

Doubly Debiased Lasso: High-Dimensional Inference under Hidden Confounding

Inferring causal relationships or related associations from observational data can be invalidated by the existence of hidden confounding. We focus on a high-dimensional linear regression setting, where the measured covariates are affected…

Methodology · Statistics 2021-07-22 Zijian Guo, Domagoj Ćevid, Peter Bühlmann

Debiased Lasso After Sample Splitting for Estimation and Inference in High Dimensional Generalized Linear Models

We consider random sample splitting for estimation and inference in high dimensional generalized linear models, where we first apply the lasso to select a submodel using one subsample and then apply the debiased lasso to fit the selected…

Methodology · Statistics 2023-03-01 Omar Vazquez, Bin Nan

High-Dimensional Inference for Generalized Linear Models with Hidden Confounding

Statistical inferences for high-dimensional regression models have been extensively studied for their wide applications ranging from genomics, neuroscience, to economics. However, in practice, there are often potential unmeasured…

Methodology · Statistics 2023-09-12 Jing Ouyang, Kean Ming Tan, Gongjun Xu

Debiased regression adjustment in completely randomized experiments with moderately high-dimensional covariates

Completely randomized experiment is the gold standard for causal inference. When the covariate information for each experimental candidate is available, one typical way is to include them in covariate adjustments for more accurate treatment…

Methodology · Statistics 2025-06-10 Xin Lu, Fan Yang, Yuhao Wang

Statistical Inference on High Dimensional Gaussian Graphical Regression Models

Gaussian graphical regressions have emerged as a powerful approach for regressing the precision matrix of a Gaussian graphical model on covariates, which, unlike traditional Gaussian graphical models, can help determine how graphs are…

Methodology · Statistics 2025-01-17 Xuran Meng, Jingfei Zhang, Yi Li

A statistical mechanics approach to de-biasing and uncertainty estimation in LASSO for random measurements

In high-dimensional statistical inference in which the number of parameters to be estimated is larger than that of the holding data, regularized linear estimation techniques are widely used. These techniques have, however, some drawbacks.…

Methodology · Statistics 2025-08-06 Takashi Takahashi, Yoshiyuki Kabashima

Communication-efficient sparse regression: a one-shot approach

We devise a one-shot approach to distributed sparse regression in the high-dimensional setting. The key idea is to average "debiased" or "desparsified" lasso estimators. We show the approach converges at the same rate as the lasso as long…

Machine Learning · Statistics 2015-08-12 Jason D. Lee, Yuekai Sun, Qiang Liu, Jonathan E. Taylor

Confidence intervals for high-dimensional inverse covariance estimation

We propose methodology for statistical inference for low-dimensional parameters of sparse precision matrices in a high-dimensional setting. Our method leads to a non-sparse estimator of the precision matrix whose entries have a Gaussian…

Statistics Theory · Mathematics 2015-08-13 Jana Jankova, Sara van de Geer

Distributed Estimation and Inference for Semi-parametric Binary Response Models

The development of modern technology has enabled data collection of unprecedented size, which poses new challenges to many statistical estimation and inference problems. This paper studies the maximum score estimator of a semi-parametric…

Statistics Theory · Mathematics 2025-02-25 Xi Chen, Wenbo Jing, Weidong Liu, Yichen Zhang

Statistical inference using debiased group graphical lasso for multiple sparse precision matrices

Debiasing group graphical lasso estimates enables statistical inference when multiple Gaussian graphical models share a common sparsity pattern. We analyze the estimation properties of group graphical lasso, establishing convergence rates…

Statistics Theory · Mathematics 2025-10-07 Sayan Ranjan Bhowal, Debashis Paul, Gopal K Basak, Samarjit Das

Efficient Distributed Learning with Sparsity

We propose a novel, efficient approach for distributed sparse learning in high-dimensions, where observations are randomly partitioned across machines. Computationally, at each round our method only requires the master machine to solve a…

Machine Learning · Statistics 2016-05-26 Jialei Wang, Mladen Kolar, Nathan Srebro, Tong Zhang

Debiasing the Debiased Lasso with Bootstrap

We consider statistical inference for a single coordinate of regression coefficients in high-dimensional linear models. Recently, the debiased estimators are popularly used for constructing confidence intervals and hypothesis testing in…

Statistics Theory · Mathematics 2020-10-20 Sai Li

Semi-Supervised Sparse Gaussian Classification: Provable Benefits of Unlabeled Data

The premise of semi-supervised learning (SSL) is that combining labeled and unlabeled data yields significantly more accurate models. Despite empirical successes, the theoretical understanding of SSL is still far from complete. In this…

Machine Learning · Statistics 2024-09-06 Eyar Azar, Boaz Nadler