Related papers: Semi-supervised learning using copula-based regres…

Semi-Supervised learning with Density-Ratio Estimation

In this paper, we study statistical properties of semi-supervised learning, which is considered an important problem in the machine learning community. In standard supervised learning, only labeled data is observed. The…

Machine Learning · Statistics 2012-04-19 Masanori Kawakita, Takafumi Kanamori

Semi-supervised linear regression: enhancing efficiency and robustness in high dimensions

In semi-supervised learning, the prevailing understanding suggests that observing additional unlabeled samples improves estimation accuracy for linear parameters only in the case of model misspecification. In this work, we challenge such a…

Methodology · Statistics 2025-09-03 Kai Chen, Yuqian Zhang

Optimal and Safe Estimation for High-Dimensional Semi-Supervised Learning

We consider the estimation problem in high-dimensional semi-supervised learning. Our goal is to investigate when and how the unlabeled data can be exploited to improve the estimation of the regression parameters of a linear model in light of…

Methodology · Statistics 2023-03-21 Siyi Deng, Yang Ning, Jiwei Zhao, Heping Zhang

Efficient semi-supervised inference for logistic regression under case-control studies

Semi-supervised learning has received increasing attention in statistics and machine learning. In semi-supervised learning settings, a labeled data set with both outcomes and covariates and an unlabeled data set with covariates only are…

Machine Learning · Statistics 2024-02-26 Zhuojun Quan, Yuanyuan Lin, Kani Chen, Wen Yu

Semi-supervised learning and the question of true versus estimated propensity scores

A straightforward application of semi-supervised machine learning to the problem of treatment effect estimation would be to consider data as "unlabeled" if treatment assignment and covariates are observed but outcomes are unobserved.…

Methodology · Statistics 2020-09-15 Andrew Herren, P. Richard Hahn

Semi-supervised learning

Semi-supervised learning deals with the problem of how, if possible, to take advantage of a huge amount of unclassified data to perform classification in situations where, typically, the labelled data are few. Even though this is not…

Statistics Theory · Mathematics 2017-12-18 Alejandro Cholaquidis, Ricardo Fraiman, Mariela Sued

Adaptive Semisupervised Inference

Semisupervised methods inevitably invoke some assumption that links the marginal distribution of the features to the regression function of the label. Most commonly, the cluster or manifold assumptions are used which imply that the…

Statistics Theory · Mathematics 2011-12-02 Martin Azizyan, Aarti Singh, Larry Wasserman

Semi-Supervised Learning by Augmented Distribution Alignment

In this work, we propose a simple yet effective semi-supervised learning approach called Augmented Distribution Alignment. We reveal that an essential sampling bias exists in semi-supervised learning due to the limited number of labeled…

Computer Vision and Pattern Recognition · Computer Science 2019-08-20 Qin Wang, Wen Li, Luc Van Gool

Semi-supervised learning for linear extremile regression

Extremile regression, as a least squares analog of quantile regression, is a potentially useful tool for modeling and understanding the extreme tails of a distribution. However, existing extremile regression methods, as nonparametric…

Methodology · Statistics 2025-07-03 Rong Jiang, Keming Yu, Jiangfeng Wang

On semi-supervised learning

Semi-supervised learning deals with the problem of how, if possible, to take advantage of a huge amount of unclassified data, to perform a classification in situations when, typically, there is little labeled data. Even though this is not…

Machine Learning · Statistics 2020-12-11 Alejandro Cholaquidis, Ricardo Fraiman, Mariela Sued

Dependable Exploitation of High-Dimensional Unlabeled Data in an Assumption-Lean Framework

Semi-supervised learning has attracted significant attention due to the proliferation of applications featuring limited labeled data but abundant unlabeled data. In this paper, we examine the statistical inference problem in an…

Methodology · Statistics 2026-03-31 Chao Ying, Siyi Deng, Yang Ning, Jiwei Zhao, Heping Zhang

Improvability Through Semi-Supervised Learning: A Survey of Theoretical Results

Semi-supervised learning is a setting in which one has labeled and unlabeled data available. In this survey we explore different types of theoretical results when one uses unlabeled data in classification and regression tasks. Most methods…

Machine Learning · Computer Science 2020-07-31 Alexander Mey, Marco Loog

Asymptotic Analysis of Generative Semi-Supervised Learning

Semisupervised learning has emerged as a popular framework for improving modeling accuracy while controlling labeling cost. Based on an extension of stochastic composite likelihood we quantify the asymptotic accuracy of generative…

Machine Learning · Computer Science 2010-03-02 Joshua V Dillon, Krishnakumar Balasubramanian, Guy Lebanon

Extending the rank likelihood for semiparametric copula estimation

Quantitative studies in many fields involve the analysis of multivariate data of diverse types, including measurements that we may consider binary, ordinal and continuous. One approach to the analysis of such mixed data is to use a copula…

Statistics Theory · Mathematics 2007-06-13 Peter D. Hoff

Semiparametric copula-based quantile regression for semicontinuous outcomes with application to healthcare data

A semiparametric copula-based two-part quantile regression framework is developed for the analysis of semicontinuous outcomes characterized by a point mass at zero and a continuous positive component. The proposed approach models the…

Methodology · Statistics 2026-03-17 Guanjie Lyu, Mohamed Belalia, Abdulkadir Hussein

Imputations for High Missing Rate Data in Covariates via Semi-supervised Learning Approach

Advancements in data collection techniques and the heterogeneity of data resources can yield high percentages of missing observations on variables, such as block-wise missing data. Under missing-data scenarios, traditional methods such as…

Methodology · Statistics 2022-05-17 Wei Lan, Xuerong Chen, Tao Zou, Chih-Ling Tsai

Semi-supervised Deep Learning for Image Classification with Distribution Mismatch: A Survey

Deep learning methodologies have been employed in several different fields, with outstanding success in image recognition applications such as material quality control, medical imaging, and autonomous driving. Deep learning models rely…

Computer Vision and Pattern Recognition · Computer Science 2022-03-11 Saul Calderon-Ramirez, Shengxiang Yang, David Elizondo

Semi-Supervised linear regression

We study a regression problem where for some part of the data we observe both the label variable ($Y$) and the predictors (${\bf X}$), while for the other part of the data only the predictors are given. Such a problem arises, for example, when…

Statistics Theory · Mathematics 2021-04-14 David Azriel, Lawrence D. Brown, Michael Sklar, Richard Berk, Andreas Buja, Linda Zhao

Semi-Supervised Empirical Risk Minimization: Using unlabeled data to improve prediction

We present a general methodology for using unlabeled data to design semi-supervised learning (SSL) variants of the Empirical Risk Minimization (ERM) learning process. Focusing on generalized linear regression, we analyze the…

Machine Learning · Statistics 2022-03-08 Oren Yuval, Saharon Rosset

Reliable Semi-Supervised Learning when Labels are Missing at Random

Semi-supervised learning methods are motivated by the availability of large datasets with unlabeled features in addition to labeled data. Unlabeled data is, however, not guaranteed to improve classification performance and has in fact been…

Machine Learning · Statistics 2019-10-25 Xiuming Liu, Dave Zachariah, Johan Wågberg, Thomas B. Schön