Related papers: Semi-Supervised U-statistics

A Unified Framework for Semiparametrically Efficient Semi-Supervised Learning

We consider statistical inference under a semi-supervised setting where we have access to both a labeled dataset consisting of pairs $\{X_i, Y_i \}_{i=1}^n$ and an unlabeled dataset $\{ X_i \}_{i=n+1}^{n+N}$. We ask the question: under what…

Statistics Theory · Mathematics 2025-03-20 Zichun Xu, Daniela Witten, Ali Shojaie

Dependable Exploitation of High-Dimensional Unlabeled Data in an Assumption-Lean Framework

Semi-supervised learning has attracted significant attention due to the proliferation of applications featuring limited labeled data but abundant unlabeled data. In this paper, we examine the statistical inference problem in an…

Methodology · Statistics 2026-03-31 Chao Ying, Siyi Deng, Yang Ning, Jiwei Zhao, Heping Zhang

Semi-supervised linear regression: enhancing efficiency and robustness in high dimensions

In semi-supervised learning, the prevailing understanding suggests that observing additional unlabeled samples improves estimation accuracy for linear parameters only in the case of model misspecification. In this work, we challenge such a…

Methodology · Statistics 2025-09-03 Kai Chen, Yuqian Zhang

Semi-supervised learning

Semi-supervised learning deals with the problem of how, if possible, to take advantage of a huge amount of not classified data, to perform classification, in situations when, typically, the labelled data are few. Even though this is not…

Statistics Theory · Mathematics 2017-12-18 Alejandro Cholaquidis, Ricardo Fraiman, Mariela Sued

Scaling Up Semi-supervised Learning with Unconstrained Unlabelled Data

We propose UnMixMatch, a semi-supervised learning framework which can learn effective representations from unconstrained unlabelled data in order to scale up performance. Most existing semi-supervised methods rely on the assumption that…

Machine Learning · Computer Science 2024-01-17 Shuvendu Roy, Ali Etemad

Improvability Through Semi-Supervised Learning: A Survey of Theoretical Results

Semi-supervised learning is a setting in which one has labeled and unlabeled data available. In this survey we explore different types of theoretical results when one uses unlabeled data in classification and regression tasks. Most methods…

Machine Learning · Computer Science 2020-07-31 Alexander Mey, Marco Loog

Exploring Self-Supervised Regularization for Supervised and Semi-Supervised Learning

Recent advances in semi-supervised learning have shown tremendous potential in overcoming a major barrier to the success of modern machine learning algorithms: access to vast amounts of human-labeled training data. Previous algorithms based…

Machine Learning · Computer Science 2019-11-22 Phi Vu Tran

Efficient semi-supervised inference for logistic regression under case-control studies

Semi-supervised learning has received increasing attention in statistics and machine learning. In semi-supervised learning settings, a labeled data set with both outcomes and covariates and an unlabeled data set with covariates only are…

Machine Learning · Statistics 2024-02-26 Zhuojun Quan, Yuanyuan Lin, Kani Chen, Wen Yu

Semi-Supervised Data Programming with Subset Selection

The paradigm of data programming, which uses weak supervision in the form of rules/labelling functions, and semi-supervised learning, which augments small amounts of labelled data with a large unlabelled dataset, have shown great promise in…

Machine Learning · Computer Science 2021-06-15 Ayush Maheshwari, Oishik Chatterjee, KrishnaTeja Killamsetty, Ganesh Ramakrishnan, Rishabh Iyer

Optimal and Safe Estimation for High-Dimensional Semi-Supervised Learning

We consider the estimation problem in high-dimensional semi-supervised learning. Our goal is to investigate when and how the unlabeled data can be exploited to improve the estimation of the regression parameters of a linear model in light of…

Methodology · Statistics 2023-03-21 Siyi Deng, Yang Ning, Jiwei Zhao, Heping Zhang

Deep Positive-Unlabeled Anomaly Detection for Contaminated Unlabeled Data

Semi-supervised anomaly detection, which aims to improve the anomaly detection performance by using a small amount of labeled anomaly data in addition to unlabeled data, has attracted attention. Existing semi-supervised approaches assume…

Machine Learning · Statistics 2025-02-11 Hiroshi Takahashi, Tomoharu Iwata, Atsutoshi Kumagai, Yuuki Yamanaka

On semi-supervised learning

Semi-supervised learning deals with the problem of how, if possible, to take advantage of a huge amount of unclassified data, to perform a classification in situations when, typically, there is little labeled data. Even though this is not…

Machine Learning · Statistics 2020-12-11 Alejandro Cholaquidis, Ricardo Fraiman, Mariela Sued

Semi-Supervised learning with Density-Ratio Estimation

In this paper, we study statistical properties of semi-supervised learning, which is considered as an important problem in the community of machine learning. In the standard supervised learning, only the labeled data is observed. The…

Machine Learning · Statistics 2012-04-19 Masanori Kawakita, Takafumi Kanamori

Universal Semi-Supervised Semantic Segmentation

In recent years, the need for semantic segmentation has arisen across several different applications and environments. However, the expense and redundancy of annotation often limits the quantity of labels available for training in any…

Computer Vision and Pattern Recognition · Computer Science 2019-09-25 Tarun Kalluri, Girish Varma, Manmohan Chandraker, C V Jawahar

When can unlabeled data improve the learning rate?

In semi-supervised classification, one is given access both to labeled and unlabeled data. As unlabeled data is typically cheaper to acquire than labeled data, this setup becomes advantageous as soon as one can exploit the unlabeled data in…

Machine Learning · Computer Science 2022-02-10 Christina Göpfert, Shai Ben-David, Olivier Bousquet, Sylvain Gelly, Ilya Tolstikhin, Ruth Urner

Semi-supervised Semantic Segmentation via Boosting Uncertainty on Unlabeled Data

We bring a new perspective to semi-supervised semantic segmentation by providing an analysis on the labeled and unlabeled distributions in training datasets. We first figure out that the distribution gap between labeled and unlabeled…

Computer Vision and Pattern Recognition · Computer Science 2023-12-01 Daoan Zhang, Yunhao Luo, Jianguo Zhang

Semi-Supervised Classification Based on Classification from Positive and Unlabeled Data

Most of the semi-supervised classification methods developed so far use unlabeled data for regularization purposes under particular distributional assumptions such as the cluster assumption. In contrast, recently developed methods of…

Machine Learning · Computer Science 2017-06-19 Tomoya Sakai, Marthinus Christoffel du Plessis, Gang Niu, Masashi Sugiyama

Fairness in Semi-supervised Learning: Unlabeled Data Help to Reduce Discrimination

A growing specter in the rise of machine learning is whether the decisions made by machine learning models are fair. While research is already underway to formalize a machine-learning concept of fairness and to design frameworks for…

Machine Learning · Computer Science 2020-09-28 Tao Zhang, Tianqing Zhu, Jing Li, Mengde Han, Wanlei Zhou, Philip S. Yu

Semi-supervised Medical Image Segmentation via Query Distribution Consistency

Semi-supervised learning is increasingly popular in medical image segmentation due to its ability to leverage large amounts of unlabeled data to extract additional information. However, most existing semi-supervised segmentation methods…

Computer Vision and Pattern Recognition · Computer Science 2024-08-19 Rong Wu, Dehua Li, Cong Zhang

Semi-Supervised Deep Learning Using Improved Unsupervised Discriminant Projection

Deep learning demands a huge amount of well-labeled data to train the network parameters. How to use the least amount of labeled data to obtain the desired classification accuracy is of great practical significance, because for many…

Machine Learning · Computer Science 2019-12-20 Xiao Han, Zihao Wang, Enmei Tu, Gunnam Suryanarayana, Jie Yang