Related papers: Unsupervised Supervised Learning II: Training Marg…

Multi-class Classification from Multiple Unlabeled Datasets with Partial Risk Regularization

Recent years have witnessed a great success of supervised deep learning, where predictive models were trained from a large amount of fully labeled data. However, in practice, labeling such big data can be very costly and may not even be…

Machine Learning · Computer Science 2022-10-18 Yuting Tang, Nan Lu, Tianyi Zhang, Masashi Sugiyama

Unsupervised Sequence Classification using Sequential Output Statistics

We consider learning a sequence classifier without labeled data by using sequential output statistics. The problem is highly valuable since obtaining labels in training data is often costly, while the sequential output statistics (e.g.,…

Machine Learning · Computer Science 2017-05-30 Yu Liu, Jianshu Chen, Li Deng

Reliable Semi-Supervised Learning when Labels are Missing at Random

Semi-supervised learning methods are motivated by the availability of large datasets with unlabeled features in addition to labeled data. Unlabeled data is, however, not guaranteed to improve classification performance and has in fact been…

Machine Learning · Statistics 2019-10-25 Xiuming Liu, Dave Zachariah, Johan Wågberg, Thomas B. Schön

The Pessimistic Limits and Possibilities of Margin-based Losses in Semi-supervised Learning

Consider a classification problem where we have both labeled and unlabeled data available. We show that for linear classifiers defined by convex margin-based surrogate losses that are decreasing, it is impossible to construct any…

Machine Learning · Statistics 2019-01-09 Jesse H. Krijthe, Marco Loog

Semi-Supervised learning with Density-Ratio Estimation

In this paper, we study statistical properties of semi-supervised learning, which is considered an important problem in the machine learning community. In standard supervised learning, only the labeled data is observed. The…

Machine Learning · Statistics 2012-04-19 Masanori Kawakita, Takafumi Kanamori

Evaluating multiple models using labeled and unlabeled data

It remains difficult to evaluate machine learning classifiers in the absence of a large, labeled dataset. While labeled data can be prohibitively expensive or impossible to obtain, unlabeled data is plentiful. Here, we introduce…

Machine Learning · Computer Science 2025-10-15 Divya Shanmugam, Shuvom Sadhuka, Manish Raghavan, John Guttag, Bonnie Berger, Emma Pierson

ARSM Gradient Estimator for Supervised Learning to Rank

We propose a new model for supervised learning to rank. In our model, the relevance labels are assumed to follow a categorical distribution whose probabilities are constructed based on a scoring function. We optimize the training objective…

Machine Learning · Computer Science 2020-02-19 Siamak Zamani Dadaneh, Shahin Boluki, Mingyuan Zhou, Xiaoning Qian

Semi-supervised learning and the question of true versus estimated propensity scores

A straightforward application of semi-supervised machine learning to the problem of treatment effect estimation would be to consider data as "unlabeled" if treatment assignment and covariates are observed but outcomes are unobserved.…

Methodology · Statistics 2020-09-15 Andrew Herren, P. Richard Hahn

Semi-supervised logistic discrimination via labeled data and unlabeled data from different sampling distributions

This article addresses the problem of classification method based on both labeled and unlabeled data, where we assume that a density function for labeled data is different from that for unlabeled data. We propose a semi-supervised logistic…

Machine Learning · Statistics 2014-02-20 Shuichi Kawano

Self-Training: A Survey

Semi-supervised algorithms aim to learn prediction functions from a small set of labeled observations and a large set of unlabeled observations. Because this framework is relevant in many applications, they have received a lot of interest…

Machine Learning · Computer Science 2025-02-17 Massih-Reza Amini, Vasilii Feofanov, Loic Pauletto, Lies Hadjadj, Emilie Devijver, Yury Maximov

Semi-Supervised Empirical Risk Minimization: Using unlabeled data to improve prediction

We present a general methodology for using unlabeled data to design semi-supervised learning (SSL) variants of the Empirical Risk Minimization (ERM) learning process. Focusing on generalized linear regression, we analyze the…

Machine Learning · Statistics 2022-03-08 Oren Yuval, Saharon Rosset

Efficient semi-supervised inference for logistic regression under case-control studies

Semi-supervised learning has received increasing attention in statistics and machine learning. In semi-supervised learning settings, a labeled data set with both outcomes and covariates and an unlabeled data set with covariates only are…

Machine Learning · Statistics 2024-02-26 Zhuojun Quan, Yuanyuan Lin, Kani Chen, Wen Yu

Margin Maximization as Lossless Maximal Compression

The ultimate goal of a supervised learning algorithm is to produce models constructed on the training data that can generalize well to new examples. In classification, functional margin maximization -- correctly classifying as many training…

Machine Learning · Computer Science 2020-01-29 Nikolaos Nikolaou, Henry Reeve, Gavin Brown

On the Minimal Supervision for Training Any Binary Classifier from Only Unlabeled Data

Empirical risk minimization (ERM), with proper loss function and regularization, is the common practice of supervised classification. In this paper, we study training arbitrary (from linear to deep) binary classifier from only unlabeled (U)…

Machine Learning · Statistics 2019-03-13 Nan Lu, Gang Niu, Aditya Krishna Menon, Masashi Sugiyama

Optimal and Safe Estimation for High-Dimensional Semi-Supervised Learning

We consider the estimation problem in high-dimensional semi-supervised learning. Our goal is to investigate when and how the unlabeled data can be exploited to improve the estimation of the regression parameters of linear model in light of…

Methodology · Statistics 2023-03-21 Siyi Deng, Yang Ning, Jiwei Zhao, Heping Zhang

Self-Taught Metric Learning without Labels

We present a novel self-taught framework for unsupervised metric learning, which alternates between predicting class-equivalence relations between data through a moving average of an embedding model and learning the model with the predicted…

Computer Vision and Pattern Recognition · Computer Science 2022-05-05 Sungyeon Kim, Dongwon Kim, Minsu Cho, Suha Kwak

Gradient-based Label Binning in Multi-label Classification

In multi-label classification, where a single example may be associated with several class labels at the same time, the ability to model dependencies between labels is considered crucial to effectively optimize non-decomposable evaluation…

Machine Learning · Computer Science 2021-06-23 Michael Rapp, Eneldo Loza Mencía, Johannes Fürnkranz, Eyke Hüllermeier

Dependable Exploitation of High-Dimensional Unlabeled Data in an Assumption-Lean Framework

Semi-supervised learning has attracted significant attention due to the proliferation of applications featuring limited labeled data but abundant unlabeled data. In this paper, we examine the statistical inference problem in an…

Methodology · Statistics 2026-03-31 Chao Ying, Siyi Deng, Yang Ning, Jiwei Zhao, Heping Zhang

Discriminative Learning via Semidefinite Probabilistic Models

Discriminative linear models are a popular tool in machine learning. These can be generally divided into two types: The first is linear classifiers, such as support vector machines, which are well studied and provide state-of-the-art…

Machine Learning · Computer Science 2012-07-02 Koby Crammer, Amir Globerson

Active Learning Via Sequential Design and Uncertainty Sampling

Classification is an important task in many fields including biomedical research and machine learning. Traditionally, a classification rule is constructed based on a bunch of labeled data. Recently, due to technological innovation and…

Methodology · Statistics 2014-06-19 Jing Wang, Eunsik Park, Yuan-chin Ivan Chang