Related papers: Assessing binary classifiers using only positive a…

Recovering True Classifier Performance in Positive-Unlabeled Learning

A common approach in positive-unlabeled learning is to train a classification model between labeled and unlabeled data. This strategy is in fact known to give an optimal classifier under mild conditions; however, it results in biased…

Machine Learning · Statistics 2017-02-03 Shantanu Jain, Martha White, Predrag Radivojac

Using theoretical ROC curves for analysing machine learning binary classifiers

Most binary classifiers work by processing the input to produce a scalar response and comparing it to a threshold value. The various measures of classifier performance assume, explicitly or implicitly, probability distributions $P_s$ and…

Machine Learning · Computer Science 2019-09-24 Luma Omar, Ioannis Ivrissimtzis

Learning from Positive and Unlabeled Data with Augmented Classes

Positive Unlabeled (PU) learning aims to learn a binary classifier from only positive and unlabeled data, which is utilized in many real-world scenarios. However, existing PU learning algorithms cannot deal with the real-world challenge in…

Machine Learning · Computer Science 2022-07-28 Zhongnian Li, Liutao Yang, Zhongchen Ma, Tongfeng Sun, Xinzheng Xu, Daoqiang Zhang

Leveraging Uncertainty Estimates To Improve Classifier Performance

Binary classification involves predicting the label of an instance based on whether the model score for the positive class exceeds a threshold chosen based on the application requirements (e.g., maximizing recall for a precision bound).…

Machine Learning · Computer Science 2023-11-21 Gundeep Arora, Srujana Merugu, Anoop Saladi, Rajeev Rastogi

Improving State-of-the-Art in One-Class Classification by Leveraging Unlabeled Data

When dealing with binary classification of data with only one labeled class, data scientists employ two main approaches, namely One-Class (OC) classification and Positive Unlabeled (PU) learning. The former only learns from labeled positive…

Machine Learning · Computer Science 2022-03-15 Farid Bagirov, Dmitry Ivanov, Aleksei Shpilman

Learning from Positive and Unlabeled Data with Arbitrary Positive Shift

Positive-unlabeled (PU) learning trains a binary classifier using only positive and unlabeled data. A common simplifying assumption is that the positive data is representative of the target positive class. This assumption rarely holds in…

Machine Learning · Computer Science 2020-11-10 Zayd Hammoudeh, Daniel Lowd

Uncertainty-aware Pseudo-label Selection for Positive-Unlabeled Learning

Positive-unlabeled learning (PUL) aims at learning a binary classifier from only positive and unlabeled training data. Even though real-world applications often involve imbalanced datasets where the majority of examples belong to one class,…

Machine Learning · Statistics 2024-03-12 Emilio Dorigatti, Jann Goschenhofer, Benjamin Schubert, Mina Rezaei, Bernd Bischl

Learning Classifiers on Positive and Unlabeled Data with Policy Gradient

Existing algorithms aiming to learn a binary classifier from positive (P) and unlabeled (U) data generally require estimating the class prior or label noises ahead of building a classification model. However, the estimation and classifier…

Machine Learning · Computer Science 2020-09-01 Tianyu Li, Chien-Chih Wang, Yukun Ma, Patricia Ortal, Qifang Zhao, Bjorn Stenger, Yu Hirate

Soft Label PU Learning

PU learning refers to the classification problem in which only part of positive samples are labeled. Existing PU learning methods treat unlabeled samples equally. However, in many real tasks, from common sense or domain knowledge, some…

Machine Learning · Computer Science 2024-05-06 Puning Zhao, Jintao Deng, Xu Cheng

Joint empirical risk minimization for instance-dependent positive-unlabeled data

Learning from positive and unlabeled data (PU learning) is an actively researched machine learning task. The goal is to train a binary classification model based on a training dataset containing part of positives which are labeled, and…

Machine Learning · Statistics 2023-12-29 Wojciech Rejchel, Paweł Teisseyre, Jan Mielniczuk

Estimating Accuracy from Unlabeled Data: A Probabilistic Logic Approach

We propose an efficient method to estimate the accuracy of classifiers using only unlabeled data. We consider a setting with multiple classification problems where the target classes may be tied together through logical constraints. For…

Machine Learning · Computer Science 2017-05-22 Emmanouil A. Platanios, Hoifung Poon, Tom M. Mitchell, Eric Horvitz

A Robust Ensemble Approach to Learn From Positive and Unlabeled Data Using SVM Base Models

We present a novel approach to learn binary classifiers when only positive and unlabeled instances are available (PU learning). This problem is routinely cast as a supervised task with label noise in the negative set. We use an ensemble of…

Machine Learning · Statistics 2015-02-13 Marc Claesen, Frank De Smet, Johan A. K. Suykens, Bart De Moor

Learning from Positive and Unlabeled Data under the Selected At Random Assumption

For many interesting tasks, such as medical diagnosis and web page classification, a learner only has access to some positively labeled examples and many unlabeled examples. Learning from this type of data requires making assumptions about…

Machine Learning · Computer Science 2018-08-28 Jessa Bekker, Jesse Davis

Clinical Uncertainty Impacts Machine Learning Evaluations

Clinical dataset labels are rarely certain as annotators disagree and confidence is not uniform across cases. Typical aggregation procedures, such as majority voting, obscure this variability. In simple experiments on medical imaging…

Artificial Intelligence · Computer Science 2025-11-12 Simone Lionetti, Fabian Gröger, Philippe Gottfrois, Alvaro Gonzalez-Jimenez, Ludovic Amruthalingam, Alexander A. Navarini, Marc Pouly

Beyond Myopia: Learning from Positive and Unlabeled Data through Holistic Predictive Trends

Learning binary classifiers from positive and unlabeled data (PUL) is vital in many real-world applications, especially when verifying negative examples is difficult. Despite the impressive empirical performance of recent PUL methods,…

Machine Learning · Computer Science 2024-10-14 Xinrui Wang, Wenhai Wan, Chuanxin Geng, Shaoyuan LI, Songcan Chen

Focused PU learning from imbalanced data

We propose a new method of learning from positive and unlabeled (PU) examples in highly imbalanced datasets. Many real-world problems, such as disease gene identification, targeted marketing, fraud detection, and recommender systems, are…

Machine Learning · Computer Science 2026-05-15 Elias Zavitsanos, Georgios Paliouras

Constrained Classification and Ranking via Quantiles

In most machine learning applications, classification accuracy is not the primary metric of interest. Binary classifiers which face class imbalance are often evaluated by the $F_\beta$ score, area under the precision-recall curve, Precision…

Machine Learning · Computer Science 2018-03-02 Alan Mackey, Xiyang Luo, Elad Eban

Improving Positive Unlabeled Learning: Practical AUL Estimation and New Training Method for Extremely Imbalanced Data Sets

Positive Unlabeled (PU) learning is widely used in many applications, where a binary classifier is trained on the datasets consisting of only positive and unlabeled samples. In this paper, we improve PU learning over state-of-the-art from…

Machine Learning · Computer Science 2020-04-22 Liwei Jiang, Dan Li, Qisheng Wang, Shuai Wang, Songtao Wang

How to Evaluate the Quality of Unsupervised Anomaly Detection Algorithms?

When sufficient labeled data are available, classical criteria based on Receiver Operating Characteristic (ROC) or Precision-Recall (PR) curves can be used to compare the performance of unsupervised anomaly detection algorithms. However,…

Machine Learning · Statistics 2016-07-06 Nicolas Goix

Learning Constraint Network from Demonstrations via Positive-Unlabeled Learning with Memory Replay

Planning for a wide range of real-world tasks requires knowing and writing down all constraints. However, instances exist where these constraints are either unknown or challenging to specify accurately. A possible solution is to infer the…

Machine Learning · Computer Science 2025-01-17 Baiyu Peng, Aude Billard