Related papers: Optimally Combining Classifiers Using Unlabeled Da…

Optimal Binary Classifier Aggregation for General Losses

We address the problem of aggregating an ensemble of predictors with known loss bounds in a semi-supervised binary classification setting, to minimize prediction loss incurred on the unlabeled data. We find the minimax optimal predictions…

Machine Learning · Computer Science 2016-11-08 Akshay Balsubramani, Yoav Freund

Unsupervised Evaluation and Weighted Aggregation of Ranked Predictions

Learning algorithms that aggregate predictions from an ensemble of diverse base classifiers consistently outperform individual methods. Many of these strategies have been developed in a supervised setting, where the accuracy of each base…

Machine Learning · Statistics 2018-02-14 Mehmet Eren Ahsen, Robert Vogel, Gustavo Stolovitzky

Scalable Semi-Supervised Aggregation of Classifiers

We present and empirically evaluate an efficient algorithm that learns to aggregate the predictions of an ensemble of binary classifiers. The algorithm uses the structure of the ensemble predictions on unlabeled data to yield significant…

Machine Learning · Computer Science 2015-11-12 Akshay Balsubramani, Yoav Freund

Clustering Unclustered Data: Unsupervised Binary Labeling of Two Datasets Having Different Class Balances

We consider the unsupervised learning problem of assigning labels to unlabeled data. A naive approach is to use clustering methods, but this works well only when data is properly clustered and each cluster corresponds to an underlying…

Machine Learning · Computer Science 2013-05-02 Marthinus Christoffel du Plessis, Masashi Sugiyama

Estimating the Accuracies of Multiple Classifiers Without Labeled Data

In various situations one is given only the predictions of multiple classifiers over a large unlabeled test set. This scenario raises the following questions: Without any labeled data and without any a-priori knowledge about the…

Machine Learning · Statistics 2014-10-31 Ariel Jaffe, Boaz Nadler, Yuval Kluger

Estimating Accuracy from Unlabeled Data: A Probabilistic Logic Approach

We propose an efficient method to estimate the accuracy of classifiers using only unlabeled data. We consider a setting with multiple classification problems where the target classes may be tied together through logical constraints. For…

Machine Learning · Computer Science 2017-05-22 Emmanouil A. Platanios, Hoifung Poon, Tom M. Mitchell, Eric Horvitz

Applying an Ensemble Learning Method for Improving Multi-label Classification Performance

In recent years, the multi-label classification problem has become a controversial issue. In this kind of classification, each sample is associated with a set of class labels. Ensemble approaches are supervised learning algorithms in which an…

Machine Learning · Computer Science 2018-01-09 Amirreza Mahdavi-Shahri, Mahboobeh Houshmand, Mahdi Yaghoobi, Mehrdad Jalali

A game-theoretic framework for classifier ensembles using weighted majority voting with local accuracy estimates

In this paper, a novel approach for the optimal combination of binary classifiers is proposed. The classifier combination problem is approached from a Game Theory perspective. The proposed framework of adapted weighted majority rules (WMR)…

Machine Learning · Computer Science 2013-02-05 Harris V. Georgiou, Michael E. Mavroforakis

A probabilistic methodology for multilabel classification

Multilabel classification is a relatively recent subfield of machine learning. Unlike the classical approach, where instances are labeled with only one category, in multilabel classification, an arbitrary number of categories is chosen…

Artificial Intelligence · Computer Science 2013-03-01 Alfonso E. Romero, Luis M. de Campos

Minimax Lower Bounds for Realizable Transductive Classification

Transductive learning considers a training set of $m$ labeled samples and a test set of $u$ unlabeled samples, with the goal of best labeling that particular test set. Conversely, inductive learning considers a training set of $m$ labeled…

Machine Learning · Statistics 2016-02-10 Ilya Tolstikhin, David Lopez-Paz

The Multiplex Classification Framework: optimizing multi-label classifiers through problem transformation, ontology engineering, and model ensembling

Classification is a fundamental task in machine learning. While conventional methods (such as binary, multiclass, and multi-label classification) are effective for simpler problems, they may not adequately address the complexities of some…

Machine Learning · Computer Science 2024-12-20 Mauro Nievas Offidani, Facundo Roffet, Claudio Augusto Delrieux, Maria Carolina Gonzalez Galtier, Marcos Zarate

Multi-class Classification without Multi-class Labels

This work presents a new strategy for multi-class classification that requires no class-specific labels, but instead leverages pairwise similarity between examples, which is a weaker form of annotation. The proposed method, meta…

Machine Learning · Computer Science 2019-01-04 Yen-Chang Hsu, Zhaoyang Lv, Joel Schlosser, Phillip Odom, Zsolt Kira

Learning to Abstain from Binary Prediction

A binary classifier capable of abstaining from making a label prediction has two goals in tension: minimizing errors, and avoiding abstaining unnecessarily often. In this work, we exactly characterize the best achievable tradeoff between…

Machine Learning · Computer Science 2016-11-30 Akshay Balsubramani

Multilabel Consensus Classification

In the era of big data, a large amount of noisy and incomplete data can be collected from multiple sources for prediction tasks. Combining multiple models or data sources helps to counteract the effects of low data quality and the bias of…

Machine Learning · Statistics 2013-10-17 Sihong Xie, Xiangnan Kong, Jing Gao, Wei Fan, Philip S. Yu

Unsupervised Label Refinement Improves Dataless Text Classification

Dataless text classification is capable of classifying documents into previously unseen labels by assigning a score to any document paired with a label description. While promising, it crucially relies on accurate descriptions of the label…

Computation and Language · Computer Science 2020-12-09 Zewei Chu, Karl Stratos, Kevin Gimpel

An Optimization Framework for Semi-Supervised and Transfer Learning using Multiple Classifiers and Clusterers

Unsupervised models can provide supplementary soft constraints to help classify new, "target" data since similar instances in the target set are more likely to share the same class label. Such models can also help detect possible…

Machine Learning · Computer Science 2012-06-06 Ayan Acharya, Eduardo R. Hruschka, Joydeep Ghosh, Sreangsu Acharyya

Something for (almost) nothing: Improving deep ensemble calibration using unlabeled data

We present a method to improve the calibration of deep ensembles in the small training data regime in the presence of unlabeled data. Our approach is extremely simple to implement: given an unlabeled set, for each unlabeled data point, we…

Machine Learning · Computer Science 2023-10-05 Konstantinos Pitas, Julyan Arbel

Optimal Compression for Minimizing Classification Error Probability: an Information-Theoretic Approach

We formulate the problem of performing optimal data compression under the constraints that compressed data can be used for accurate classification in machine learning. We show that this translates to a problem of minimizing the mutual…

Signal Processing · Electrical Eng. & Systems 2022-11-04 Jingchao Gao, Ao Tang, Weiyu Xu

Undersampling is a Minimax Optimal Robustness Intervention in Nonparametric Classification

While a broad range of techniques have been proposed to tackle distribution shift, the simple baseline of training on an $\textit{undersampled}$ balanced dataset often achieves close to state-of-the-art accuracy across several popular…

Machine Learning · Computer Science 2023-06-21 Niladri S. Chatterji, Saminul Haque, Tatsunori Hashimoto

Unsupervised Ranking and Aggregation of Label Descriptions for Zero-Shot Classifiers

Zero-shot text classifiers based on label descriptions embed an input text and a set of labels into the same space: measures such as cosine similarity can then be used to select the most similar label description to the input text as the…

Computation and Language · Computer Science 2022-05-25 Angelo Basile, Marc Franco-Salvador, Paolo Rosso