Related papers: Classification using Ensemble Learning under Weigh…
In biomedical and public health association studies, binary outcome variables may be subject to misclassification, resulting in substantial bias in effect estimates. The feasibility of addressing binary outcome misclassification in…
In observational studies, propensity scores are commonly estimated by maxi- mum likelihood but may fail to balance high-dimensional pre-treatment covariates even after specification search. We introduce a general framework that unifies and…
Binary classification involves predicting the label of an instance based on whether the model score for the positive class exceeds a threshold chosen based on the application requirements (e.g., maximizing recall for a precision bound).…
Misclassification of binary responses, if ignored, may severely bias the maximum likelihood estimators (MLE) of regression parameters. For such data, a binary regression model incorporating misclassification probabilities is extensively…
Set classification aims to classify a set of observations as a whole, as opposed to classifying individual observations separately. To formally understand the unfamiliar concept of binary set classification, we first investigate the optimal…
Learning binary classifiers only from positive and unlabeled (PU) data is an important and challenging task in many real-world applications, including web text classification, disease gene identification and fraud detection, where negative…
Complementary-label learning (CLL) is widely used in weakly supervised classification, but it faces a significant challenge in real-world datasets when confronted with class-imbalanced training samples. In such scenarios, the number of…
This paper deals with the binary classification task when the target class has the lower probability of occurrence. In such situation, it is not possible to build a powerful classifier by using standard methods such as logistic regression,…
We address the problem of aggregating an ensemble of predictors with known loss bounds in a semi-supervised binary classification setting, to minimize prediction loss incurred on the unlabeled data. We find the minimax optimal predictions…
Existing weighting methods for treatment effect estimation are often built upon the idea of propensity scores or covariate balance. They usually impose strong assumptions on treatment assignment or outcome model to obtain unbiased…
Multi-class classification is mandatory for real world problems and one of promising techniques for multi-class classification is Error Correcting Output Code. We propose a method for constructing the Error Correcting Output Code to obtain…
This paper considers binary and multilabel classification problems in a setting where labels are missing independently and with a known rate. Missing labels are a ubiquitous phenomenon in extreme multi-label classification (XMC) tasks, such…
Binary codes have been widely used in vision problems as a compact feature representation to achieve both space and time advantages. Various methods have been proposed to learn data-dependent hash functions which map a feature vector to a…
Joint misclassification of exposure and outcome variables can lead to considerable bias in epidemiological studies of causal exposure-outcome effects. In this paper, we present a new maximum likelihood based estimator for the marginal…
Mediation analyses allow researchers to quantify the effect of an exposure variable on an outcome variable through a mediator variable. If a binary mediator variable is misclassified, the resulting analysis can be severely biased.…
The large-scale multiple testing inherent to high throughput biological data necessitates very high statistical stringency and thus true effects in data are difficult to detect unless they have high effect sizes. One promising approach for…
In observational causal inference, in order to emulate a randomized experiment, weights are used to render treatments independent of observed covariates. This property is known as balance; in its absence, estimated causal effects may be…
Many binary classification problems minimize misclassification above (or below) a threshold. We show that instances of ranking problems, accuracy at the top or hypothesis testing may be written in this form. We propose a general framework…
Covariate adjustment is a general method for improving precision when estimating treatment effects in randomized trials and is recommended by the FDA in its 2023 guidance when baseline variables are prognostic for the primary outcome. We…
In supervised learning, we often face with ambiguous (A) samples that are difficult to label even by domain experts. In this paper, we consider a binary classification problem in the presence of such A samples. This problem is substantially…