Related papers: Semi-Parametric Uncertainty Bounds for Binary Clas…

Exact Distribution-Free Hypothesis Tests for the Regression Function of Binary Classification via Conditional Kernel Mean Embeddings

In this paper we suggest two statistical hypothesis tests for the regression function of binary classification based on conditional kernel mean embeddings. The regression function is a fundamental object in classification as it determines…

Machine Learning · Statistics 2022-06-22 Ambrus Tamás, Balázs Csanád Csáji

Resampled Confidence Regions with Exponential Shrinkage for the Regression Function of Binary Classification

The regression function is one of the key objects of binary classification, since it not only determines a Bayes optimal classifier, hence, defines an optimal decision boundary, but also encodes the conditional distribution of the output…

Machine Learning · Statistics 2025-06-03 Ambrus Tamás, Balázs Csanád Csáji

Inference problems in binary regression model with misclassified responses

Misclassification of binary responses, if ignored, may severely bias the maximum likelihood estimators (MLE) of regression parameters. For such data, a binary regression model incorporating misclassification probabilities is extensively…

Statistics Theory · Mathematics 2020-09-28 Arindam Chatterjee, Tathagata Bandyopadhyay, Sumanta Adhya

Adjusting Regression Models for Conditional Uncertainty Calibration

Conformal Prediction methods have finite-sample distribution-free marginal coverage guarantees. However, they generally do not offer conditional coverage guarantees, which can be important for high-stakes decisions. In this paper, we…

Machine Learning · Statistics 2024-09-27 Ruijiang Gao, Mingzhang Yin, James McInerney, Nathan Kallus

Self-Training: A Survey

Semi-supervised algorithms aim to learn prediction functions from a small set of labeled observations and a large set of unlabeled observations. Because this framework is relevant in many applications, they have received a lot of interest…

Machine Learning · Computer Science 2025-02-17 Massih-Reza Amini, Vasilii Feofanov, Loic Pauletto, Lies Hadjadj, Emilie Devijver, Yury Maximov

Selection consistency of Lasso-based procedures for misspecified high-dimensional binary model and random regressors

We consider selection of random predictors for a high-dimensional regression problem with binary response for a general loss function. An important special case is when the binary model is semiparametric and the response function is misspecified…

Statistics Theory · Mathematics 2020-02-19 Mariusz Kubkowski, Jan Mielniczuk

Nonparametric semi-supervised learning of class proportions

The problem of developing binary classifiers from positive and unlabeled data is often encountered in machine learning. A common requirement in this setting is to approximate posterior probabilities of positive and negative classes for a…

Machine Learning · Statistics 2016-01-11 Shantanu Jain, Martha White, Michael W. Trosset, Predrag Radivojac

Optimal and Safe Estimation for High-Dimensional Semi-Supervised Learning

We consider the estimation problem in high-dimensional semi-supervised learning. Our goal is to investigate when and how the unlabeled data can be exploited to improve the estimation of the regression parameters of a linear model in light of…

Methodology · Statistics 2023-03-21 Siyi Deng, Yang Ning, Jiwei Zhao, Heping Zhang

Binary Classifier Calibration: Bayesian Non-Parametric Approach

A set of probabilistic predictions is well calibrated if the events that are predicted to occur with probability p do in fact occur about p fraction of the time. Well calibrated predictions are particularly important when machine learning…

Machine Learning · Statistics 2014-01-14 Mahdi Pakdaman Naeini, Gregory F. Cooper, Milos Hauskrecht

Leveraging Uncertainty Estimates To Improve Classifier Performance

Binary classification involves predicting the label of an instance based on whether the model score for the positive class exceeds a threshold chosen based on the application requirements (e.g., maximizing recall for a precision bound).…

Machine Learning · Computer Science 2023-11-21 Gundeep Arora, Srujana Merugu, Anoop Saladi, Rajeev Rastogi

From Uncertainty to Precision: Enhancing Binary Classifier Performance through Calibration

The assessment of binary classifier performance traditionally centers on discriminative ability using metrics, such as accuracy. However, these metrics often disregard the model's inherent uncertainty, especially when dealing with sensitive…

Machine Learning · Computer Science 2024-02-13 Agathe Fernandes Machado, Arthur Charpentier, Emmanuel Flachaire, Ewen Gallic, François Hu

Binary Regression and Classification with Covariates in Metric Spaces

Inspired by logistic regression, we introduce a regression model for data tuples consisting of a binary response and a set of covariates residing in a metric space without vector structures. Based on the proposed model we also develop a…

Methodology · Statistics 2024-02-15 Yinan Lin, Zhenhua Lin

Optimal Binary Classification Beyond Accuracy

The vast majority of statistical theory on binary classification characterizes performance in terms of accuracy. However, accuracy is known in many cases to poorly reflect the practical consequences of classification error, most famously in…

Statistics Theory · Mathematics 2022-09-27 Shashank Singh, Justin Khim

Binary Classification in Unstructured Space With Hypergraph Case-Based Reasoning

Binary classification is one of the most common problems in machine learning. It consists in predicting whether a given element belongs to a particular class. In this paper, a new algorithm for binary classification is proposed using a…

Machine Learning · Computer Science 2019-03-12 Alexandre Quemy

Semiparametric Penalized Spline Regression

In this paper, we propose a new semiparametric regression estimator by using a hybrid technique of a parametric approach and a nonparametric penalized spline method. The overall shape of the true regression function is captured by the…

Statistics Theory · Mathematics 2012-02-17 Takuma Yoshida, Kanta Naito

Binary Classification: Counterbalancing Class Imbalance by Applying Regression Models in Combination with One-Sided Label Shifts

In many real-world pattern recognition scenarios, such as in medical applications, the corresponding classification tasks can be of an imbalanced nature. In the current study, we focus on binary, imbalanced classification tasks, i.e., binary…

Machine Learning · Computer Science 2020-12-01 Peter Bellmann, Heinke Hihn, Daniel A. Braun, Friedhelm Schwenker

Is distribution-free inference possible for binary regression?

For a regression problem with a binary label response, we examine the problem of constructing confidence intervals for the label probability conditional on the features. In a setting where we do not have any information about the underlying…

Statistics Theory · Mathematics 2020-10-09 Rina Foygel Barber

On regression and classification with possibly missing response variables in the data

This paper considers the problem of kernel regression and classification with possibly unobservable response variables in the data, where the mechanism that causes the absence of information is unknown and can depend on both predictors and…

Statistics Theory · Mathematics 2022-12-07 Majid Mojirsheibani, William Pouliot, Andre Shakhbandaryan

Conditional Transformation Models

The ultimate goal of regression analysis is to obtain information about the conditional distribution of a response given a set of explanatory variables. This goal is, however, seldom achieved because most established regression models only…

Methodology · Statistics 2017-12-13 Torsten Hothorn, Thomas Kneib, Peter Bühlmann

Bayesian bandwidth estimation and semi-metric selection for a functional partial linear model with unknown error density

This study examines the optimal selections of bandwidth and semi-metric for a functional partial linear model. Our proposed method begins by estimating the unknown error density using a kernel density estimator of residuals, where the…

Methodology · Statistics 2020-11-17 Han Lin Shang