Related papers: Optimal Binary Classification Beyond Accuracy

Nearest Neighbor Classification based on Imbalanced Data: A Statistical Approach

When the competing classes in a classification problem are not of comparable size, many popular classifiers exhibit a bias towards larger classes, and the nearest neighbor classifier is no exception. To take care of this problem, we develop…

Methodology · Statistics 2023-11-02 Anvit Garg, Anil K. Ghosh, Soham Sarkar

The Interplay between Distribution Parameters and the Accuracy-Robustness Tradeoff in Classification

Adversarial training tends to result in models that are less accurate on natural (unperturbed) examples compared to standard models. This can be attributed to either an algorithmic shortcoming or a fundamental property of the training data…

Machine Learning · Computer Science 2021-07-02 Alireza Mousavi Hosseini, Amir Mohammad Abouei, Mohammad Hossein Rohban

Leveraging Uncertainty Estimates To Improve Classifier Performance

Binary classification involves predicting the label of an instance based on whether the model score for the positive class exceeds a threshold chosen based on the application requirements (e.g., maximizing recall for a precision bound).…

Machine Learning · Computer Science 2023-11-21 Gundeep Arora, Srujana Merugu, Anoop Saladi, Rajeev Rastogi

Beyond Rebalancing: Benchmarking Binary Classifiers Under Class Imbalance Without Rebalancing Techniques

Class imbalance poses a significant challenge to supervised classification, particularly in critical domains like medical diagnostics and anomaly detection where minority class instances are rare. While numerous studies have explored…

Machine Learning · Computer Science 2025-09-10 Ali Nawaz, Amir Ahmad, Shehroz S. Khan

A Bayesian Approach for Accurate Classification-Based Aggregates

In this paper, we study the accuracy of values aggregated over classes predicted by a classification algorithm. The problem is that the resulting aggregates (e.g., sums of a variable) are known to be biased. The bias can be large even for…

Machine Learning · Statistics 2019-12-02 Q. A. Meertens, C. G. H. Diks, H. J. van den Herik, F. W. Takes

Linear and Order Statistics Combiners for Pattern Classification

Several researchers have experimentally shown that substantial improvements can be obtained in difficult pattern recognition problems by combining or integrating the outputs of multiple classifiers. This chapter provides an analytical…

Neural and Evolutionary Computing · Computer Science 2007-05-23 Kagan Tumer, Joydeep Ghosh

PAC-Bayes Analysis for Recalibration in Classification

Nonparametric estimation using uniform-width binning is a standard approach for evaluating the calibration performance of machine learning models. However, existing theoretical analyses of the bias induced by binning are limited to binary…

Machine Learning · Computer Science 2025-07-14 Masahiro Fujisawa, Futoshi Futami

Striking the Right Balance with Uncertainty

Learning unbiased models on imbalanced datasets is a significant challenge. Rare classes tend to get a concentrated representation in the classification space which hampers the generalization of learned boundaries to new test examples. In…

Computer Vision and Pattern Recognition · Computer Science 2019-04-11 Salman Khan, Munawar Hayat, Waqas Zamir, Jianbing Shen, Ling Shao

Binary Classifier Calibration: Bayesian Non-Parametric Approach

A set of probabilistic predictions is well calibrated if the events that are predicted to occur with probability p do in fact occur about p fraction of the time. Well calibrated predictions are particularly important when machine learning…

Machine Learning · Statistics 2014-01-14 Mahdi Pakdaman Naeini, Gregory F. Cooper, Milos Hauskrecht

Efficient Set-Valued Prediction in Multi-Class Classification

In cases of uncertainty, a multi-class classifier preferably returns a set of candidate classes instead of predicting a single class label with little guarantee. More precisely, the classifier should strive for an optimal balance between…

Machine Learning · Computer Science 2020-05-28 Thomas Mortier, Marek Wydmuch, Krzysztof Dembczyński, Eyke Hüllermeier, Willem Waegeman

Provable tradeoffs in adversarially robust classification

It is well known that machine learning methods can be vulnerable to adversarially-chosen perturbations of their inputs. Despite significant progress in the area, foundational open problems remain. In this paper, we address several key…

Machine Learning · Computer Science 2024-10-30 Edgar Dobriban, Hamed Hassani, David Hong, Alexander Robey

Learning Confidence Bounds for Classification with Imbalanced Data

Class imbalance poses a significant challenge in classification tasks, where traditional approaches often lead to biased models and unreliable predictions. Undersampling and oversampling techniques have been commonly employed to address…

Machine Learning · Computer Science 2025-10-22 Matt Clifford, Jonathan Erskine, Alexander Hepburn, Raúl Santos-Rodríguez, Dario Garcia-Garcia

Practical estimation of the optimal classification error with soft labels and calibration

While the performance of machine learning systems has experienced significant improvement in recent years, relatively little attention has been paid to the fundamental question: to what extent can we improve our models? This paper provides…

Machine Learning · Computer Science 2026-05-13 Ryota Ushio, Takashi Ishida, Masashi Sugiyama

Data organization limits the predictability of binary classification

The structure of data organization is widely recognized as having a substantial influence on the efficacy of machine learning algorithms, particularly in binary classification tasks. Our research provides a theoretical framework suggesting…

Machine Learning · Computer Science 2024-07-15 Fei Jing, Zi-Ke Zhang, Yi-Cheng Zhang, Qingpeng Zhang

Fair Decisions from Calibrated Scores: Achieving Optimal Classification While Satisfying Sufficiency

Binary classification based on predicted probabilities (scores) is a fundamental task in supervised machine learning. While thresholding scores is Bayes-optimal in the unconstrained setting, using a single threshold generally violates…

Machine Learning · Computer Science 2026-02-10 Etam Benger, Katrina Ligett

Balancing the Scales: A Comprehensive Study on Tackling Class Imbalance in Binary Classification

Class imbalance in binary classification tasks remains a significant challenge in machine learning, often resulting in poor performance on minority classes. This study comprehensively evaluates three widely-used strategies for handling…

Machine Learning · Computer Science 2024-10-01 Mohamed Abdelhamid, Abhyuday Desai

Ask for More Than Bayes Optimal: A Theory of Indecisions for Classification

Selective classification is a powerful tool for automated decision-making in high-risk scenarios, allowing classifiers to act only when confident and abstain when uncertainty is high. Given a target accuracy, our goal is to minimize…

Statistics Theory · Mathematics 2025-10-28 Mohamed Ndaoud, Peter Radchenko, Bradley Rava

A method for classification of data with uncertainty using hypothesis testing

Binary classification is a task that involves the classification of data into one of two distinct classes. It is widely utilized in various fields. However, conventional classifiers tend to make overconfident predictions for data that…

Machine Learning · Computer Science 2025-03-13 Shoma Yokura, Akihisa Ichiki

Accuracy vs. Accuracy: Computational Tradeoffs Between Classification Rates and Utility

We revisit the foundations of fairness and its interplay with utility and efficiency in settings where the training data contain richer labels, such as individual types, rankings, or risk estimates, rather than just binary outcomes. In this…

Machine Learning · Computer Science 2025-05-23 Noga Amit, Omer Reingold, Guy N. Rothblum

Binary Classification: Counterbalancing Class Imbalance by Applying Regression Models in Combination with One-Sided Label Shifts

In many real-world pattern recognition scenarios, such as in medical applications, the corresponding classification tasks can be of an imbalanced nature. In the current study, we focus on binary, imbalanced classification tasks, i.e. binary…

Machine Learning · Computer Science 2020-12-01 Peter Bellmann, Heinke Hihn, Daniel A. Braun, Friedhelm Schwenker