Related papers: Multi-category Angle-based Classifier Refit
In many real applications of statistical learning, a decision made from misclassification can be too costly to afford; in this case, a reject option, which defers the decision until further investigation is conducted, is often preferred. In…
Multilabel classification is a relatively recent subfield of machine learning. Unlike to the classical approach, where instances are labeled with only one category, in multilabel classification, an arbitrary number of categories is chosen…
In this work, we study a new approach to optimizing the margin distribution realized by binary classifiers. The classical approach to this problem is simply maximization of the expected margin, while more recent proposals consider…
Many classification applications require accurate probability estimates in addition to good class separation but often classifiers are designed focusing only on the latter. Calibration is the process of improving probability estimates by…
Binary classification is a common statistical learning problem in which a model is estimated on a set of covariates for some outcome indicating the membership of one of two classes. In the literature, there exists a distinction between hard…
Advanced classification algorithms are being increasingly used in safety-critical applications like health-care, engineering, etc. In such applications, miss-classifications made by ML algorithms can result in substantial financial or…
This paper provides both an introduction to and a detailed overview of the principles and practice of classifier calibration. A well-calibrated classifier correctly quantifies the level of uncertainty or confidence associated with its…
Binary classification is a task that involves the classification of data into one of two distinct classes. It is widely utilized in various fields. However, conventional classifiers tend to make overconfident predictions for data that…
Accurate and reliable probability predictions are essential for multi-class supervised learning tasks, where well-calibrated models enable rational decision-making. While isotonic regression has proven effective for binary calibration, its…
Probabilistic models must be well calibrated to support reliable decision-making. While calibration in single-output regression is well studied, defining and achieving multivariate calibration in multi-output regression remains considerably…
Classification is a fundamental task in machine learning. While conventional methods-such as binary, multiclass, and multi-label classification-are effective for simpler problems, they may not adequately address the complexities of some…
We study the problem of supervised learning for both binary and multiclass classification from a unified geometric perspective. In particular, we propose a geometric regularization technique to find the submanifold corresponding to a robust…
Minimizing inconsistencies across successive versions of an AI system is as crucial as reducing the overall error. In image classification, such inconsistencies manifest as negative flips, where an updated model misclassifies test samples…
Latent class models are widely used for identifying unobserved subgroups from multivariate categorical data in social sciences, with binary data as a particularly popular example. However, accurately recovering individual latent class…
We investigate the problem of multiclass classification with rejection, where a classifier can choose not to make a prediction to avoid critical misclassification. First, we consider an approach based on simultaneous training of a…
The number of possible methods of generalizing binary classification to multi-class classification increases exponentially with the number of class labels. Often, the best method of doing so will be highly problem dependent. Here we present…
We develop a new classification framework based on the theory of coherent risk measures and systemic risk. The proposed approach is suitable for multi-class problems when the data is noisy, scarce (relative to the dimension of the problem),…
In most machine learning applications, classification accuracy is not the primary metric of interest. Binary classifiers which face class imbalance are often evaluated by the $F_\beta$ score, area under the precision-recall curve, Precision…
In classification problems, sampling bias between training data and testing data is critical to the ranking performance of classification scores. Such bias can be both unintentionally introduced by data collection and intentionally…
Excessive reuse of holdout data can lead to overfitting. However, there is little concrete evidence of significant overfitting due to holdout reuse in popular multiclass benchmarks today. Known results show that, in the worst-case,…