Related papers: $F_\beta$-plot -- a visual tool for evaluating imb…
Many real-world classification problems are significantly class-imbalanced to detriment of the class of interest. The standard set of proper evaluation metrics is well-known but the usual assumption is that the test dataset imbalance equals…
With a plethora of available classification performance measures, choosing the right metric for the right task requires careful thought. To make this decision in an informed manner, one should study and compare general properties of…
For research to go in the right direction, it is essential to be able to compare and quantify performance of different algorithms focused on the same problem. Choosing a suitable evaluation metric requires deep understanding of the pursued…
Class distribution skews in imbalanced datasets may lead to models with prediction bias towards majority classes, making fair assessment of classifiers a challenging task. Metrics such as Balanced Accuracy are commonly used to evaluate a…
Imbalanced classification tasks are widespread in many real-world applications. For such classification tasks, in comparison with the accuracy rate, it is usually much more appropriate to use non-decomposable performance measures such as…
The $F_\beta$ score is a commonly used measure of classification performance, which plays crucial roles in classification tasks with imbalanced data sets. However, the $F_\beta$ score cannot be used as a loss function by gradient-based…
Class-level evaluation can conceal substantial performance disparities across subconcepts within the same class, causing models that perform well on average to fail on specific subpopulations. Prior work has shown that common evaluation…
Learning from imbalanced data is a challenging task. Standard classification algorithms tend to perform poorly when trained on imbalanced data. Some special strategies need to be adopted, either by modifying the data distribution or by…
Logistic regression is a widely used method in several fields. When applying logistic regression to imbalanced data, for which majority classes dominate over minority classes, all class labels are estimated as `majority class.' In this…
Imbalanced problems can arise in different real-world situations, and to address this, certain strategies in the form of resampling or balancing algorithms are proposed. This issue has largely been studied in the context of classification,…
The problem of class imbalance is extensive for focusing on numerous applications in the real world. In such a situation, nearly all of the examples are labeled as one class called majority class, while far fewer examples are labeled as the…
The multi-class prediction had gained popularity over recent years. Thus measuring fit goodness becomes a cardinal question that researchers often have to deal with. Several metrics are commonly used for this task. However, when one has to…
Several performance measures can be used for evaluating classification results: accuracy, F-measure, and many others. Can we say that some of them are better than others, or, ideally, choose one measure that is best in all situations? To…
Classification data sets with skewed class proportions are called imbalanced. Class imbalance is a problem since most machine learning classification algorithms are built with an assumption of equal representation of all classes in the…
The class-imbalance issue is intrinsic to many real-world machine learning tasks, particularly to the rare-event classification problems. Although the impact and treatment of imbalanced data is widely known, the magnitude of a metric's…
Learning classifiers using skewed or imbalanced datasets can occasionally lead to classification issues; this is a serious issue. In some cases, one class contains the majority of examples while the other, which is frequently the more…
Many real world data mining applications involve obtaining predictive models using data sets with strongly imbalanced distributions of the target variable. Frequently, the least common values of this target variable are associated with…
Learning from imbalanced data is one of the most significant challenges in real-world classification tasks. In such cases, neural networks performance is substantially impaired due to preference towards the majority class. Existing…
In most machine learning applications, classification accuracy is not the primary metric of interest. Binary classifiers which face class imbalance are often evaluated by the $F_\beta$ score, area under the precision-recall curve, Precision…
The separation of performance metrics from gradient based loss functions may not always give optimal results and may miss vital aggregate information. This paper investigates incorporating a performance metric alongside differentiable loss…