Related papers: A Structured Reasoning Framework for Unbalanced Da…

A Skew-Sensitive Evaluation Framework for Imbalanced Data Classification

Class distribution skews in imbalanced datasets may lead to models with prediction bias towards majority classes, making fair assessment of classifiers a challenging task. Metrics such as Balanced Accuracy are commonly used to evaluate a…

Machine Learning · Computer Science 2023-11-20 Min Du, Nesime Tatbul, Brian Rivers, Akhilesh Kumar Gupta, Lucas Hu, Wei Wang, Ryan Marcus, Shengtian Zhou, Insup Lee, Justin Gottschlich

Addressing Class Imbalance with Probabilistic Graphical Models and Variational Inference

This study proposes a method for imbalanced data classification based on deep probabilistic graphical models (DPGMs) to solve the problem that traditional methods have insufficient learning ability for minority class samples. To address the…

Machine Learning · Computer Science 2025-04-09 Yujia Lou, Jie Liu, Yuan Sheng, Jiawei Wang, Yiwei Zhang, Yaokun Ren

A comparison of Deep Learning performances with other machine learning algorithms on credit scoring unbalanced data

Training models on highly unbalanced data is acknowledged to be a challenging task for machine learning algorithms. Current studies on deep learning mainly focus on data sets with balanced class labels or unbalanced data, but with massive…

Machine Learning · Computer Science 2020-02-27 Louis Marceau, Lingling Qiu, Nick Vandewiele, Eric Charton

Skew-Probabilistic Neural Networks for Learning from Imbalanced Data

Real-world datasets often exhibit imbalanced data distribution, where certain class levels are severely underrepresented. In such cases, traditional pattern classifiers have shown a bias towards the majority class, impeding accurate…

Machine Learning · Statistics 2025-08-12 Shraddha M. Naik, Tanujit Chakraborty, Madhurima Panja, Abdenour Hadid, Bibhas Chakraborty

Box Drawings for Learning with Imbalanced Data

The vast majority of real world classification problems are imbalanced, meaning there are far fewer data from the class of interest (the positive class) than from other classes. We propose two machine learning algorithms to handle highly…

Machine Learning · Statistics 2014-06-10 Siong Thye Goh, Cynthia Rudin

An Adaptive Cost-Sensitive Learning and Recursive Denoising Framework for Imbalanced SVM Classification

Category imbalance is one of the most popular and important issues in the domain of classification. Emotion classification models trained on imbalanced datasets easily lead to unreliable predictions. The traditional machine learning method…

Computer Vision and Pattern Recognition · Computer Science 2025-01-27 Lu Jiang, Qi Wang, Yuhang Chang, Jianing Song, Haoyue Fu, Xiaochun Yang

Deep Reinforcement Learning for Imbalanced Classification

Data in real-world applications often exhibit a skewed class distribution, which poses an intense challenge for machine learning. Conventional classification algorithms are not effective in the case of imbalanced data distribution, and may fail…

Machine Learning · Computer Science 2019-01-08 Enlu Lin, Qiong Chen, Xiaoming Qi

Learning Confidence Bounds for Classification with Imbalanced Data

Class imbalance poses a significant challenge in classification tasks, where traditional approaches often lead to biased models and unreliable predictions. Undersampling and oversampling techniques have been commonly employed to address…

Machine Learning · Computer Science 2025-10-22 Matt Clifford, Jonathan Erskine, Alexander Hepburn, Raúl Santos-Rodríguez, Dario Garcia-Garcia

Structure Learning of Contextual Markov Networks using Marginal Pseudo-likelihood

Markov networks are popular models for discrete multivariate systems where the dependence structure of the variables is specified by an undirected graph. To allow for more expressive dependence structures, several generalizations of Markov…

Methodology · Statistics 2021-03-30 Johan Pensar, Henrik Nyman, Jukka Corander

CRCEN: A Generalized Cost-sensitive Neural Network Approach for Imbalanced Classification

Classification on imbalanced datasets is a challenging task in real-world applications. Training conventional classification algorithms directly by minimizing classification error in this scenario can compromise model performance for…

Machine Learning · Computer Science 2020-03-05 Xiangrui Li, Dongxiao Zhu

A Descriptive Study of Variable Discretization and Cost-Sensitive Logistic Regression on Imbalanced Credit Data

Training classification models on imbalanced data tends to result in bias towards the majority class. In this paper, we demonstrate how variable discretization and cost-sensitive logistic regression help mitigate this bias on an imbalanced…

Applications · Statistics 2019-07-29 Lili Zhang, Herman Ray, Jennifer Priestley, Soon Tan

Tackling Diverse Minorities in Imbalanced Classification

Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers. When working with large datasets, the imbalance issue can be further exacerbated, making it…

Machine Learning · Computer Science 2023-08-30 Kwei-Herng Lai, Daochen Zha, Huiyuan Chen, Mangesh Bendre, Yuzhong Chen, Mahashweta Das, Hao Yang, Xia Hu

Extrapolated Markov Chain Oversampling Method for Imbalanced Text Classification

Text classification is the task of automatically assigning text documents correct labels from a predefined set of categories. In real-life (text) classification tasks, observations and misclassification costs are often unevenly distributed…

Machine Learning · Computer Science 2025-09-03 Aleksi Avela, Pauliina Ilmonen

Empirical study of Machine Learning Classifier Evaluation Metrics behavior in Massively Imbalanced and Noisy data

With growing credit card transaction volumes, the fraud percentages are also rising, including overhead costs for institutions to combat and compensate victims. The use of machine learning in the financial sector permits more effective…

Machine Learning · Computer Science 2022-08-26 Gayan K. Kulatilleke, Sugandika Samarakoon

Discriminative Probabilistic Models for Relational Data

In many supervised learning tasks, the entities to be labeled are related to each other in complex ways and their labels are not independent. For example, in hypertext classification, the labels of linked pages are highly correlated. A…

Machine Learning · Computer Science 2013-01-07 Ben Taskar, Pieter Abbeel, Daphne Koller

On Learning Prediction-Focused Mixtures

Probabilistic models help us encode latent structures that both model the data and are ideally also useful for specific downstream tasks. Among these, mixture models and their time-series counterparts, hidden Markov models, identify…

Machine Learning · Computer Science 2021-10-29 Abhishek Sharma, Catherine Zeng, Sanjana Narayanan, Sonali Parbhoo, Finale Doshi-Velez

Handling Class Imbalance in Link Prediction using Learning to Rank Techniques

We consider the link prediction problem in a partially observed network, where the objective is to make predictions in the unobserved portion of the network. Many existing methods reduce link prediction to a binary classification problem.…

Machine Learning · Statistics 2016-02-23 Bopeng Li, Sougata Chaudhuri, Ambuj Tewari

Balanced Split: A new train-test data splitting strategy for imbalanced datasets

Classification data sets with skewed class proportions are called imbalanced. Class imbalance is a problem since most machine learning classification algorithms are built with an assumption of equal representation of all classes in the…

Machine Learning · Computer Science 2022-12-22 Azal Ahmad Khan

Risk-averse Fair Multi-class Classification

We develop a new classification framework based on the theory of coherent risk measures and systemic risk. The proposed approach is suitable for multi-class problems when the data is noisy, scarce (relative to the dimension of the problem),…

Machine Learning · Statistics 2026-05-29 Darinka Dentcheva, Xiangyu Tian

Markov Decision Processes of the Third Kind: Learning Distributions by Policy Gradient Descent

The goal of this paper is to analyze distributional Markov Decision Processes as a class of control problems in which the objective is to learn policies that steer the distribution of a cumulative reward toward a prescribed target law,…

Optimization and Control · Mathematics 2026-02-09 Nicole Bäuerle, Athanasios Vasileiadis