Related papers: Classifying extremely imbalanced data sets

Multi Instance Learning For Unbalanced Data

In the context of Multi Instance Learning, we analyze the Single Instance (SI) learning objective. We show that when the data is unbalanced and the family of classifiers is sufficiently rich, the SI method is a useful learning algorithm. In…

Machine Learning · Computer Science 2018-12-19 Mark Kozdoba, Edward Moroshko, Lior Shani, Takuya Takagi, Takashi Katoh, Shie Mannor, Koby Crammer

Box Drawings for Learning with Imbalanced Data

The vast majority of real world classification problems are imbalanced, meaning there are far fewer data from the class of interest (the positive class) than from other classes. We propose two machine learning algorithms to handle highly…

Machine Learning · Statistics 2014-06-10 Siong Thye Goh, Cynthia Rudin

Adversarial Classifier for Imbalanced Problems

Adversarial approach has been widely used for data generation in the last few years. However, this approach has not been extensively utilized for classifier training. In this paper, we propose an adversarial framework for classifier…

Machine Learning · Computer Science 2018-11-22 Ehsan Montahaei, Mahsa Ghorbani, Mahdieh Soleymani Baghshah, Hamid R. Rabiee

Imbalanced Classification via Explicit Gradient Learning From Augmented Data

Learning from imbalanced data is one of the most significant challenges in real-world classification tasks. In such cases, neural network performance is substantially impaired due to preference towards the majority class. Existing…

Machine Learning · Computer Science 2022-11-13 Bronislav Yasinnik, Moshe Salhov, Ofir Lindenbaum, Amir Averbuch

Application of the rule-growing algorithm RIPPER to particle physics analysis

A large hadron machine like the LHC with its high track multiplicities always asks for powerful tools that drastically reduce the large background while selecting signal events efficiently. Actually such tools are widely needed and used in…

Data Analysis, Statistics and Probability · Physics 2014-11-20 Markward Britsch, Nikolai Gagunashvili, Michael Schmelling

An Empirical Analysis of the Efficacy of Different Sampling Techniques for Imbalanced Classification

Learning from imbalanced data is a challenging task. Standard classification algorithms tend to perform poorly when trained on imbalanced data. Some special strategies need to be adopted, either by modifying the data distribution or by…

Machine Learning · Computer Science 2022-08-26 Asif Newaz, Shahriar Hassan, Farhan Shahriyar Haq

Deep Reinforcement Learning for Multi-class Imbalanced Training

With the rapid growth of memory and computing power, datasets are becoming increasingly complex and imbalanced. This is especially severe in the context of clinical data, where there may be one rare event for many cases in the majority…

Machine Learning · Computer Science 2022-05-25 Jenny Yang, Rasheed El-Bouri, Odhran O'Donoghue, Alexander S. Lachapelle, Andrew A. S. Soltan, David A. Clifton

Review of Methods for Handling Class-Imbalanced in Classification Problems

Learning classifiers using skewed or imbalanced datasets can occasionally lead to classification issues; this is a serious issue. In some cases, one class contains the majority of examples while the other, which is frequently the more…

Machine Learning · Computer Science 2022-11-11 Satyendra Singh Rawat, Amit Kumar Mishra

Optimizing for ROC Curves on Class-Imbalanced Data by Training over a Family of Loss Functions

Although binary classification is a well-studied problem in computer vision, training reliable classifiers under severe class imbalance remains a challenging problem. Recent work has proposed techniques that mitigate the effects of training…

Machine Learning · Computer Science 2024-06-06 Kelsey Lieberman, Shuai Yuan, Swarna Kamlam Ravindran, Carlo Tomasi

A comparison of Deep Learning performances with other machine learning algorithms on credit scoring unbalanced data

Training models on highly unbalanced data is admitted to be a challenging task for machine learning algorithms. Current studies on deep learning mainly focus on data sets with balanced class labels or unbalanced data, but with massive…

Machine Learning · Computer Science 2020-02-27 Louis Marceau, Lingling Qiu, Nick Vandewiele, Eric Charton

Learning Classifiers for Imbalanced and Overlapping Data

This study is about inducing classifiers using data that is imbalanced, with a minority class being under-represented in relation to the majority classes. The first section of this research focuses on the main characteristics of data that…

Machine Learning · Computer Science 2022-10-25 Shivaditya Shivganesh, Nitin Narayanan N, Pranav Murali, Ajaykumar M

A Survey of Methods for Managing the Classification and Solution of Data Imbalance Problem

The problem of class imbalance is extensive for focusing on numerous applications in the real world. In such a situation, nearly all of the examples are labeled as one class called majority class, while far fewer examples are labeled as the…

Machine Learning · Computer Science 2020-12-23 Khan Md. Hasib, Md. Sadiq Iqbal, Faisal Muhammad Shah, Jubayer Al Mahmud, Mahmudul Hasan Popel, Md. Imran Hossain Showrov, Shakil Ahmed, Obaidur Rahman

Tradeoffs in Resampling and Filtering for Imbalanced Classification

Imbalanced classification problems are extremely common in natural language processing and are solved using a variety of resampling and filtering techniques, which often involve making decisions on how to select training data or decide…

Computation and Language · Computer Science 2022-09-02 Ryan Muther, David Smith

Analyzing the Effects of Handling Data Imbalance on Learned Features from Medical Images by Looking Into the Models

One challenging property lurking in medical datasets is the imbalanced data distribution, where the frequency of the samples between the different classes is not balanced. Training a model on an imbalanced dataset can introduce unique…

Image and Video Processing · Electrical Eng. & Systems 2022-04-06 Ashkan Khakzar, Yawei Li, Yang Zhang, Mirac Sanisoglu, Seong Tae Kim, Mina Rezaei, Bernd Bischl, Nassir Navab

Bridging the Gap: Simultaneous Fine Tuning for Data Re-Balancing

There are many real-world classification problems wherein the issue of data imbalance (the case when a data set contains substantially more samples for one/many classes than the rest) is unavoidable. While under-sampling the problematic…

Computer Vision and Pattern Recognition · Computer Science 2018-01-09 John McKay, Isaac Gerg, Vishal Monga

A Review of Machine Learning Techniques in Imbalanced Data and Future Trends

For over two decades, detecting rare events has been a challenging task among researchers in the data mining and machine learning domain. Real-life problems inspire researchers to navigate and further improve data processing and algorithmic…

Machine Learning · Computer Science 2025-09-09 Elaheh Jafarigol, Theodore Trafalis, Neshat Mohammadi

Class Imbalance Problem in Data Mining Review

In the last few years, there have been major changes and evolution in the classification of data. As the application area of technology increases, the size of data also increases. Classification of data becomes difficult because of…

Machine Learning · Computer Science 2013-05-09 Rushi Longadge, Snehalata Dongre

A Skew-Sensitive Evaluation Framework for Imbalanced Data Classification

Class distribution skews in imbalanced datasets may lead to models with prediction bias towards majority classes, making fair assessment of classifiers a challenging task. Metrics such as Balanced Accuracy are commonly used to evaluate a…

Machine Learning · Computer Science 2023-11-20 Min Du, Nesime Tatbul, Brian Rivers, Akhilesh Kumar Gupta, Lucas Hu, Wei Wang, Ryan Marcus, Shengtian Zhou, Insup Lee, Justin Gottschlich

A survey on learning from imbalanced data streams: taxonomy, challenges, empirical study, and reproducible experimental framework

Class imbalance poses new challenges when it comes to classifying data streams. Many algorithms recently proposed in the literature tackle this problem using a variety of data-level, algorithm-level, and ensemble approaches. However, there…

Machine Learning · Computer Science 2023-07-19 Gabriel Aguiar, Bartosz Krawczyk, Alberto Cano

Latent Vector Expansion using Autoencoder for Anomaly Detection

Deep learning methods can classify various unstructured data such as images, language, and voice as input data. As the task of classifying anomalies becomes more important in the real world, various methods exist for classifying using deep…

Computer Vision and Pattern Recognition · Computer Science 2022-01-06 UJu Gim, YeongHyeon Park