Learning Classifiers for Imbalanced and Overlapping Data

Shivaditya Shivganesh; Nitin Narayanan N; Pranav Murali; Ajaykumar M

Machine Learning 2022-10-25 v1 Information Theory math.IT

Abstract

This study concerns inducing classifiers from imbalanced data, in which a minority class is under-represented relative to the majority classes. The first part of the research focuses on the main characteristics of data that give rise to this problem. Following a review of previous, relevant research, a variety of artificial imbalanced data sets shaped by these characteristics were created and used to build decision trees and rule-based classifiers. The second part examines how to improve classifiers by pre-processing data with resampling approaches. The experiments compare the performance of several pre-processing resampling methods: two variants of random over-sampling and focused under-sampling with the Neighbourhood Cleaning Rule (NCR). The paper further addresses class imbalance with a new method called Sparsity, in which the data is made more sparse from its class centers and hence more homogeneous.
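
As an illustration of the simplest resampling baseline named in the abstract, the following is a minimal sketch of random over-sampling. It is not the authors' implementation: the function name `random_oversample`, the `seed` parameter, and the toy data are assumptions made for this example, and the abstract does not say which two variants of random over-sampling were evaluated.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen examples of each smaller class
    until every class has as many examples as the largest class.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())  # size of the largest class

    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        # Original examples of this class, used as the sampling pool.
        pool = [x for x, lab in zip(X, y) if lab == label]
        # Draw with replacement until the class reaches the target size.
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


# Toy data (assumed): four majority examples, one minority example.
X = [[0.1], [0.2], [0.3], [0.4], [0.9]]
y = [0, 0, 0, 0, 1]

X_res, y_res = random_oversample(X, y)
print(Counter(y_res))
```

On this toy input both classes end up with four examples each. Because the added points are exact copies of existing minority examples, the class distribution is balanced without introducing any new information, which is one reason such baselines are usually compared against focused methods such as NCR.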

Keywords

class imbalance learning; classification; machine learning theory

Cite

@article{arxiv.2210.12446,
  title  = {Learning Classifiers for Imbalanced and Overlapping Data},
  author = {Shivaditya Shivganesh and Nitin Narayanan N and Pranav Murali and Ajaykumar M},
  journal= {arXiv preprint arXiv:2210.12446},
  year   = {2022}
}
