Related papers: IMBENS: Ensemble Class-imbalanced Learning in Pyth…
Imbalanced-learn is an open-source python toolbox aiming at providing a wide range of methods to cope with the problem of imbalanced dataset frequently encountered in machine learning and pattern recognition. The implemented…
Imbalanced learning (IL), i.e., learning unbiased models from class-imbalanced data, is a challenging problem. Typical IL methods including resampling and reweighting were designed based on some heuristic assumptions. They often suffer from…
Class-imbalance is a common problem in machine learning practice. Typical Imbalanced Learning (IL) methods balance the data via intuitive class-wise resampling or reweighting. However, previous studies suggest that beyond class-imbalance,…
Class-imbalanced learning (CIL) on tabular data is important in many real-world applications where the minority class holds the critical but rare outcomes. In this paper, we present CLIMB, a comprehensive benchmark for class-imbalanced…
This research addresses the challenges of handling unbalanced datasets for binary classification tasks. In such scenarios, standard evaluation metrics are often biased by the disproportionate representation of the minority class. Conducting…
Many real-world applications reveal difficulties in learning classifiers from imbalanced data. The rising big data era has been witnessing more classification tasks with large-scale but extremely imbalance and low-quality datasets. Most of…
Class imbalance (CI) in classification problems arises when the number of observations belonging to one class is lower than the other. Ensemble learning combines multiple models to obtain a robust model and has been prominently used with…
Ensemble technique and under-sampling technique are both effective tools used for imbalanced dataset classification problems. In this paper, a novel ensemble method combining the advantages of both ensemble learning for biasing classifiers…
Using in-context learning (ICL) for data generation, techniques such as Self-Instruct (Wang et al., 2023) or the follow-up Alpaca (Taori et al., 2023) can train strong conversational agents with only a small amount of human supervision. One…
To accelerate learning process with few samples, meta-learning resorts to prior knowledge from previous tasks. However, the inconsistent task distribution and heterogeneity is hard to be handled through a global sharing model…
lcensemble is a high-performing, scalable and user-friendly Python package for the general tasks of classification and regression. The package implements Local Cascade Ensemble (LCE), a machine learning method that further enhances the…
The class imbalance problem is important and challenging. Ensemble approaches are widely used to tackle this problem because of their effectiveness. However, existing ensemble methods are always applied into original samples, while not…
Class-imbalance refers to classification problems in which many more instances are available for certain classes than for others. Such imbalanced datasets require special attention because traditional classifiers generally favor the…
In this paper, we introduce eipy--an open-source Python package for developing effective, multi-modal heterogeneous ensembles for classification. eipy simultaneously provides both a rigorous, and user-friendly framework for comparing and…
Private business schools in India face a common problem of selecting quality students for their MBA programs to achieve the desired placement percentage. Generally, such data sets are biased towards one class, i.e., imbalanced in nature.…
Class-imbalance refers to classification problems in which many more instances are available for certain classes than for others. Such imbalanced datasets require special attention because traditional classifiers generally favor the…
The aim of Active Learning is to select the most informative samples from an unlabelled set of data. This is useful in cases where the amount of data is large and labelling is expensive, such as in machine vision or medical imaging. Two…
Despite increasing popularity in empirical studies, the integration of machine learning generated variables into regression models for statistical inference suffers from the measurement error problem, which can bias estimation and threaten…
We introduce a new library named abess that implements a unified framework of best-subset selection for solving diverse machine learning problems, e.g., linear regression, classification, and principal component analysis. Particularly, the…
Class imbalance remains a major challenge in machine learning, especially in multi-class problems with long-tailed distributions. Existing methods, such as data resampling, cost-sensitive techniques, and logistic loss modifications, though…