Related papers: Smart Data driven Decision Trees Ensemble Methodol…

200 papers

Decision trees and random forests remain highly competitive for classification on medium-sized, standard datasets due to their robustness, minimal preprocessing requirements, and interpretability. However, a single tree suffers from high…

Machine Learning · Statistics 2025-12-02 Cencheng Shen , Yuexiao Dong , Carey E. Priebe

Many real-world applications reveal difficulties in learning classifiers from imbalanced data. The rising big data era has been witnessing more classification tasks with large-scale but extremely imbalanced and low-quality datasets. Most of…

Machine Learning · Computer Science 2020-10-20 Zhining Liu , Wei Cao , Zhifeng Gao , Jiang Bian , Hechang Chen , Yi Chang , Tie-Yan Liu

Class-imbalance refers to classification problems in which many more instances are available for certain classes than for others. Such imbalanced datasets require special attention because traditional classifiers generally favor the…

Machine Learning · Statistics 2018-11-30 Rafael M. O. Cruz , Robert Sabourin , George D. C. Cavalcanti

We present an algorithm for classification tasks on big data. Experiments conducted as part of this study indicate that the algorithm can be as accurate as ensemble methods such as random forests or gradient boosted trees. Unlike ensemble…

Machine Learning · Statistics 2017-10-27 Rajiv Sambasivan , Sourish Das

Data imbalance is ubiquitous when applying machine learning to real-world problems, particularly regression problems. If training data are imbalanced, the learning is dominated by the densely covered regions of the target distribution and…

Machine Learning · Computer Science 2024-10-29 Yuchang Jiang , Vivien Sainte Fare Garnot , Konrad Schindler , Jan Dirk Wegner

In any knowledge discovery process the value of extracted knowledge is directly related to the quality of the data used. Big Data problems, generated by massive growth in the scale of data observed in recent years, also follow the same…

Databases · Computer Science 2017-07-31 Diego García-Gil , Julián Luengo , Salvador García , Francisco Herrera

Data analysis and machine learning have become an integrative part of the modern scientific methodology, providing automated techniques to predict further information based on observations. One of these classification and regression…

Computer Vision and Pattern Recognition · Computer Science 2019-01-07 Mario Amrehn , Firas Mualla , Elli Angelopoulou , Stefan Steidl , Andreas Maier

Imbalanced Data (ID) is a problem that deters Machine Learning (ML) models from achieving satisfactory results. ID is the occurrence of a situation where the quantity of the samples belonging to one class outnumbers that of the other by a…

Class imbalance is a frequently occurring scenario in classification tasks. Learning from imbalanced data poses a major challenge, which has instigated a lot of research in this area. Data preprocessing using sampling techniques is a…

Machine Learning · Computer Science 2022-08-23 Asif Newaz , Farhan Shahriyar Haq

Modern streaming data categorization faces significant challenges from concept drift and class imbalanced data. This negatively impacts the output of the classifier, leading to improper classification. Furthermore, other factors such as the…

Machine Learning · Computer Science 2023-09-29 Priya. S , Haribharathi Sivakumar , Vijay Arvind. R

The performance of classification algorithms on a massive and highly imbalanced data stream depends upon an efficient balancing strategy. Some balancing techniques have been applied in the past with batch data to resolve the…

Machine Learning · Computer Science 2019-10-22 Rafiq Ahmed Mohammed , Kok-Wai Wong , Mohd Fairuz Shiratuddin , Xuequn Wang

Data imbalance, that is the disproportion between the number of training observations coming from different classes, remains one of the most significant challenges affecting contemporary machine learning. The negative impact of data…

Machine Learning · Computer Science 2021-11-30 Michał Koziarski

Class-imbalance refers to classification problems in which many more instances are available for certain classes than for others. Such imbalanced datasets require special attention because traditional classifiers generally favor the…

Machine Learning · Computer Science 2018-11-30 Rafael M. O. Cruz , Mariana A. Souza , Robert Sabourin , George D. C. Cavalcanti

Despite extensive research spanning several decades, class imbalance is still considered a profound difficulty for both machine learning and deep learning models. While data oversampling is the foremost technique to address this issue,…

Machine Learning · Computer Science 2025-02-12 Sukumar Kishanthan , Asela Hevapathige

Despite the success of deep learning for text and image data, tree-based ensemble models are still state-of-the-art for machine learning with heterogeneous tabular data. However, there is a significant need for tabular-specific…

Machine Learning · Computer Science 2024-03-13 Sascha Marton , Stefan Lüdtke , Christian Bartelt , Heiner Stuckenschmidt

Imbalanced datasets, where one class significantly outnumbers others, remain a persistent challenge in machine learning, often biasing predictions toward the majority class and degrading classifier performance. This paper provides a…

We propose ODTE, a new ensemble that uses oblique decision trees as base classifiers. Additionally, we introduce STree, the base algorithm for growing oblique decision trees, which leverages support vector machines to define hyperplanes…

Machine Learning · Computer Science 2025-03-18 Ricardo Montañana , José A. Gámez , José M. Puerta

Class imbalance (CI) in classification problems arises when the number of observations belonging to one class is lower than the other. Ensemble learning combines multiple models to obtain a robust model and has been prominently used with…

Machine Learning · Computer Science 2023-11-28 Azal Ahmad Khan , Omkar Chaudhari , Rohitash Chandra

Classification data sets with skewed class proportions are called imbalanced. Class imbalance is a problem since most machine learning classification algorithms are built with an assumption of equal representation of all classes in the…

Machine Learning · Computer Science 2022-12-22 Azal Ahmad Khan

In the era of big data, the use of credit-scoring models to accurately determine the credit risk of applicants is becoming a trend. Conventional machine learning on credit scoring data sets tends to have poor…

Machine Learning · Statistics 2021-02-10 Xiaofan Liu , Zuoquan Zhang , Di Wang