Related papers: Binary Classification for High Dimensional Data us…

Combining Varied Learners for Binary Classification using Stacked Generalization

The Machine Learning has various learning algorithms that are better in some or the other aspect when compared with each other but a common error that all algorithms will suffer from is training data with very high dimensional feature set.…

Machine Learning · Computer Science 2022-02-21 Sruthi Nair , Abhishek Gupta , Raunak Joshi , Vidya Chitre

Residual-Concatenate Neural Network with Deep Regularization Layers for Binary Classification

Many complex Deep Learning models are used with different variations for various prognostication tasks. The higher learning parameters not necessarily ensure great accuracy. This can be solved by considering changes in very deep models with…

Machine Learning · Computer Science 2022-06-15 Abhishek Gupta , Sruthi Nair , Raunak Joshi , Vidya Chitre

Data-Driven Logistic Regression Ensembles With Applications in Genomics

Advances in data collecting technologies in genomics have significantly increased the need for tools designed to study the genetic basis of many diseases. Effective statistical methods should excel in both prediction accuracy and biomarker…

Methodology · Statistics 2025-11-13 Anthony-Alexander Christidis , Stefan Van Aelst , Ruben Zamar

Binary Classification in Unstructured Space With Hypergraph Case-Based Reasoning

Binary classification is one of the most common problem in machine learning. It consists in predicting whether a given element belongs to a particular class. In this paper, a new algorithm for binary classification is proposed using a…

Machine Learning · Computer Science 2019-03-12 Alexandre Quemy

Binary Stochastic Representations for Large Multi-class Classification

Classification with a large number of classes is a key problem in machine learning and corresponds to many real-world applications like tagging of images or textual documents in social networks. If one-vs-all methods usually reach top…

Machine Learning · Computer Science 2019-06-25 Thomas Gerald , Aurélia Léon , Nicolas Baskiotis , Ludovic Denoyer

Evaluating Nonlinear Decision Trees for Binary Classification Tasks with Other Existing Methods

Classification of datasets into two or more distinct classes is an important machine learning task. Many methods are able to classify binary classification tasks with a very high accuracy on test data, but cannot provide any easily…

Machine Learning · Computer Science 2020-08-26 Yashesh Dhebar , Sparsh Gupta , Kalyanmoy Deb

Fast Supervised Hashing with Decision Trees for High-Dimensional Data

Supervised hashing aims to map the original features to compact binary codes that are able to preserve label based similarity in the Hamming space. Non-linear hash functions have demonstrated the advantage over linear ones due to their…

Computer Vision and Pattern Recognition · Computer Science 2016-11-17 Guosheng Lin , Chunhua Shen , Qinfeng Shi , Anton van den Hengel , David Suter

Big Data Classification Using Augmented Decision Trees

We present an algorithm for classification tasks on big data. Experiments conducted as part of this study indicate that the algorithm can be as accurate as ensemble methods such as random forests or gradient boosted trees. Unlike ensemble…

Machine Learning · Statistics 2017-10-27 Rajiv Sambasivan , Sourish Das

Estimating a sharp convergence bound for randomized ensembles

When randomized ensembles such as bagging or random forests are used for binary classification, the prediction error of the ensemble tends to decrease and stabilize as the number of classifiers increases. However, the precise relationship…

Probability · Mathematics 2019-05-01 Miles E. Lopes

Discriminant Analysis in Contrasting Dimensions for Polycystic Ovary Syndrome Prognostication

A lot of prognostication methodologies have been formulated for early detection of Polycystic Ovary Syndrome also known as PCOS using Machine Learning. PCOS is a binary classification problem. Dimensionality Reduction methods impact the…

Machine Learning · Computer Science 2022-01-11 Abhishek Gupta , Himanshu Soni , Raunak Joshi , Ronald Melwin Laban

Feature uncertainty bounding schemes for large robust nonlinear SVM classifiers

We consider the binary classification problem when data are large and subject to unknown but bounded uncertainties. We address the problem by formulating the nonlinear support vector machine training problem with robust optimization. To do…

Machine Learning · Statistics 2017-06-30 Nicolas Couellan , Sophie Jan

Unsupervised Decision Forest for Data Clustering and Density Estimation

An algorithm to improve performance parameter for unsupervised decision forest clustering and density estimation is presented. Specifically, a dual assignment parameter is introduced as a density estimator by combining Random Forest and…

Computer Vision and Pattern Recognition · Computer Science 2015-07-19 Hayder Albehadili , Naz Islam

A Probabilistic Optimum-Path Forest Classifier for Binary Classification Problems

Probabilistic-driven classification techniques extend the role of traditional approaches that output labels (usually integer numbers) only. Such techniques are more fruitful when dealing with problems where one is not interested in…

Computer Vision and Pattern Recognition · Computer Science 2016-09-06 Silas E. N. Fernandes , Danillo R. Pereira , Caio C. O. Ramos , Andre N. Souza , Joao P. Papa

Binary Classification: Is Boosting stronger than Bagging?

Random Forests have been one of the most popular bagging methods in the past few decades, especially due to their success at handling tabular datasets. They have been extensively studied and compared to boosting models, like XGBoost, which…

Machine Learning · Computer Science 2024-10-28 Dimitris Bertsimas , Vasiliki Stoumpou

High Dimensional Classification through $\ell_0$-Penalized Empirical Risk Minimization

We consider a high dimensional binary classification problem and construct a classification procedure by minimizing the empirical misclassification risk with a penalty on the number of selected features. We derive non-asymptotic probability…

Methodology · Statistics 2018-11-26 Le-Yu Chen , Sokbae Lee

Heterogeneous Oblique Double Random Forest

The decision tree ensembles use a single data feature at each node for splitting the data. However, splitting in this manner may fail to capture the geometric properties of the data. Thus, oblique decision trees generate the oblique…

Machine Learning · Computer Science 2023-04-17 M. A. Ganaie , M. Tanveer , I. Beheshti , N. Ahmad , P. N. Suganthan

High-dimensional classification by sparse logistic regression

We consider high-dimensional binary classification by sparse logistic regression. We propose a model/feature selection procedure based on penalized maximum likelihood with a complexity penalty on the model size and derive the non-asymptotic…

Statistics Theory · Mathematics 2018-11-20 Felix Abramovich , Vadim Grinshtein

Dimensionality Reduction Ensembles

Ensemble learning has had many successes in supervised learning, but it has been rare in unsupervised learning and dimensionality reduction. This study explores dimensionality reduction ensembles, using principal component analysis and…

Machine Learning · Statistics 2017-10-13 Colleen M. Farrelly

Random Forests for Big Data

Big Data is one of the major challenges of statistical science and has numerous consequences from algorithmic and theoretical viewpoints. Big Data always involve massive data but they also often include online data and data heterogeneity.…

Machine Learning · Statistics 2017-03-23 Robin Genuer , Jean-Michel Poggi , Christine Tuleau-Malot , Nathalie Villa-Vialaneix

High dimensional gaussian classification

High dimensional data analysis is known to be as a challenging problem. In this article, we give a theoretical analysis of high dimensional classification of Gaussian data which relies on a geometrical analysis of the error measure. It…

Statistics Theory · Mathematics 2008-07-10 Robin Girard