Related papers: Statistical hypothesis testing versus machine-lear…

A method for classification of data with uncertainty using hypothesis testing

Binary classification is a task that involves the classification of data into one of two distinct classes. It is widely utilized in various fields. However, conventional classifiers tend to make overconfident predictions for data that…

Machine Learning · Computer Science 2025-03-13 Shoma Yokura , Akihisa Ichiki

Data scientists and statisticians are often at odds when determining the best approach, machine learning or statistical modeling, to solve an analytics challenge. However, machine learning and statistical modeling are more cousins than…

Machine Learning · Computer Science 2022-01-10 Michele Bennett , Karin Hayes , Ewa J. Kleczyk , Rajesh Mehta

Evaluating Nonlinear Decision Trees for Binary Classification Tasks with Other Existing Methods

Classification of datasets into two or more distinct classes is an important machine learning task. Many methods are able to classify binary classification tasks with a very high accuracy on test data, but cannot provide any easily…

Machine Learning · Computer Science 2020-08-26 Yashesh Dhebar , Sparsh Gupta , Kalyanmoy Deb

Instance-Based Classification through Hypothesis Testing

Classification is a fundamental problem in machine learning and data mining. During the past decades, numerous classification methods have been presented based on different principles. However, most existing classifiers cast the…

Machine Learning · Computer Science 2019-04-23 Zengyou He , Chaohua Sheng , Yan Liu , Quan Zou

How to Control the Error Rates of Binary Classifiers

The traditional binary classification framework constructs classifiers which may have good accuracy, but whose false positive and false negative error rates are not under users' control. In many cases, one of the errors is more severe and…

Machine Learning · Statistics 2020-10-22 Miloš Simić

Rank discriminants for predicting phenotypes from RNA expression

Statistical methods for analyzing large-scale biomolecular data are commonplace in computational biology. A notable example is phenotype prediction from gene expression data, for instance, detecting human cancers, differentiating subtypes…

Genomics · Quantitative Biology 2014-11-24 Bahman Afsari , Ulisses M. Braga-Neto , Donald Geman

Bias Mitigation Methods for Binary Classification Decision-Making Systems: Survey and Recommendations

Bias mitigation methods for binary classification decision-making systems have been widely researched due to the ever-growing importance of designing fair machine learning processes that are impartial and do not discriminate against…

Machine Learning · Computer Science 2023-06-01 Madeleine Waller , Odinaldo Rodrigues , Oana Cocarascu

Binary classification models with "Uncertain" predictions

Binary classification models which can assign probabilities to categories such as "the tissue is 75% likely to be tumorous" or "the chemical is 25% likely to be toxic" are well understood statistically, but their utility as an input to…

Applications · Statistics 2017-12-05 Damjan Krstajic , Ljubomir Buturovic , Simon Thomas , David E Leahy

Learning to Abstain from Binary Prediction

A binary classifier capable of abstaining from making a label prediction has two goals in tension: minimizing errors, and avoiding abstaining unnecessarily often. In this work, we exactly characterize the best achievable tradeoff between…

Machine Learning · Computer Science 2016-11-30 Akshay Balsubramani

Hierarchical Classification using Binary Data

In classification problems, especially those that categorize data into a large number of classes, the classes often naturally follow a hierarchical structure. That is, some classes are likely to share similar structures and features. Those…

Machine Learning · Computer Science 2018-07-25 Denali Molitor , Deanna Needell

Probabilistic Prediction for Binary Treatment Choice: with focus on personalized medicine

This paper extends my research applying statistical decision theory to treatment choice with sample data, using maximum regret to evaluate the performance of treatment rules. The specific new contribution is to study as-if optimization…

Econometrics · Economics 2021-10-05 Charles F. Manski

Large-Margin Classification with Multiple Decision Rules

Binary classification is a common statistical learning problem in which a model is estimated on a set of covariates for some outcome indicating the membership of one of two classes. In the literature, there exists a distinction between hard…

Machine Learning · Statistics 2014-11-20 Patrick K. Kimes , D. Neil Hayes , J. S. Marron , Yufeng Liu

Comparative Experiments on Disambiguating Word Senses: An Illustration of the Role of Bias in Machine Learning

This paper describes an experimental comparison of seven different learning algorithms on the problem of learning to disambiguate the meaning of a word from context. The algorithms tested include statistical, neural-network, decision-tree,…

cmp-lg · Computer Science 2008-02-03 Raymond J. Mooney

The Utility of Abstaining in Binary Classification

We explore the problem of binary classification in machine learning, with a twist - the classifier is allowed to abstain on any datum, professing ignorance about the true class label without committing to any prediction. This is directly…

Machine Learning · Computer Science 2015-12-29 Akshay Balsubramani

General Framework for Binary Classification on Top Samples

Many binary classification problems minimize misclassification above (or below) a threshold. We show that instances of ranking problems, accuracy at the top or hypothesis testing may be written in this form. We propose a general framework…

Machine Learning · Computer Science 2020-02-26 Lukáš Adam , Václav Mácha , Václav Šmídl , Tomáš Pevný

Comparing decision mining approaches with regard to the meaningfulness of their results

Decisions and the underlying rules are indispensable for driving process execution during runtime, i.e., for routing process instances at alternative branches based on the values of process data. Decision rules can comprise unary data…

Machine Learning · Computer Science 2021-09-16 Beate Scheibel , Stefanie Rinderle-Ma

Advanced Tutorial: Label-Efficient Two-Sample Tests

Hypothesis testing is a statistical inference approach used to determine whether data supports a specific hypothesis. An important type is the two-sample test, which evaluates whether two sets of data points are from identical…

Machine Learning · Computer Science 2025-01-08 Weizhi Li , Visar Berisha , Gautam Dasarathy

Algorithmic statistics, prediction and machine learning

Algorithmic statistics considers the following problem: given a binary string $x$ (e.g., some experimental data), find a "good" explanation of this data. It uses algorithmic information theory to define formally what is a good explanation.…

Machine Learning · Computer Science 2015-09-21 Alexey Milovanov

Calibration of Machine Learning Classifiers for Probability of Default Modelling

Binary classification is highly used in credit scoring in the estimation of probability of default. The validation of such predictive models is based both on rank ability, and also on calibration (i.e. how accurately the probabilities…

Econometrics · Economics 2017-10-25 Pedro G. Fonseca , Hugo D. Lopes

Machine Learning Approaches for Binary Classification to Discover Liver Diseases using Clinical Data

For a medical diagnosis, health professionals use different kinds of pathological ways to make a decision for medical reports in terms of patients medical condition. In the modern era, because of the advantage of computers and technologies,…

Machine Learning · Statistics 2021-06-08 Fahad B. Mostafa , Md Easin Hasan