Related papers: A Bayesian Approach for Accurate Classification-Ba…

Overcoming Selection Bias in Statistical Studies With Amortized Bayesian Inference

Selection bias arises when the probability that an observation enters a dataset depends on variables related to the quantities of interest, leading to systematic distortions in estimation and uncertainty quantification. For example, in…

Machine Learning · Statistics 2026-04-21 Jonas Arruda, Sophie Chervet, Paula Staudt, Andreas Wieser, Michael Hoelscher, Isabelle Sermet-Gaudelus, Nadine Binder, Lulla Opatowski, Jan Hasenauer

Optimal Binary Classification Beyond Accuracy

The vast majority of statistical theory on binary classification characterizes performance in terms of accuracy. However, accuracy is known in many cases to poorly reflect the practical consequences of classification error, most famously in…

Statistics Theory · Mathematics 2022-09-27 Shashank Singh, Justin Khim

Toward Optimal Probabilistic Active Learning Using a Bayesian Approach

Gathering labeled data to train well-performing machine learning models is one of the critical challenges in many applications. Active learning aims at reducing the labeling costs by an efficient and effective allocation of costly labeling…

Machine Learning · Computer Science 2020-06-03 Daniel Kottke, Marek Herde, Christoph Sandrock, Denis Huseljic, Georg Krempl, Bernhard Sick

Binary Classifier Calibration: Bayesian Non-Parametric Approach

A set of probabilistic predictions is well calibrated if the events that are predicted to occur with probability p do in fact occur about p fraction of the time. Well calibrated predictions are particularly important when machine learning…

Machine Learning · Statistics 2014-01-14 Mahdi Pakdaman Naeini, Gregory F. Cooper, Milos Hauskrecht

Sparse Functional Data Classification via Bayesian Aggregation

Sparse functional data frequently arise in real-world applications, posing significant challenges for accurate classification. To address this, we propose a novel classification method that integrates functional principal component analysis…

Computation · Statistics 2025-03-17 Ahmad Talafha

Sampling Bias Correction for Supervised Machine Learning: A Bayesian Inference Approach with Practical Applications

Given a supervised machine learning problem where the training set has been subject to a known sampling bias, how can a model be trained to fit the original dataset? We achieve this through the Bayesian inference framework by altering the…

Machine Learning · Statistics 2022-03-16 Max Sklar

A Fully Bayesian, Logistic Regression Tracking Algorithm for Mitigating Disparate Misclassification

We develop a fully Bayesian, logistic tracking algorithm with the purpose of providing classification results that are unbiased when applied uniformly to individuals with differing sensitive variable values. Here, we consider bias in the…

Applications · Statistics 2020-12-02 Martin B. Short, George O. Mohler

Bayesian Aggregation

A general challenge in statistics is prediction in the presence of multiple candidate models or learning algorithms. Model aggregation tries to combine all predictive distributions from individual models, which is more stable and flexible…

Methodology · Statistics 2021-09-28 Yuling Yao

Efficient estimation and correction of selection-induced bias with order statistics

Model selection aims to identify a sufficiently well performing model that is possibly simpler than the most complex model among a pool of candidates. However, the decision-making process itself can inadvertently introduce non-negligible…

Methodology · Statistics 2024-08-08 Yann McLatchie, Aki Vehtari

Practical estimation of the optimal classification error with soft labels and calibration

While the performance of machine learning systems has experienced significant improvement in recent years, relatively little attention has been paid to the fundamental question: to what extent can we improve our models? This paper provides…

Machine Learning · Computer Science 2026-05-13 Ryota Ushio, Takashi Ishida, Masashi Sugiyama

Bayesian Model Averaging with Exponentiated Least Square Loss

The model averaging problem is to average multiple models to achieve a prediction accuracy not much worse than that of the best single model in terms of mean squared error. It is known that if the models are misspecified, model averaging is…

Statistics Theory · Mathematics 2018-02-28 Dong Dai, Lei Han, Ting Yang, Tong Zhang

Is the Performance of My Deep Network Too Good to Be True? A Direct Approach to Estimating the Bayes Error in Binary Classification

There is a fundamental limitation in the prediction performance that a machine learning model can achieve due to the inevitable uncertainty of the prediction target. In classification problems, this can be characterized by the Bayes error,…

Machine Learning · Computer Science 2023-03-14 Takashi Ishida, Ikko Yamane, Nontawat Charoenphakdee, Gang Niu, Masashi Sugiyama

Bayesian Batch Active Learning as Sparse Subset Approximation

Leveraging the wealth of unlabeled data produced in recent years provides great potential for improving supervised models. When the cost of acquiring labels is high, probabilistic active learning methods can be used to greedily select the…

Machine Learning · Statistics 2021-02-09 Robert Pinsler, Jonathan Gordon, Eric Nalisnick, José Miguel Hernández-Lobato

Data aggregation can lead to biased inferences in Bayesian linear mixed models and Bayesian ANOVA: A simulation study

Bayesian linear mixed-effects models and Bayesian ANOVA are increasingly being used in the cognitive sciences to perform null hypothesis tests, where a null hypothesis that an effect is zero is compared with an alternative hypothesis that…

Methodology · Statistics 2023-08-15 Daniel J. Schad, Bruno Nicenboim, Shravan Vasishth

Robust Approximate Bayesian Computation: An Adjustment Approach

We propose a novel approach to approximate Bayesian computation (ABC) that seeks to cater for possible misspecification of the assumed model. This new approach can be equally applied to rejection-based ABC and to popular regression…

Methodology · Statistics 2020-08-11 David T. Frazier, Christopher Drovandi, Ruben Loaiza-Maya

Error-Bounded Correction of Noisy Labels

To collect large scale annotated data, it is inevitable to introduce label noise, i.e., incorrect class labels. To be robust against label noise, many successful methods rely on the noisy classifiers (i.e., models trained on the noisy…

Computer Vision and Pattern Recognition · Computer Science 2020-11-23 Songzhu Zheng, Pengxiang Wu, Aman Goswami, Mayank Goswami, Dimitris Metaxas, Chao Chen

Iterative Updating of Model Error for Bayesian Inversion

In computational inverse problems, it is common that a detailed and accurate forward model is approximated by a computationally less challenging substitute. The model reduction may be necessary to meet constraints in computing time when…

Methodology · Statistics 2018-02-14 Daniela Calvetti, Matthew M. Dunlop, Erkki Somersalo, Andrew M. Stuart

Bayesian approach to clustering real value, categorical and network data: solution via variational methods

Data clustering, including problems such as finding network communities, can be put into a systematic framework by means of a Bayesian approach. The application of Bayesian approaches to real problems can be, however, quite challenging. In…

Data Analysis, Statistics and Probability · Physics 2008-09-28 Alexei Vazquez

Progressive Sampling-Based Bayesian Optimization for Efficient and Automatic Machine Learning Model Selection

Purpose: Machine learning is broadly used for clinical data analysis. Before training a model, a machine learning algorithm must be selected. Also, the values of one or more model parameters termed hyper-parameters must be set. Selecting…

Machine Learning · Computer Science 2018-12-10 Xueqiang Zeng, Gang Luo

A Generative Bayesian Model for Aggregating Experts' Probabilities

In order to improve forecasts, a decisionmaker often combines probabilities given by various sources, such as human experts and machine learning classifiers. When few training data are available, aggregation can be improved by incorporating…

Machine Learning · Computer Science 2012-07-19 Joseph Kahn