Related papers: A Model-Agnostic Algorithm for Bayes Error Determi…

Is the Performance of My Deep Network Too Good to Be True? A Direct Approach to Estimating the Bayes Error in Binary Classification

There is a fundamental limitation in the prediction performance that a machine learning model can achieve due to the inevitable uncertainty of the prediction target. In classification problems, this can be characterized by the Bayes error,…

Machine Learning · Computer Science 2023-03-14 Takashi Ishida, Ikko Yamane, Nontawat Charoenphakdee, Gang Niu, Masashi Sugiyama

Intrinsic Dimensionality Estimation within Tight Localities: A Theoretical and Experimental Analysis

Accurate estimation of Intrinsic Dimensionality (ID) is of crucial importance in many data mining and machine learning tasks, including dimensionality reduction, outlier detection, similarity search and subspace clustering. However, since…

Machine Learning · Computer Science 2022-09-30 Laurent Amsaleg, Oussama Chelly, Michael E. Houle, Ken-ichi Kawarabayashi, Miloš Radovanović, Weeris Treeratanajaru

Practical estimation of the optimal classification error with soft labels and calibration

While the performance of machine learning systems has experienced significant improvement in recent years, relatively little attention has been paid to the fundamental question: to what extent can we improve our models? This paper provides…

Machine Learning · Computer Science 2026-05-13 Ryota Ushio, Takashi Ishida, Masashi Sugiyama

Linear classifier design under heteroscedasticity in Linear Discriminant Analysis

Under normality and homoscedasticity assumptions, Linear Discriminant Analysis (LDA) is known to be optimal in terms of minimising the Bayes error for binary classification. In the heteroscedastic case, LDA is not guaranteed to minimise…

Machine Learning · Computer Science 2017-03-27 Kojo Sarfo Gyamfi, James Brusey, Andrew Hunt, Elena Gaura

AIDE: An algorithm for measuring the accuracy of probabilistic inference algorithms

Approximate probabilistic inference algorithms are central to many fields. Examples include sequential Monte Carlo inference in robotics, variational inference in machine learning, and Markov chain Monte Carlo inference in statistics. A key…

Machine Learning · Statistics 2017-11-07 Marco F. Cusumano-Towner, Vikash K. Mansinghka

Universal Training of Neural Networks to Achieve Bayes Optimal Classification Accuracy

This work invokes the notion of $f$-divergence to introduce a novel upper bound on the Bayes error rate of a general classification task. We show that the proposed bound can be computed by sampling from the output of a parameterized model.…

Machine Learning · Computer Science 2025-01-15 Mohammadreza Tavasoli Naeini, Ali Bereyhi, Morteza Noshad, Ben Liang, Alfred O. Hero

Calibrated Bayesian Deep Learning for Explainable Decision Support Systems Based on Medical Imaging

In critical decision support systems based on medical imaging, the reliability of AI-assisted decision-making is as relevant as predictive accuracy. Although deep learning models have demonstrated significant accuracy, they frequently…

Computer Vision and Pattern Recognition · Computer Science 2026-02-13 Hua Xu, Julián D. Arias-Londoño, Juan I. Godino-Llorente

lil' UCB : An Optimal Exploration Algorithm for Multi-Armed Bandits

The paper proposes a novel upper confidence bound (UCB) procedure for identifying the arm with the largest mean in a multi-armed bandit game in the fixed confidence setting using a small number of total samples. The procedure cannot be…

Machine Learning · Statistics 2013-12-30 Kevin Jamieson, Matthew Malloy, Robert Nowak, Sébastien Bubeck

Tuning parameter selection for penalized likelihood estimation of inverse covariance matrix

In a Gaussian graphical model, the conditional independence between two variables are characterized by the corresponding zero entries in the inverse covariance matrix. Maximum likelihood method using the smoothly clipped absolute deviation…

Methodology · Statistics 2009-09-07 Xin Gao, Daniel Q. Pu, Yuehua Wu, Hong Xu

Subspace Determination through Local Intrinsic Dimensional Decomposition: Theory and Experimentation

Axis-aligned subspace clustering generally entails searching through enormous numbers of subspaces (feature combinations) and evaluation of cluster quality within each subspace. In this paper, we tackle the problem of identifying subsets of…

Machine Learning · Computer Science 2019-07-17 Ruben Becker, Imane Hafnaoui, Michael E. Houle, Pan Li, Arthur Zimek

Classification Error Bound for Low Bayes Error Conditions in Machine Learning

In statistical classification and machine learning, classification error is an important performance measure, which is minimized by the Bayes decision rule. In practice, the unknown true distribution is usually replaced with a model…

Machine Learning · Computer Science 2025-01-28 Zijian Yang, Vahe Eminyan, Ralf Schlüter, Hermann Ney

Bayesian Model Selection for Misspecified Models in Linear Regression

While the Bayesian Information Criterion (BIC) and Akaike Information Criterion (AIC) are powerful tools for model selection in linear regression, they are built on different prior assumptions and thereby apply to different data generation…

Methodology · Statistics 2017-12-15 MB de Kock, HC Eggers

Optimal Binary Classification Beyond Accuracy

The vast majority of statistical theory on binary classification characterizes performance in terms of accuracy. However, accuracy is known in many cases to poorly reflect the practical consequences of classification error, most famously in…

Statistics Theory · Mathematics 2022-09-27 Shashank Singh, Justin Khim

Adaptive Label Error Detection: A Bayesian Approach to Mislabeled Data Detection

Machine learning classification systems are susceptible to poor performance when trained with incorrect ground truth labels, even when data is well-curated by expert annotators. As machine learning becomes more widespread, it is…

Machine Learning · Computer Science 2026-01-16 Zan Chaudhry, Noam H. Rotenberg, Brian Caffo, Craig K. Jones, Haris I. Sair

Stochastic Learning Approach to Binary Optimization for Optimal Design of Experiments

We present a novel stochastic approach to binary optimization for optimal experimental design (OED) for Bayesian inverse problems governed by mathematical models such as partial differential equations. The OED utility function, namely, the…

Optimization and Control · Mathematics 2022-06-28 Ahmed Attia, Sven Leyffer, Todd Munson

A PAC-Bayesian Perspective on the Interpolating Information Criterion

Deep learning is renowned for its theory-practice gap, whereby principled theory typically fails to provide much beneficial guidance for implementation in practice. This has been highlighted recently by the benign overfitting phenomenon:…

Machine Learning · Statistics 2023-11-14 Liam Hodgkinson, Chris van der Heide, Robert Salomone, Fred Roosta, Michael W. Mahoney

Unsupervised Intrinsic Image Decomposition with LiDAR Intensity Enhanced Training

Unsupervised intrinsic image decomposition (IID) is the process of separating a natural image into albedo and shade without these ground truths. A recent model employing light detection and ranging (LiDAR) intensity demonstrated impressive…

Computer Vision and Pattern Recognition · Computer Science 2024-03-22 Shogo Sato, Takuhiro Kaneko, Kazuhiko Murasaki, Taiga Yoshida, Ryuichi Tanida, Akisato Kimura

Information Leakage Detection through Approximate Bayes-optimal Prediction

In today's data-driven world, the proliferation of publicly available information raises security concerns due to the information leakage (IL) problem. IL involves unintentionally exposing sensitive information to unauthorized parties via…

Machine Learning · Statistics 2025-06-02 Pritha Gupta, Marcel Wever, Eyke Hüllermeier

Limited-Information Maximum Likelihood based Model Selection Procedures for Binary Outcomes

Unmeasured covariates constitute one of the important problems in causal inference. Even if there are some unmeasured covariates, some instrumental variable methods such as a two-stage residual inclusion (2SRI) estimator, or a…

Methodology · Statistics 2021-12-30 Shunichiro Orihara

LUCID: Exposing Algorithmic Bias through Inverse Design

AI systems can create, propagate, support, and automate bias in decision-making processes. To mitigate biased decisions, we both need to understand the origin of the bias and define what it means for an algorithm to make fair decisions.…

Machine Learning · Computer Science 2022-08-29 Carmen Mazijn, Carina Prunkl, Andres Algaba, Jan Danckaert, Vincent Ginis