Related papers: Equations of States in Singular Statistical Estima…

Biased Generalization in Diffusion Models

Generalization in generative modeling is defined as the ability to learn an underlying distribution from a finite dataset and produce novel samples, with evaluation largely driven by held-out performance and perceived sample quality. In…

Machine Learning · Computer Science 2026-03-05 Jerome Garnier-Brun, Luca Biggio, Davide Beltrame, Marc Mézard, Luca Saglietti

Asymptotic Model Selection for Naive Bayesian Networks

We develop a closed form asymptotic formula to compute the marginal likelihood of data given a naive Bayesian network model with two hidden states and binary features. This formula deviates from the standard BIC score. Our work provides a…

Artificial Intelligence · Computer Science 2013-01-07 Dmitry Rusakov, Dan Geiger

Statistical Models for the Analysis of Optimization Algorithms with Benchmark Functions

Frequentist statistical methods, such as hypothesis testing, are standard practice in papers that provide benchmark comparisons. Unfortunately, these methods have often been misused, e.g., without testing for their statistical test…

Methodology · Statistics 2021-05-18 David Issa Mattos, Jan Bosch, Helena Holmström Olsson

Learning From Simulators: A Theory of Simulation-Grounded Learning

Simulation-Grounded Neural Networks (SGNNs) are predictive models trained entirely on synthetic data from mechanistic simulations. They have achieved state-of-the-art performance in domains where real-world labels are limited or unobserved,…

Machine Learning · Computer Science 2025-10-03 Carson Dudley, Marisa Eisenberg

Gaussian Variational State Estimation for Nonlinear State-Space Models

In this paper, the problem of state estimation, in the context of both filtering and smoothing, for nonlinear state-space models is considered. Due to the nonlinear nature of the models, the state estimation problem is generally intractable…

Machine Learning · Statistics 2021-11-24 Jarrad Courts, Adrian Wills, Thomas B. Schön

Non-parametric Bayesian inference via loss functions under model misspecification

In the usual Bayesian setting, a full probabilistic model is required to link the data and parameters, and the form of this model and the inference and prediction mechanisms are specified via de Finetti's representation. In general, such a…

Methodology · Statistics 2026-01-21 Yu Luo, David A. Stephens, Daniel J. Graham, Emma J. McCoy

An Analysis of Model Robustness across Concurrent Distribution Shifts

Machine learning models, meticulously optimized for source data, often fail to predict target data when faced with distribution shifts (DSs). Previous benchmarking studies, though extensive, have mainly focused on simple DSs. Recognizing…

Machine Learning · Computer Science 2025-01-09 Myeongho Jeon, Suhwan Choi, Hyoje Lee, Teresa Yeo

Bounding the generalization error of convex combinations of classifiers: balancing the dimensionality and the margins

A problem of bounding the generalization error of a classifier f in H, where H is a "base" class of functions (classifiers), is considered. This problem frequently occurs in computer learning, where efficient algorithms of combining simple…

Probability · Mathematics 2007-06-13 Vladimir Koltchinskii, Dmitry Panchenko, Fernando Lozano

On Last-Layer Algorithms for Classification: Decoupling Representation from Uncertainty Estimation

Uncertainty quantification for deep learning is a challenging open problem. Bayesian statistics offer a mathematically grounded framework to reason about uncertainties; however, approximate posteriors for modern neural networks still…

Machine Learning · Statistics 2020-01-23 Nicolas Brosse, Carlos Riquelme, Alice Martin, Sylvain Gelly, Éric Moulines

Asymptotic Equivalence of Bayes Cross Validation and Widely Applicable Information Criterion in Singular Learning Theory

In regular statistical models, the leave-one-out cross-validation is asymptotically equivalent to the Akaike information criterion. However, since many learning machines are singular statistical models, the asymptotic behavior of the…

Machine Learning · Computer Science 2010-10-15 Sumio Watanabe

About the posterior distribution in hidden Markov models with unknown number of states

We consider finite state space stationary hidden Markov models (HMMs) in the situation where the number of hidden states is unknown. We provide a frequentist asymptotic evaluation of Bayesian analysis methods. Our main result gives…

Statistics Theory · Mathematics 2014-10-27 Elisabeth Gassiat, Judith Rousseau

Linear Regression with Distributed Learning: A Generalization Error Perspective

Distributed learning provides an attractive framework for scaling the learning task by sharing the computational load over multiple nodes in a network. Here, we investigate the performance of distributed learning for large-scale linear…

Machine Learning · Statistics 2021-11-03 Martin Hellkvist, Ayça Özçelikkale, Anders Ahlén

Generalization bounds for mixing processes via delayed online-to-PAC conversions

We study the generalization error of statistical learning algorithms in a non-i.i.d. setting, where the training data is sampled from a stationary mixing process. We develop an analytic framework for this scenario based on a reduction to…

Machine Learning · Computer Science 2025-02-20 Baptiste Abeles, Eugenio Clerico, Gergely Neu

Learning with Statistical Equality Constraints

As machine learning applications grow increasingly ubiquitous and complex, they face an increasing set of requirements beyond accuracy. The prevalent approach to handle this challenge is to aggregate a weighted combination of requirement…

Machine Learning · Computer Science 2026-01-07 Aneesh Barthakur, Luiz F. O. Chamon

A Bayesian Approach for Accurate Classification-Based Aggregates

In this paper, we study the accuracy of values aggregated over classes predicted by a classification algorithm. The problem is that the resulting aggregates (e.g., sums of a variable) are known to be biased. The bias can be large even for…

Machine Learning · Statistics 2019-12-02 Q. A. Meertens, C. G. H. Diks, H. J. van den Herik, F. W. Takes

On the Generalization Error of Differentially Private Algorithms via Typicality

We study the generalization error of stochastic learning algorithms from an information-theoretic perspective, with a particular emphasis on deriving sharper bounds for differentially private algorithms. It is well known that the…

Information Theory · Computer Science 2026-04-20 Yanxiao Liu, Chun Hei Michael Shiu, Lele Wang, Deniz Gündüz

The Illusion of Learning from Observational Data: An Empirical Bayes Perspective

Randomized experiments have long been the gold standard for scientists seeking to learn about cause and effect. When randomized experiments are infeasible, scientists often resort to observational studies, which are widely available and…

Methodology · Statistics 2026-04-13 Bohan Wu, Sebastian Salazar, Donald P. Green, David M. Blei

Quantifying uncertainty in the numerical integration of evolution equations based on Bayesian isotonic regression

This paper presents a new Bayesian framework for quantifying discretization errors in numerical solutions of ordinary differential equations. By modelling the errors as random variables, we impose a monotonicity constraint on the variances,…

Numerical Analysis · Mathematics 2024-11-14 Yuto Miyatake, Kaoru Irie, Takeru Matsuda

Variational Bayes algorithm and posterior consistency of Ising model parameter estimation

Ising models originated in statistical physics and are widely used in modeling spatial data and computer vision problems. However, statistical inference of this model remains challenging due to the intractable nature of the normalizing constant…

Methodology · Statistics 2021-09-06 Minwoo Kim, Shrijita Bhattacharya, Tapabrata Maiti

Neural State-Space Models: Empirical Evaluation of Uncertainty Quantification

Effective quantification of uncertainty is an essential and still missing step towards a greater adoption of deep-learning approaches in different applications, including mission-critical ones. In particular, investigations on the…

Machine Learning · Computer Science 2023-04-14 Marco Forgione, Dario Piga