Related papers: Nonparametric logistic regression with deep learni…

A Nonparametric Maximum Likelihood Approach to Mixture of Regression

We study mixture of linear regression (random coefficient) models, which capture population heterogeneity by allowing the regression coefficients to follow an unknown distribution $G^*$. In contrast to common parametric methods that fix the…

Methodology · Statistics 2025-07-01 Hansheng Jiang, Adityanand Guntuboyina

In-Context Learning as Nonparametric Conditional Probability Estimation: Risk Bounds and Optimality

This paper investigates the expected excess risk of in-context learning (ICL) for multiclass classification. We formalize each task as a sequence of labeled examples followed by a query input; a pretrained model then estimates the query's…

Machine Learning · Statistics 2025-09-03 Chenrui Liu, Falong Tan, Chuanlong Xie, Yicheng Zeng, Lixing Zhu

Gaussian mixtures and non-parametric likelihoods through the lens of statistical mechanics

In this work, we investigate Gaussian Mixture Models (abbrv. GMM) and the related problem of non parametric maximum likelihood estimation (abbrv. NPMLE) from the perspective of statistical mechanics. In particular, we establish…

Statistics Theory · Mathematics 2026-03-25 Subhroshekhar Ghosh, Adityanand Guntuboyina, Satyaki Mukherjee, Hoang-Son Tran

A Neural Network Algorithm for KL Divergence Estimation with Quantitative Error Bounds

Estimating the Kullback-Leibler (KL) divergence between random variables is a fundamental problem in statistical analysis. For continuous random variables, traditional information-theoretic estimators scale poorly with dimension and/or…

Machine Learning · Computer Science 2025-10-08 Mikil Foss, Andrew Lamperski

The Hellinger Bounds on the Kullback-Leibler Divergence and the Bernstein Norm

The Kullback-Leibler divergence, the Kullback-Leibler variation, and the Bernstein "norm" are used to quantify discrepancies among probability distributions in likelihood models such as nonparametric maximum likelihood and nonparametric…

Statistics Theory · Mathematics 2026-01-27 Tetsuya Kaji

Unified Perspective on Probability Divergence via Maximum Likelihood Density Ratio Estimation: Bridging KL-Divergence and Integral Probability Metrics

This paper provides a unified perspective for the Kullback-Leibler (KL)-divergence and the integral probability metrics (IPMs) from the perspective of maximum likelihood density-ratio estimation (DRE). Both the KL-divergence and the IPMs…

Machine Learning · Computer Science 2022-02-01 Masahiro Kato, Masaaki Imaizumi, Kentaro Minami

A modern maximum-likelihood theory for high-dimensional logistic regression

Every student in statistics or data science learns early on that when the sample size largely exceeds the number of variables, fitting a logistic model produces estimates that are approximately unbiased. Every student also learns that there…

Statistics Theory · Mathematics 2022-06-08 Pragya Sur, Emmanuel J. Candes

Parametric convergence rate of some nonparametric estimators in mixtures of power series distributions

We consider the problem of estimating a mixture of power series distributions with infinite support, to which belong very well-known models such as Poisson, Geometric, Logarithmic or Negative Binomial probability mass functions. We consider…

Statistics Theory · Mathematics 2025-08-04 Fadoua Balabdaoui, Harald Besdziek, Yong Wang

Optimal Kullback-Leibler Aggregation in Mixture Density Estimation by Maximum Likelihood

We study the maximum likelihood estimator of density of $n$ independent observations, under the assumption that it is well approximated by a mixture with a large number of components. The main focus is on statistical properties with respect…

Statistics Theory · Mathematics 2017-01-19 Arnak S. Dalalyan, Mehdi Sebbar

Adaptive Symmetrization of the KL Divergence

The forward Kullback-Leibler (KL) divergence is a ubiquitous objective for fitting a parameterized distribution to samples due to its tractability and equivalence to maximum likelihood estimation (MLE). Its inherent asymmetry, however, may…

Machine Learning · Computer Science 2026-05-12 Omri Ben-Dov, Luiz F. O. Chamon

Finite-sample performance of the maximum likelihood estimator in logistic regression

Logistic regression is a classical model for describing the probabilistic dependence of binary responses to multivariate covariates. We consider the predictive performance of the maximum likelihood estimator (MLE) for logistic regression,…

Statistics Theory · Mathematics 2026-02-20 Hugo Chardon, Matthieu Lerasle, Jaouad Mourtada

Estimating the logistic regression equation when the model is incorrect

Protesting mildly against the notion of an exactly correct parametric model, the view is adopted that the logistic regression equation is merely an approximation to the underlying, true function. The behaviour of likelihood based estimators…

Statistics Theory · Mathematics 2026-05-27 Nils Lid Hjort

On the existence of the maximum likelihood estimate and convergence rate under gradient descent for multi-class logistic regression

We revisit the problem of the existence of the maximum likelihood estimate for multi-class logistic regression. We show that one method of ensuring its existence is by assigning positive probability to every class in the sample dataset. The…

Machine Learning · Computer Science 2024-05-09 Dwight Nwaigwe, Marek Rychlik

Distributionally Robust Parametric Maximum Likelihood Estimation

We consider the parameter estimation problem of a probabilistic generative model prescribed using a natural exponential family of distributions. For this problem, the typical maximum likelihood estimator usually overfits under limited…

Machine Learning · Statistics 2020-10-13 Viet Anh Nguyen, Xuhui Zhang, Jose Blanchet, Angelos Georghiou

MLE convergence speed to information projection of exponential family: Criterion for model dimension and sample size -- complete proof version--

For a parametric model of distributions, the closest distribution in the model to the true distribution located outside the model is considered. Measuring the closeness between two distributions with the Kullback-Leibler (K-L) divergence,…

Statistics Theory · Mathematics 2025-10-14 Yo Sheena

Nearly Minimax Discrete Distribution Estimation in Kullback-Leibler Divergence with High Probability

We consider the fundamental problem of estimating a discrete distribution on a domain of size $K$ with high probability in Kullback-Leibler divergence. We provide upper and lower bounds on the minimax estimation rate, which show that the…

Machine Learning · Statistics 2026-02-23 Dirk van der Hoeven, Julia Olkhovskaia, Tim van Erven

Tailoring Language Generation Models under Total Variation Distance

The standard paradigm of neural language generation adopts maximum likelihood estimation (MLE) as the optimizing method. From a distributional view, MLE in fact minimizes the Kullback-Leibler divergence (KLD) between the distribution of the…

Computation and Language · Computer Science 2023-02-28 Haozhe Ji, Pei Ke, Zhipeng Hu, Rongsheng Zhang, Minlie Huang

Nonparametric empirical Bayes and maximum likelihood estimation for high-dimensional data analysis

Nonparametric empirical Bayes methods provide a flexible and attractive approach to high-dimensional data analysis. One particularly elegant empirical Bayes methodology, involving the Kiefer-Wolfowitz nonparametric maximum likelihood…

Methodology · Statistics 2014-07-11 Lee H. Dicker, Sihai D. Zhao

Kullback Proximal Algorithms for Maximum Likelihood Estimation

Accelerated algorithms for maximum likelihood image reconstruction are essential for emerging applications such as 3D tomography, dynamic tomographic imaging, and other high dimensional inverse problems. In this paper, we introduce and…

Computation · Statistics 2012-01-31 Stéphane Chrétien, Alfred O. Hero

Maximum likelihood estimators uniformly minimize distribution variance among distribution unbiased estimators in exponential families

We employ a parameter-free distribution estimation framework where estimators are random distributions and utilize the Kullback-Leibler (KL) divergence as a loss function. Wu and Vos [J. Statist. Plann. Inference 142 (2012) 1525-1536] show…

Statistics Theory · Mathematics 2015-09-21 Paul Vos, Qiang Wu