Related papers: Statistics of extremes by oracle estimation
We study the maximum likelihood estimator of the density of $n$ independent observations, under the assumption that it is well approximated by a mixture with a large number of components. The main focus is on statistical properties with respect…
We study the problem of model selection type aggregation with respect to the Kullback-Leibler divergence for various probabilistic models. Rather than considering a convex combination of the initial estimators $f_1, \ldots, f_N$, our…
We fit the exponent of the Pareto distribution, which is equivalent to, or can approximate, the continuous power law distribution given a cutoff point, using linear regression (LR). We use LR on the logged variables of the empirical tail (one…
In a regression setup with deterministic design, we study the pure aggregation problem and introduce a natural extension from the Gaussian distribution to distributions in the exponential family. While this extension bears strong…
Testing whether two multivariate samples exhibit the same extremal behavior is an important problem in various fields including environmental and climate sciences. While several ad-hoc approaches exist in the literature, they often lack…
The characteristic function of the folded normal distribution and its moment function are derived. The entropy of the folded normal distribution and the Kullback--Leibler divergence from the normal and half normal distributions are approximated using…
In this technical report, we consider conditional density estimation with a maximum likelihood approach. Under weak assumptions, we obtain a theoretical bound for a Kullback-Leibler type loss for a single model maximum likelihood estimate.…
Our primary aim is to find an estimate of the expected shortfall in various situations: (1) Nonparametric situation, when the probability distribution of the incurred loss is unknown, only satisfying some general conditions. Then, following…
We consider a finite mixture of Gaussian regression model for high-dimensional data, where the number of covariates may be much larger than the sample size. We propose to estimate the unknown conditional mixture density by a maximum…
We develop two surprising new results regarding the use of proper scoring rules for evaluating the predictive quality of two alternative sequential forecast distributions. Both of the proponents prefer to be awarded a score derived from the…
Extreme Value Theory plays an important role in providing approximation results for the extremes of a sequence of independent random variables when their distribution is unknown. An important one is given by the generalised Pareto…
The upper tail of a claim size distribution of a property line of business is frequently modelled by a Pareto distribution. However, the upper tail need not be Pareto distributed; extraordinary shapes are possible. Here, the…
Aggregation methods have emerged as a powerful and flexible framework in statistical learning, providing unified solutions across diverse problems such as regression, classification, and density estimation. In the context of generalized…
We propose a general algorithm for approximating nonstandard Bayesian posterior distributions. The algorithm minimizes the Kullback-Leibler divergence of an approximating distribution to the intractable posterior distribution. Our method…
We introduce a new characterization of the Pareto distribution and construct integral and supremum type goodness-of-fit tests based on it. The limiting distributions and large deviations of the new statistics are described and their local Bahadur…
We study the problem of estimating a distribution over a finite alphabet from an i.i.d. sample, with accuracy measured in relative entropy (Kullback-Leibler divergence). While optimal bounds on the expected risk are known, high-probability…
This paper considers penalized empirical loss minimization of convex loss functions with unknown non-linear target functions. Using the elastic net penalty we establish a finite sample oracle inequality which bounds the loss of our estimator…
Discrete normal distributions are defined as the distributions with prescribed means and covariance matrices which maximize entropy on the integer lattice support. The set of discrete normal distributions forms an exponential family with…
We consider a multivariate finite mixture of Gaussian regression models for high-dimensional data, where the number of covariates and the size of the response may be much larger than the sample size. We provide an $\ell_1$-oracle inequality…
We present novel bounds for estimating discrete probability distributions under the $\ell_\infty$ norm. These are nearly optimal in various precise senses, including a kind of instance-optimality. Our data-dependent convergence guarantees…