Related papers: Data Amplification: Instance-Optimal Property Esti…
Estimating properties of discrete distributions is a fundamental problem in statistical learning. We design the first unified, linear-time, competitive, property estimator that for a wide class of properties and for all underlying…
The statistical analysis of data stemming from dynamical systems, including, but not limited to, time series, routinely relies on the estimation of information theoretical quantities, most notably Shannon entropy. To this purpose, possibly…
The advent of data science has spurred interest in estimating properties of distributions over large alphabets. Fundamental symmetric properties such as support size, support coverage, entropy, and proximity to uniformity, received most…
Estimating symmetric properties of a distribution, e.g. support size, coverage, entropy, distance to uniformity, is among the most fundamental problems in algorithmic statistics. While each of these properties has been studied extensively…
Shannon's entropy is one of the building blocks of information theory and an essential aspect of Machine Learning methods (e.g., Random Forests). Yet, it is only finitely defined for distributions with fast-decaying tails on a countable…
We study three fundamental statistical-learning problems: distribution estimation, property estimation, and property testing. We establish the profile maximum likelihood (PML) estimator as the first unified sample-optimal approach to a wide…
In this study, an attempt has been made to propose a way to develop a new distribution. For this purpose, we need only an idea of the distribution function. Some important statistical properties of the new distribution, like moments, cumulants,…
Shannon entropy is often a quantity of interest to linguists studying the communicative capacity of human language. However, entropy must typically be estimated from observed data because researchers do not have access to the underlying…
Recent years have witnessed the success of adaptive (or unified) approaches in estimating symmetric properties of discrete distributions, where one first obtains a distribution estimator independent of the target property, and then plugs…
We consider the fundamental learning problem of estimating properties of distributions over large domains. Using a novel piecewise-polynomial approximation technique, we derive the first unified methodology for constructing sample- and…
In this article, we construct semiparametrically efficient estimators of linear functionals of a probability measure in the presence of side information using an easy empirical likelihood approach. We use estimated constraint functions and…
The principle of maximum entropy is a broadly applicable technique for computing a distribution with the least amount of information possible constrained to match empirical data, for instance, feature expectations. We seek to generalize…
We provide an efficient unified plug-in approach for estimating symmetric properties of distributions given $n$ independent samples. Our estimator is based on profile-maximum-likelihood (PML) and is sample optimal for estimating various…
Modern statistical estimation is often performed in a distributed setting where each sample belongs to a single user who shares their data with a central server. Users are typically concerned with preserving the privacy of their samples,…
In this paper we provide a new efficient algorithm for approximately computing the profile maximum likelihood (PML) distribution, a prominent quantity in symmetric property estimation. We provide an algorithm which matches the previous best…
This paper proposes a new method of bandwidth selection in kernel estimation of density and distribution functions motivated by the connection between maximisation of the entropy of probability integral transforms and maximum likelihood in…
Quantile estimation in deconvolution problems is studied comprehensively. In particular, the more realistic setup of unknown error distributions is covered. Our plug-in method is based on a deconvolution density estimator and is minimax…
We revisit the well-studied problem of estimating the Shannon entropy of a probability distribution, now given access to a probability-revealing conditional sampling oracle. In this model, the oracle takes as input the representation of a…
In this paper, we introduce a new distribution generated by a Lindley random variable, which offers a more flexible model for lifetime data. Various statistical properties, like the distribution function, survival function, moments,…
Random sampling is an essential tool in the processing and transmission of data. It is used to summarize data too large to store or manipulate and meet resource constraints on bandwidth or battery power. Estimators that are applied to the…