Related papers: Generalized Probability Smoothing
Probability estimation is essential for every statistical data compression algorithm. In practice, probability estimation should be adaptive: recent observations should receive a higher weight than older ones. We present a…
We study universal compression of sequences generated by monotonic distributions. We show that for a monotonic distribution over an alphabet of size $k$, each probability parameter costs essentially $0.5 \log (n/k^3)$ bits, where $n$ is the…
The Poisson-sampling technique eliminates dependencies among symbol appearances in a random sequence. It has been used to simplify the analysis and strengthen the performance guarantees of randomized algorithms. Applying this method to…
Exchangeable random partition processes are the basis for Bayesian approaches to statistical inference in large alphabet settings. On the other hand, the notion of the pattern of a sequence provides an information-theoretic framework for…
The growing prevalence of nonsmooth optimization problems in machine learning has spurred significant interest in generalized smoothness assumptions. Among these, the (L0, L1)-smoothness assumption has emerged as one of the most prominent.…
Probabilistic smoothing is a standard tool for global optimization, but existing methods rely on Gaussian kernels and specific transforms, often resulting in strong hyperparameter sensitivity and limited robustness. We propose a general…
In this paper, we investigate the redundancy of universal coding schemes on smooth parametric sources in the finite-length regime. We derive an upper bound on the probability of the event that a sequence of length $n$, chosen using…
We study the problem of efficient compression of a stochastic source of probability distributions. It can be viewed as a generalization of Shannon's source coding problem. It is related to the theory of common randomness, as well as to…
The smoothing operation that produces a continuous density field from the observed point-like distribution of galaxies is crucially important for topological or morphological analysis of the large-scale structure, such as the genus statistics or the area…
The minimum average number of bits needed to describe a random variable is its entropy, assuming knowledge of the underlying statistics. On the other hand, universal compression supposes that the distribution of the random variable, while…
Probability estimation is an elementary building block of every statistical data compression algorithm. In practice, probability estimation is often based on relative letter frequencies, which get scaled down when their sum is too large.…
We present a general probabilistic perspective on Gaussian filtering and smoothing. This allows us to show that common approaches to Gaussian filtering/smoothing can be distinguished solely by their methods of computing/approximating the…
A general piecewise (including pointwise) probability distribution with space-saving notation and its hierarchical particular cases are considered. The explicit closed-form normalization, expectation, and variance formulas along with the…
Establishing the convergence of splines can be cast as a variational problem which is amenable to a $\Gamma$-convergence approach. We consider the case in which the regularization coefficient scales with the number of observations, $n$, as…
Describing statistical dependencies is foundational to empirical scientific research. For uncovering intricate and possibly non-linear dependencies between a single target variable and several source variables within a system, a principled…
We initiate the study of smoothed analysis for the sequential probability assignment problem with contexts. We study information-theoretically optimal minmax rates as well as a framework for algorithmic reduction involving the maximum…
We propose and study a family of universal sequential probability assignments on individual sequences, based on the incremental parsing procedure of the Lempel-Ziv (LZ78) compression algorithm. We show that the normalized log loss under any…
Sequential probability assignment and universal compression go hand in hand. We propose sequential probability assignment for non-binary (and large alphabet) sequences with empirical distributions whose parameters are known to be bounded…
Segmental structure is a common pattern in many types of sequences such as phrases in human languages. In this paper, we present a probabilistic model for sequences via their segmentations. The probability of a segmented sequence is…
Using a perturbation technique, we derive a new approximate filtering and smoothing methodology generalizing along different directions several existing approaches to robust filtering based on the score and the Hessian matrix of the…