Related papers: On convergence rates equivalency and sampling stra…
We extend deconvolution in a periodic setting to deal with functional data. The resulting functional deconvolution model can be viewed as a generalization of a multitude of inverse problems in mathematical physics where one needs to recover…
The subject of this paper is the problem of nonparametric estimation of a continuous distribution function from observations with measurement errors. We study the minimax complexity of this problem when the unknown distribution has a density…
We study dropout regularization in continuous-time models through the lens of random-batch methods -- a family of stochastic sampling schemes originally devised to reduce the computational cost of interacting particle systems. We construct…
While efficient distribution learning is no doubt behind the groundbreaking success of diffusion modeling, its theoretical guarantees are quite limited. In this paper, we provide the first rigorous analysis on approximation and…
When applying a stochastic algorithm, one must choose an order to draw samples. The practical choices are without-replacement sampling orders, which are empirically faster and more cache-friendly than uniform-iid-sampling but often have…
Diffusion models have achieved remarkable success in data generation tasks across various domains. However, the iterative sampling process is computationally expensive. Consistency models have been proposed to learn consistency functions to map from…
Diffusion models have achieved huge empirical success in data generation tasks. Recently, some efforts have been made to adapt the framework of diffusion models to discrete state space, providing a more natural approach for modeling…
Limit distributions for the greatest convex minorant and its derivative are considered for a general class of stochastic processes including partial sum processes and empirical processes, for independent, weakly dependent and long range…
We develop and analyze $M$-estimation methods for divergence functionals and the likelihood ratios of two probability distributions. Our method is based on a non-asymptotic variational characterization of $f$-divergences, which allows the…
We study the problem of training neural stochastic differential equations, or diffusion models, to sample from a Boltzmann distribution without access to target samples. Existing methods for training such models enforce time-reversal of the…
We consider the problem of estimating the unknown response function in the multichannel deconvolution model with long-range dependent Gaussian errors. We do not limit our consideration to a specific type of long-range dependence; rather, we…
This paper studies a class of exponential family models whose canonical parameters are specified as linear functionals of an unknown infinite-dimensional slope function. The optimal minimax rates of convergence for slope function estimation…
Data scarcity drives the need for more sample-efficient large language models. In this work, we use the double descent phenomenon to holistically compare the sample efficiency of discrete diffusion and autoregressive models. We show that…
Sparse learning is a very important tool for mining useful information and patterns from high-dimensional data. Non-convex, non-smooth regularized learning problems play essential roles in sparse learning and have drawn extensive attention…
We propose simple active sampling and reweighting strategies for optimizing min-max fairness that can be applied to any classification or regression model learned via loss minimization. The key intuition behind our approach is to use at…
Distributed minimax estimation and distributed adaptive estimation under communication constraints for Gaussian sequence model and white noise model are studied. The minimax rate of convergence for distributed estimation over a given Besov…
We analyze the coordinate descent method with a new coordinate selection strategy, called volume sampling. This strategy prescribes selecting subsets of variables of certain size proportionally to the determinants of principal submatrices…
In nonparametric statistics, an optimality criterion for estimation procedures is provided by the minimax rate of convergence. However, this classical point of view is subject to controversy, as it requires looking for the worst behaviour…
We propose new continuous-time formulations for first-order stochastic optimization algorithms such as mini-batch gradient descent and variance-reduced methods. We exploit these continuous-time models, together with simple Lyapunov analysis…
This paper examines a stochastic deconvolution problem on compact symmetric spaces which is referred to as decompounding. This involves estimating the step distributions of a random walk, where in addition the number of steps between…