Related papers: Scalable Multivariate Histograms
The proliferation of science and technology has led to the prevalence of voluminous data sets that are distributed across multiple machines. It is an established fact that conventional statistical methodologies may be unfeasible in the…
Designing scalable estimation algorithms is a core challenge in modern statistics. Here we introduce a framework to address this challenge based on parallel approximants, which yields estimators with provable properties that operate on the…
The distributed Hill estimator is a divide-and-conquer algorithm for estimating the extreme value index when data are stored on multiple machines. In applications, estimates based on the distributed Hill estimator can be sensitive to the…
The multivariate version of the Mixed Tempered Stable is proposed. It is a generalization of the Normal Variance Mean Mixtures. Characteristics of this new distribution and its capacity in fitting tails and capturing dependence structure…
Structured additive distributional regression models offer a versatile framework for estimating complete conditional distributions by relating all parameters of a parametric distribution to covariates. Although these models efficiently…
Distributed statistical inference has recently attracted enormous attention. Much existing work focuses on the averaging estimator. We propose a one-step approach to enhance a simple-averaging based distributed estimator. We derive the…
A new approach of obtaining stratified random samples from statistically dependent random variables is described. The proposed method can be used to obtain samples from the input space of a computer forward model in estimating expectations…
A multivariate distribution can be described by a triangular transport map from the target distribution to a simple reference distribution. We propose Bayesian nonparametric inference on the transport map by modeling its components using…
In this paper, we propose a stratified sampling algorithm in which the random drawings made in the strata to compute the expectation of interest are also used to adaptively modify the proportion of further drawings in each stratum. These…
Inference and prediction of routes have become of interest over the past decade owing to a dramatic increase in package delivery and ride-sharing services. Given the underlying combinatorial structure and the incorporation of probabilities,…
Sampling is often a necessary evil to reduce the processing and storage costs of distributed tracing. In this work, we describe a scalable and adaptive sampling approach that can preserve events of interest better than the widely used…
The multivariate extended skew-normal distribution allows for accommodating raw data which are skewed and heavy tailed, and has at least three appealing statistical properties, namely closure under conditioning, affine transformations, and…
Although there is ample work in the literature dealing with skewness in the multivariate setting, there is a relative paucity of work in the matrix variate paradigm. Such work is, for example, useful for modelling three-way data. A matrix…
This paper studies a distributed state estimation problem for both continuous- and discrete-time linear systems. A simply structured distributed estimator (comprising interconnected local estimators) is first described for estimating the…
We propose a new weighted average estimator for the high dimensional parameters under the distributed learning system, in which the weight assigned to each coordinate is precisely proportional to the inverse of the variance of the local…
We exploit the link between the transport equation and derivatives of expectations to construct efficient pathwise gradient estimators for multivariate distributions. We focus on two main threads. First, we use null solutions of the…
We introduce an adaptive scattered data fitting scheme as an extension of local least squares approximations to hierarchical spline spaces. To efficiently deal with non-trivial data configurations, the local solutions are described in terms of…
We develop sampling algorithms to fit Bayesian hierarchical models, the computational complexity of which scales linearly with the number of observations and the number of parameters in the model. We focus on crossed random effect and…
The increased availability of massive data sets provides a unique opportunity to discover subtle patterns in their distributions, but also imposes overwhelming computational challenges. To fully utilize the information contained in big…
We construct an adaptive independent Metropolis-Hastings sampler that uses a mixture of normals as a proposal distribution. To take full advantage of the potential of adaptive sampling, our algorithm updates the mixture of normals…