Related papers: Classification methods for Hilbert data based on s…
The method of constrained randomisation is applied to three-dimensional simulated galaxy distributions. With this technique we generate for a given data set surrogate data sets which have the same linear properties as the original data…
When evaluating the effectiveness of a treatment, policy, or intervention, the desired measure of effectiveness may be expensive to collect, not routinely available, or may take a long time to occur. In these cases, it is sometimes possible…
Asymptotic factorizations for the small-ball probability (SmBP) of a Hilbert valued random element $X$ are rigorously established and discussed. In particular, given the first $d$ principal components (PCs) and as the radius $\varepsilon$…
This paper introduces a practical sampling method for training surrogate models in the context of uncertainty propagation. We propose a heuristic method to uniformly draw samples within highest density regions of the density given by the…
Machine learning classification techniques have been used widely to recognize the feasible design domain and discover hidden patterns in engineering design. An accurate classification model needs a large dataset; however, generating a large…
In this work we are interested in the problems of supervised learning and variable selection when the input-output dependence is described by a nonlinear function depending on a few variables. Our goal is to consider a sparse nonparametric…
This chapter deals with kernel methods as a special class of techniques for surrogate modeling. Kernel methods have proven to be efficient in machine learning, pattern recognition and signal analysis due to their flexibility, excellent…
In supervised learning with distributional inputs in the two-stage sampling setup, relevant to applications like learning-based medical screening or causal learning, the inputs (which are probability distributions) are not accessible in the…
The method of surrogate data provides a framework for testing observed data against a hierarchy of alternative hypotheses. The aim of applying this method is to exclude the possibility that the data are consistent with simple linear…
The data-centric construction of inexpensive surrogates for fine-grained, physical models has been at the forefront of computational physics due to its significant utility in many-query tasks such as uncertainty quantification. Recent…
We consider a class of stochastic programming problems where the implicitly decision-dependent random variable follows a nonparametric regression model with heteroscedastic error. The Clarke subdifferential and surrogate functions are not…
Diffusion kernels over graphs have been widely utilized as effective tools in various applications due to their ability to accurately model the flow of information through nodes and edges. However, there is a notable gap in the literature…
This paper considers the problem of kernel regression and classification with possibly unobservable response variables in the data, where the mechanism that causes the absence of information is unknown and can depend on both predictors and…
The increasing use of stochastic models for describing complex phenomena warrants surrogate models that capture the reference model characteristics at a fraction of the computational cost, foregoing potentially expensive Monte Carlo…
In multicenter research, individual-level data are often protected against sharing across sites. To overcome the barrier of data sharing, many distributed algorithms, which only require sharing aggregated information, have been developed.…
Penalized empirical risk minimization with a surrogate loss function is often used to learn a high-dimensional linear decision rule in classification problems. Although much of the literature focuses on the generalization error, there is a…
Efficient surrogate modelling is a key requirement for uncertainty quantification in data-driven scenarios. In this work, a novel approach of using Sparse Random Features for surrogate modelling in combination with self-supervised…
Density estimation plays a fundamental role in many areas of statistics and machine learning. Parametric, nonparametric and semiparametric density estimation methods have been proposed in the literature. Semiparametric density models are…
Conventional techniques for supervised classification constrain the classification rules considered and use surrogate losses for the classification 0-1 loss. Favored families of classification rules are those that enjoy parametric…
We propose a multi-fidelity neural network surrogate sampling method for the uncertainty quantification of physical/biological systems described by ordinary or partial differential equations. We first generate a set of low/high-fidelity…