Related papers: Building effective models from sparse but precise …
The past century has seen a steady increase in the need to estimate and predict complex systems and to make (possibly critical) decisions with limited information. Although computers have made possible the numerical evaluation of…
For nearly any challenging scientific problem, evaluation of the likelihood is problematic if not impossible. Approximate Bayesian computation (ABC) allows us to employ the whole Bayesian formalism to problems where we can use simulations…
Models with dimension more than the available sample size are now commonly used in various applications. A sensible inference is possible using a lower-dimensional structure. In regression problems with a large number of predictors, the…
We go through the many considerations involved in fitting a model to data, using as an example the fit of a straight line to a set of points in a two-dimensional plane. Standard weighted least-squares fitting is only appropriate when there…
Scientific fields such as insider-threat detection and highway-safety planning often lack sufficient amounts of time-series data to estimate statistical models for the purpose of scientific discovery. Moreover, the available limited data…
The fact that we can build models from data, and therefore refine our models with more data from experiments, is usually taken for granted in scientific inquiry. However, how much information can we extract, and how precise can we expect…
The optimal selection of experimental conditions is essential to maximizing the value of data for inference and prediction, particularly in situations where experiments are time-consuming and expensive to conduct. We propose a general…
High-throughput data analyses are becoming common in biology, communications, economics and sociology. The vast amounts of data are usually represented in the form of matrices and can be considered as knowledge networks. Spectra-based…
Bayesian hierarchical models have been demonstrated to provide efficient algorithms for finding sparse solutions to ill-posed inverse problems. The models comprise typically a conditionally Gaussian prior model for the unknown, augmented by…
Mathematical models connect theory with the real world through data, enabling us to interpret, understand, and predict complex phenomena. However, scientific knowledge often extends beyond what can be empirically measured, offering valuable…
Computer models are used to model complex processes in various disciplines. Often, a key source of uncertainty in the behavior of complex computer models is uncertainty due to unknown model input parameters. Statistical computer model…
Modern technologies are generating ever-increasing amounts of data. Making use of these data requires methods that are both statistically sound and computationally efficient. Typically, the statistical and computational aspects are treated…
With continued advances in Geographic Information Systems and related computational technologies, statisticians are often required to analyze very large spatial datasets. This has generated substantial interest over the last decade, already…
Classically, Bayesian clustering interprets each component of a mixture model as a cluster. The inferred clustering posterior is highly sensitive to any inaccuracies in the kernel within each component. As this kernel is made more flexible,…
The Bayesian approach to data analysis provides a powerful way to handle uncertainty in all observations, model parameters, and model structure using probability theory. Probabilistic programming languages make it easier to specify and fit…
Model selection aims to determine which theoretical models are most plausible given some data, without necessarily asking about the preferred values of the model parameters. A common model selection question is to ask when new data require…
In the context of computer models, calibration is the process of estimating unknown simulator parameters from observational data. Calibration is variously referred to as model fitting, parameter estimation/inference, an inverse problem, and…
The demand for extracting rules from high dimensional real world data is increasing in various fields. However, the possible redundancy of such data sometimes makes it difficult to obtain a good generalization ability for novel samples. To…
Almost all fields of science rely upon statistical inference to estimate unknown parameters in theoretical and computational models. While the performance of modern computer hardware continues to grow, the computational requirements for the…
We consider Bayesian model selection in generalized linear models that are high-dimensional, with the number of covariates p being large relative to the sample size n, but sparse in that the number of active covariates is small compared to…