Related papers: Bayesian data selection
We consider the use of Bayesian information criteria for selection of the graph underlying an Ising model. In an Ising model, the full conditional distributions of each variable form logistic regression models, and variable selection…
Fairness concerns are increasingly critical as machine learning models are deployed in high-stakes applications. While existing fairness-aware methods typically intervene at the model level, they often suffer from high computational costs,…
In the problem of selecting variables in a multivariate linear regression model, we derive new Bayesian information criteria based on a prior mixing a smooth distribution and a delta distribution. Each of them can be interpreted as a fusion…
A critical step in data analysis for many different types of experiments is the identification of features with theoretically defined shapes in N-dimensional datasets; examples of this process include finding peaks in multi-dimensional…
Variable selection and classification are common objectives in the analysis of high-dimensional data. Most such methods make distributional assumptions that may not be compatible with the diverse families of distributions data can take. A…
Bayesian inference problems require sampling or approximating high-dimensional probability distributions. The focus of this paper is on the recently introduced Stein variational gradient descent methodology, a class of algorithms that rely…
A widely used method to create a continuous representation of a discrete data set is regression analysis. When the regression model is not based on a mathematical description of the physics underlying the data, heuristic techniques play a…
There are many issues that can cause problems when attempting to infer model parameters from data. Data and models are both imperfect, and as such there are multiple scenarios in which standard methods of inference will lead to misleading…
Model selection is an indispensable part of data analysis, most often serving fitting and prediction purposes. In this paper, we tackle the problem of model selection in a general linear regression where the parameter matrix…
Bayesian variable selection is a powerful tool for data analysis, as it offers a principled method for variable selection that accounts for prior information and uncertainty. However, wider adoption of Bayesian variable selection has been…
High-throughput scientific studies involving no clear a priori hypothesis are common. For example, a large-scale genomic study of a disease may examine thousands of genes without hypothesizing that any specific gene is responsible for the…
A central problem in analyzing networks is partitioning them into modules or communities. One of the best tools for this is the stochastic block model, which clusters vertices into blocks with statistically homogeneous patterns of links.…
We consider the problem of reducing the dimensions of parameters and data in non-Gaussian Bayesian inference problems. Our goal is to identify an "informed" subspace of the parameters and an "informative" subspace of the data so that a…
Kernel estimation techniques, such as mean shift, suffer from one major drawback: the kernel bandwidth selection. The bandwidth can be fixed for the whole data set or can vary at each point. Automatic bandwidth selection becomes a real…
Bayesian model selection is a tool to decide whether the introduction of a new parameter is warranted by data. I argue that the usual sampling statistic significance tests for a null hypothesis can be misleading, since they do not take into…
Model selection aims to determine which theoretical models are most plausible given some data, without necessarily asking about the preferred values of the model parameters. A common model selection question is to ask when new data require…
The choice of tuning parameters in Bayesian variable selection is a critical problem in modern statistics. In particular, for Bayesian linear regression with non-local priors, the scale parameter in the non-local prior density is an…
A general Bayesian framework for model selection on random network models regarding their features is considered. The goal is to develop a principled Bayesian model selection approach to compare different fittable, not necessarily nested,…
For many important problems the quantity of interest is an unknown function of the parameters, which is a random vector with known statistics. Since the dependence of the output on this random vector is unknown, the challenge is to identify…
When the data do not conform to the hypothesis of a known sampling variance, the fitting of a constant to a set of measured values is a long-debated problem. Given the data, fitting would require finding which measurand value is the most…