Related papers: A Bayesian nonparametric test for conditional inde…
Nonparametric and nonlinear measures of statistical dependence between pairs of random variables are important tools in modern data analysis. In particular the emergence of large data sets can now support the relaxation of linearity…
In this article, we propose a new method for the fundamental task of testing for dependence between two groups of variables. The response densities under the null hypothesis of independence and the alternative hypothesis of dependence are…
For a continuous random variable $Z$, testing conditional independence $X \perp\!\!\!\perp Y |Z$ is known to be a particularly hard problem. It constitutes a key ingredient of many constraint-based causal discovery algorithms. These…
In broad applications, it is routinely of interest to assess whether there is evidence in the data to refute the assumption of conditional independence of $Y$ and $X$ conditionally on $Z$. Such tests are well developed in parametric models…
In this article we describe Bayesian nonparametric procedures for two-sample hypothesis testing. Namely, given two sets of samples $\mathbf{y}^{\scriptscriptstyle(1)}\;$\stackrel{\scriptscriptstyle{iid}}{\s im}$\;F^{\scriptscriptstyle(1)}$…
We study the problem of independence and conditional independence tests between categorical covariates and a continuous response variable, which has an immediate application in genetics. Instead of estimating the conditional distribution of…
Causal discovery is to learn cause-effect relationships among variables given observational data and is important for many applications. Existing causal discovery methods assume data sufficiency, which may not be the case in many real world…
In this paper, a robust non-parametric measure of statistical dependence, or correlation, between two random variables is presented. The proposed coefficient is a permutation-like statistic that quantifies how much the observed sample S_n :…
Variable selection and classification are common objectives in the analysis of high-dimensional data. Most such methods make distributional assumptions that may not be compatible with the diverse families of distributions data can take. A…
Mutual information is a well-known tool to measure the mutual dependence between variables. In this paper, a Bayesian nonparametric estimation of mutual information is established by means of the Dirichlet process and the $k$-nearest…
In this article we provide a substantial discussion on the statistical concept of conditional independence, which is not routinely mentioned in most elementary statistics and mathematical statistics textbooks. Under the assumption of…
In this paper we present a method ofcomputing the posterior probability ofconditional independence of two or morecontinuous variables from data,examined at several resolutions. Ourapproach is motivated by theobservation that the appearance…
The problem of hypothesis testing is examined from both the historical and Bayesian points of view in the case that sampling is from an underlying joint probability distribution and the hypotheses tested for are those of independence and…
In this article we propose novel Bayesian nonparametric methods using Dirichlet Process Mixture (DPM) models for detecting pairwise dependence between random variables while accounting for uncertainty in the form of the underlying…
This paper is concerned with test of the conditional independence. We first establish an equivalence between the conditional independence and the mutual independence. Based on the equivalence, we propose an index to measure the conditional…
Datasets with hundreds of variables and many missing values are commonplace. In this setting, it is both statistically and computationally challenging to detect true predictive relationships between variables and also to suppress false…
Distinguishing causal connections from correlations is important in many scenarios. However, the presence of unobserved variables, such as the latent confounder, can introduce bias in conditional independence testing commonly employed in…
An informative sampling design leads to the selection of units whose inclusion probabilities are correlated with the response variable of interest. Model inference performed on the resulting observed sample will be biased for the population…
It is often stated in papers tackling the task of inferring Bayesian network structures from data that there are these two distinct approaches: (i) Apply conditional independence tests when testing for the presence or otherwise of edges;…
We introduce a comprehensive Bayesian multivariate predictive inference framework. The basis for our framework is a hierarchical Bayesian model, that is a mixture of finite Polya trees corresponding to multiple dyadic partitions of the unit…