English
Related papers

Related papers: Detecting Dependencies in Sparse, Multivariate Dat…

200 papers

Databases are widespread, yet extracting relevant data can be difficult. Without substantial domain knowledge, multivariate search queries often return sparse or uninformative results. This paper introduces an approach for searching…

Artificial Intelligence · Computer Science 2017-04-05 Feras Saad , Leonardo Casarsa , Vikash Mansinghka

Is it possible to make statistical inference broadly accessible to non-statisticians without sacrificing mathematical rigor or inference quality? This paper describes BayesDB, a probabilistic programming platform that aims to enable users…

Artificial Intelligence · Computer Science 2015-12-17 Vikash Mansinghka , Richard Tibbetts , Jay Baxter , Pat Shafto , Baxter Eaves

In todays age of data, discovering relationships between different variables is an interesting and a challenging problem. This problem becomes even more critical with regards to complex dynamical systems like weather forecasting and…

Data Analysis, Statistics and Probability · Physics 2021-02-01 Sachin Kasture

Additive nonparametric regression models provide an attractive tool for variable selection in high dimensions when the relationship between the response and predictors is complex. They offer greater flexibility compared to parametric…

Machine Learning · Statistics 2016-07-12 Garret Vo , Debdeep Pati

In broad applications, it is routinely of interest to assess whether there is evidence in the data to refute the assumption of conditional independence of $Y$ and $X$ conditionally on $Z$. Such tests are well developed in parametric models…

Methodology · Statistics 2015-03-25 Tsuyoshi Kunihama , David B. Dunson

Bayesian nonparametric models offer a flexible and powerful framework for statistical model selection, enabling the adaptation of model complexity to the intricacies of diverse datasets. This survey intends to delve into the significance of…

Machine Learning · Computer Science 2024-04-02 Bahman Moraffah

The problem of merging databases arises in many government and commercial applications. Schema matching, a common first step, identifies equivalent fields between databases. We introduce a schema matching framework that builds nonparametric…

Information Retrieval · Computer Science 2015-07-07 Erik M. Ferragut , Jason Laska

An algorithm for automated construction of a sparse Bayesian network given an unstructured probabilistic model and causal domain information from an expert has been developed and implemented. The goal is to obtain a network that explicitly…

Artificial Intelligence · Computer Science 2013-04-08 Sampath Srinivas , Stuart Russell , Alice M. Agogino

In this article we propose novel Bayesian nonparametric methods using Dirichlet Process Mixture (DPM) models for detecting pairwise dependence between random variables while accounting for uncertainty in the form of the underlying…

Methodology · Statistics 2016-04-28 Sarah Filippi , Chris C. Holmes , Luis E. Nieto-Barajas

Probabilistic programming is a rapidly developing programming paradigm which enables the formulation of Bayesian models as programs and the automation of posterior inference. It facilitates the development of models and conducting Bayesian…

Software Engineering · Computer Science 2025-10-31 Nathanael Nussbaumer , Markus Böck , Jürgen Cito

This article introduces a Bayesian nonparametric method for quantifying the relative evidence in a dataset in favour of the dependence or independence of two variables conditional on a third. The approach uses Polya tree priors on spaces of…

Methodology · Statistics 2021-02-15 Onur Teymur , Sarah Filippi

Mixed data refers to a type of data in which variables can be of multiple types, such as continuous, discrete, or categorical. This data is routinely collected in various fields, including healthcare and social sciences. A common goal in…

Methodology · Statistics 2025-05-22 Mauro Florez , Anna Gottard , Carrie McAdams , Michele Guindani , Marina Vannucci

An informative sampling design leads to the selection of units whose inclusion probabilities are correlated with the response variable of interest. Model inference performed on the resulting observed sample will be biased for the population…

Methodology · Statistics 2018-06-29 Matthew R. Williams , Terrance D. Savitsky

This paper presents Sparse Partitioning, a Bayesian method for identifying predictors that either individually or in combination with others affect a response variable. The method is designed for regression problems involving binary or…

Quantitative Methods · Quantitative Biology 2011-08-31 Doug Speed , Simon Tavaré

As technology advanced, collecting data via automatic collection devices become popular, thus we commonly face data sets with lengthy variables, especially when these data sets are collected without specific research goals beforehand. It…

Machine Learning · Statistics 2022-05-10 Wan-Ping Nicole Chen , Yuan-chin Ivan Chang

Despite major methodological developments, Bayesian inference for Gaussian graphical models remains challenging in high dimension due to the tremendous size of the model space. This article proposes a method to infer the marginal and…

Methodology · Statistics 2018-04-10 Gwenaël G. R. Leday , Sylvia Richardson

Models with dimension more than the available sample size are now commonly used in various applications. A sensible inference is possible using a lower-dimensional structure. In regression problems with a large number of predictors, the…

Statistics Theory · Mathematics 2025-11-25 Sayantan Banerjee , Ismaël Castillo , Subhashis Ghosal

Observed associations in a database may be due in whole or part to variations in unrecorded (latent) variables. Identifying such variables and their causal relationships with one another is a principal goal in many scientific and practical…

Machine Learning · Computer Science 2012-12-12 Ricardo Silva , Richard Scheines , Clark Glymour , Peter L. Spirtes

Instead of requiring a domain expert to specify the probabilistic dependencies of the data, in this work we present an approach that uses the relational DB schema to automatically construct a Bayesian graphical model for a database. This…

Artificial Intelligence · Computer Science 2012-12-07 Sameer Singh , Thore Graepel

Modern datasets commonly feature both substantial missingness and many variables of mixed data types, which present significant challenges for estimation and inference. Complete case analysis, which proceeds using only the observations with…

Methodology · Statistics 2023-04-10 Joseph Feldman , Daniel R. Kowal
‹ Prev 1 2 3 10 Next ›