Related papers: Detecting and Identifying Selection Structure in S…

Causal Discovery with Heterogeneous Observational Data

We consider the problem of causal discovery (structure learning) from heterogeneous observational data. Most existing methods assume a homogeneous sampling scheme, which leads to misleading conclusions when violated in many applications. To…

Methodology · Statistics 2022-02-01 Fangting Zhou, Kejun He, Yang Ni

Detecting hidden confounding in observational data using multiple environments

A common assumption in causal inference from observational data is that there is no hidden confounding. Yet it is, in general, impossible to verify this assumption from a single dataset. Under the assumption of independent causal mechanisms…

Methodology · Statistics 2023-11-07 Rickard K. A. Karlsson, Jesse H. Krijthe

Causal Discovery from Subsampled Time Series with Proxy Variables

Inferring causal structures from time series data is the central interest of many scientific inquiries. A major barrier to such inference is the problem of subsampling, i.e., the frequency of measurement is much lower than that of causal…

Machine Learning · Computer Science 2023-12-27 Mingzhou Liu, Xinwei Sun, Lingjing Hu, Yizhou Wang

Overcoming Selection Bias in Statistical Studies With Amortized Bayesian Inference

Selection bias arises when the probability that an observation enters a dataset depends on variables related to the quantities of interest, leading to systematic distortions in estimation and uncertainty quantification. For example, in…

Machine Learning · Statistics 2026-04-21 Jonas Arruda, Sophie Chervet, Paula Staudt, Andreas Wieser, Michael Hoelscher, Isabelle Sermet-Gaudelus, Nadine Binder, Lulla Opatowski, Jan Hasenauer

Causal Inference for Social Discrimination Reasoning

The discovery of discriminatory bias in human or automated decision making is a task of increasing importance and difficulty, exacerbated by the pervasive use of machine learning and data mining. Currently, discrimination discovery largely…

Computers and Society · Computer Science 2019-11-05 Bilal Qureshi, Faisal Kamiran, Asim Karim, Salvatore Ruggieri, Dino Pedreschi

Selective Inference Approach for Statistically Sound Predictive Pattern Mining

Discovering statistically significant patterns from databases is an important and challenging problem. The main obstacle is the difficulty of taking into account the selection bias, i.e., the bias arising from the fact that…

Machine Learning · Statistics 2016-03-10 Shinya Suzumura, Kazuya Nakagawa, Mahito Sugiyama, Koji Tsuda, Ichiro Takeuchi

Identification of Nonlinear Latent Hierarchical Models

Identifying latent variables and causal structures from observational data is essential to many real-world applications involving biological data, medical data, and unstructured data such as images and languages. However, this task can be…

Machine Learning · Computer Science 2023-11-01 Lingjing Kong, Biwei Huang, Feng Xie, Eric Xing, Yuejie Chi, Kun Zhang

Causal Discovery in Linear Latent Variable Models Subject to Measurement Error

We focus on causal discovery in the presence of measurement error in linear systems where the mixing matrix, i.e., the matrix indicating the independent exogenous noise terms pertaining to the observed variables, is identified up to…

Machine Learning · Computer Science 2022-11-09 Yuqin Yang, AmirEmad Ghassami, Mohamed Nafea, Negar Kiyavash, Kun Zhang, Ilya Shpitser

Identifiability Guarantees for Causal Disentanglement from Purely Observational Data

Causal disentanglement aims to learn about latent causal factors behind data, holding the promise to augment existing representation learning methods in terms of interpretability and extrapolation. Recent advances establish identifiability…

Machine Learning · Computer Science 2024-12-25 Ryan Welch, Jiaqi Zhang, Caroline Uhler

Causal learning with sufficient statistics: an information bottleneck approach

The inference of causal relationships using observational data from partially observed multivariate systems with hidden variables is a fundamental question in many scientific domains. Methods extracting causal information from conditional…

Machine Learning · Statistics 2020-10-13 Daniel Chicharro, Michel Besserve, Stefano Panzeri

Local Learning for Covariate Selection in Nonparametric Causal Effect Estimation with Latent Variables

Estimating causal effects from nonexperimental data is a fundamental problem in many fields of science. A key component of this task is selecting an appropriate set of covariates for confounding adjustment to avoid bias. Most existing…

Machine Learning · Computer Science 2025-10-28 Zheng Li, Xichen Guo, Feng Xie, Yan Zeng, Hao Zhang, Zhi Geng

Differentiable Structure Learning and Causal Discovery for General Binary Data

Existing methods for differentiable structure learning in discrete data typically assume that the data are generated from specific structural equation models. However, these assumptions may not align with the true data-generating process,…

Machine Learning · Computer Science 2025-10-28 Chang Deng, Bryon Aragam

Learning with Hidden Factorial Structure

Statistical learning in high-dimensional spaces is challenging without a strong underlying data structure. Recent advances with foundational models suggest that text and image data contain such hidden structures, which help mitigate the…

Machine Learning · Statistics 2025-02-04 Charles Arnal, Clement Berenfeld, Simon Rosenberg, Vivien Cabannes

Boosting Synthetic Data Generation with Effective Nonlinear Causal Discovery

Synthetic data generation has been widely adopted in software testing, data privacy, imbalanced learning, and artificial intelligence explanation. In all such contexts, it is crucial to generate plausible data samples. A common assumption…

Artificial Intelligence · Computer Science 2024-10-16 Martina Cinquini, Fosca Giannotti, Riccardo Guidotti

A General Identification Algorithm For Data Fusion Problems Under Systematic Selection

Causal inference is made challenging by confounding, selection bias, and other complications. A common approach to addressing these difficulties is the inclusion of auxiliary data on the superpopulation of interest. Such data may measure a…

Methodology · Statistics 2024-04-16 Jaron J. R. Lee, AmirEmad Ghassami, Ilya Shpitser

Data-Driven Confounder Selection via Markov and Bayesian Networks

To unbiasedly estimate a causal effect on an outcome, unconfoundedness is often assumed. If there is sufficient knowledge on the underlying causal structure, then existing confounder selection criteria can be used to select subsets of the…

Methodology · Statistics 2017-03-20 Jenny Häggström

Measuring Latent Causal Structure

Discovering latent representations of the observed world has become increasingly more relevant in data analysis. Much of the effort concentrates on building latent variables which can be used in prediction problems, such as classification…

Machine Learning · Computer Science 2010-01-08 Ricardo Silva

Identification of Causal Structure in the Presence of Missing Data with Additive Noise Model

Missing data are an unavoidable complication frequently encountered in many causal discovery tasks. When a missing process depends on the missing values themselves (known as self-masking missingness), the recovery of the joint distribution…

Machine Learning · Computer Science 2023-12-20 Jie Qiao , Zhengming Chen , Jianhua Yu , Ruichu Cai , Zhifeng Hao

Causal discovery under a confounder blanket

Inferring causal relationships from observational data is rarely straightforward, but the problem is especially difficult in high dimensions. For these applications, causal discovery algorithms typically require parametric restrictions or…

Methodology · Statistics 2022-06-29 David S. Watson, Ricardo Silva

Learning Measurement Models for Unobserved Variables

Observed associations in a database may be due in whole or part to variations in unrecorded (latent) variables. Identifying such variables and their causal relationships with one another is a principal goal in many scientific and practical…

Machine Learning · Computer Science 2012-12-12 Ricardo Silva, Richard Scheines, Clark Glymour, Peter L. Spirtes