Related papers: Bayesian Double Machine Learning for Causal Infere…
Confounding bias is a key challenge in causal effect estimation from observational data. Double Machine Learning (DML) addresses this issue by estimating treatment and outcome nuisance functions, constructing treatment and outcome…
We propose a doubly robust inference method for causal effects of continuous treatment variables, under unconfoundedness and with nonparametric or high-dimensional nuisance functions. Our double debiased machine learning (DML) estimators…
The estimation of causal effects with observational data continues to be a very active research area. In recent years, researchers have developed new frameworks which use machine learning to relax classical assumptions necessary for the…
Double (debiased) machine learning (DML) has seen widespread use in recent years for learning causal/structural parameters, in part due to its flexibility and adaptability to high-dimensional nuisance functions as well as its ability to…
The linear coefficient in a partially linear model with confounding variables can be estimated using double machine learning (DML). However, this DML estimator has a two-stage least squares (TSLS) interpretation and may produce overly wide…
A major limitation of machine learning (ML) prediction models is that they recover associational, rather than causal, predictive relationships between variables. In high-stakes automation applications of ML this is problematic, as the model…
Double machine learning (DML) has become an increasingly popular tool for automated variable selection in high-dimensional settings. Even though the ability to deal with a large number of potential covariates can render…
This paper provides an introduction to Double/Debiased Machine Learning (DML). DML is a general approach to performing inference about a target parameter in the presence of nuisance functions: objects that are needed to identify the target…
We consider estimating a low-dimensional parameter in an estimating equation involving high-dimensional nuisances that depend on the parameter. A central example is the efficient estimating equation for the (local) quantile treatment effect…
We address the problem of two-variable causal inference without intervention. This task is to infer an existing causal relation between two random variables, i.e. $X \rightarrow Y$ or $Y \rightarrow X$ , from purely observational data. As…
Most modern supervised statistical/machine learning (ML) methods are explicitly designed to solve prediction problems very well. Achieving this goal does not imply that these methods automatically deliver good estimators of causal…
Proper hyperparameter tuning is essential for achieving optimal performance of modern machine learning (ML) methods in predictive tasks. While there is an extensive literature on tuning ML learners for prediction, there is only little…
The results from Genome-Wide Association Studies (GWAS) on thousands of phenotypes provide an unprecedented opportunity to infer the causal effect of one phenotype (exposure) on another (outcome). Mendelian randomization (MR), an…
Estimating causal treatment effects in observational settings is frequently compromised by selection bias arising from unobserved confounders. While traditional econometric methods struggle when these confounders are orthogonal to…
A common issue in learning decision-making policies in data-rich settings is spurious correlations in the offline dataset, which can be caused by hidden confounders. Instrumental variable (IV) regression, which utilises a key unconfounded…
Debiased machine learning (DML) offers an attractive way to estimate treatment effects in observational settings, where identification of causal parameters requires a conditional independence or unconfoundedness assumption, since it allows…
Hybrid modeling integrates machine learning with scientific knowledge to enhance interpretability, generalization, and adherence to natural laws. Nevertheless, equifinality and regularization biases pose challenges in hybrid modeling to…
One way to quantify exposure to air pollution and its constituents in epidemiologic studies is to use an individual's nearest monitor. This strategy results in potential inaccuracy in the actual personal exposure, introducing bias in…
Bayesian synthetic likelihood (BSL) is now a well established method for performing approximate Bayesian parameter estimation for simulation-based models that do not possess a tractable likelihood function. BSL approximates an intractable…
Unlike parametric regression, machine learning (ML) methods do not generally require precise knowledge of the true data generating mechanisms. As such, numerous authors have advocated for ML methods to estimate causal effects.…