Related papers: Demystifying statistical learning based on efficie…
We propose and analyze estimators for statistical functionals of one or more distributions under nonparametric assumptions. Our estimators are based on the theory of influence functions, which appear in the semiparametric statistics…
Epidemiologists increasingly use causal inference methods that rely on machine learning, as these approaches can relax unnecessary model specification assumptions. While deriving and studying asymptotic properties of such estimators is a…
Use of machine learning to estimate nuisance functions (e.g. outcomes models, propensity score models) in estimators used in causal inference is increasingly common, as it can mitigate bias due to model misspecification. However, it can be…
Despite the risk of misspecification they are tied to, parametric models continue to be used in statistical practice because they are accessible to all. In particular, efficient estimation procedures in parametric models are simple to…
Statistical inference, a central tool of science, revolves around the study and the usage of statistical estimators: functions that map finite samples to predictions about unknown distribution parameters. In the frequentist framework,…
Considering the increasing size of available data, the need for statistical methods that control the finite sample bias is growing. This is mainly due to the frequent settings where the number of variables is large and allowed to increase…
In various statistical settings, the goal is to estimate a function which is restricted by the statistical model only through a conditional moment restriction. Prominent examples include the nonparametric instrumental variable framework for…
Causal influence measures for machine learnt classifiers shed light on the reasons behind classification, and aid in identifying influential input features and revealing their biases. However, such analyses involve evaluating the classifier…
We provide a novel characterization of semiparametric efficiency in a generic supervised learning setting where the outcome mean function -- defined as the conditional expectation of the outcome of interest given the other observed…
We aim to construct a class of learning algorithms that are of practical value to applied researchers in fields such as biostatistics, epidemiology and econometrics, where the need to learn from incompletely observed information is…
This paper aims to provide a tutorial for upper level undergraduate and graduate students in statistics, biostatistics and epidemiology on deriving influence functions for non-parametric and semi-parametric models. The author will build on…
Instrumental variable methods are fundamental to causal inference when treatment assignment is confounded by unobserved variables. In this article, we develop a general nonparametric causal framework for identification and learning with…
Many causal and structural effects depend on regressions. Examples include policy effects, average derivatives, regression decompositions, average treatment effects, causal mediation, and parameters of economic structural models. The…
Model selection requires repeatedly evaluating models on a given dataset and measuring their relative performances. In modern applications of machine learning, the models being considered are increasingly more expensive to evaluate and the…
Sub-sampling is a common and often effective method to deal with the computational challenges of large datasets. However, for most statistical models, there is no well-motivated approach for drawing a non-uniform subsample. We show that the…
Estimation of causal effects using machine learning methods has become an active research field in econometrics. In this paper, we study the finite sample performance of meta-learners for estimation of heterogeneous treatment effects under…
Modern causal inference methods allow machine learning to be used to weaken parametric modeling assumptions. However, the use of machine learning may result in complications for inference. Doubly-robust cross-fit estimators have been…
We present machine learning estimators for causal and predictive parameters under covariate shift, where covariate distributions differ between training and target populations. One such parameter is the average effect of a policy that…
Estimators based on influence functions (IFs) have been shown to be effective in many settings, especially when combined with machine learning techniques. By focusing on estimating a specific target of interest (e.g., the average effect of…
An adaptive design adjusts dynamically as information is accrued and a consequence of applying an adaptive design is the potential for inducing small-sample bias in estimates. In psychometrics and psychophysics, a common class of studies…