Statistical Methodology
The recent surge in valuations among AI-related firms has renewed concerns that markets may be entering a new phase of speculative exuberance, especially in the technology and semiconductor sectors at the center of the AI investment wave.…
The Beta kernel estimator offers a theoretically superior alternative to the Gaussian kernel for unit interval data, eliminating boundary bias without requiring reflection or transformation. However, its adoption remains limited by the lack…
Hidden variable graphical models can sometimes imply constraints on the observable distribution that are more complex than simple conditional independence relations. These observable constraints can falsify assumptions of the model that…
In causal machine learning, the fitting and evaluation of nuisance models are often performed on separate partitions, or folds, of the observed data. This technique, called cross-fitting, eliminates bias introduced by the use of black-box…
We introduce a novel class of nonlinear tests for serial dependence in functional time series, grounded in the functional quantile autocorrelation framework. Unlike traditional approaches based on the classical autocovariance kernel, the…
Sequential monitoring of randomized trials traditionally relies on parametric assumptions or asymptotic approximations. We discuss a family of nonparametric sequential tests, collectively called e-RT, for binary, event-only, and…
Identifying causal effects in the presence of unmeasured variables is a fundamental challenge in causal inference, for which proxy variable methods have emerged as a powerful solution. We contrast two major approaches in this framework: (1)…
Integer-valued generalized autoregressive conditional heteroskedastic (INGARCH) models are a popular framework for modeling serial dependence in count time series. While convenient for modeling, prediction, and estimation, INGARCH models…
The difference-in-differences (DID) research design is a key identification strategy that allows researchers to estimate causal effects under the parallel trends assumption. While the parallel trends assumption is counterfactual and cannot…
We evaluate the performance of targeted maximum likelihood estimation (TMLE) for estimating the average treatment effect in missing data scenarios under varying levels of positivity violations. We employ model- and design-based simulations,…
Stepped wedge designs (SWDs) are increasingly used to evaluate longitudinal cluster-level interventions but pose substantial challenges for valid inference. Because crossover times are randomized, intervention effects are intrinsically…
Model-based Bayesian inference for sample and population-level causal estimands has been growing in popularity. This literature routinely emphasizes clear specification of the target estimand; however, blind implementation of standard…
Multivariate Hawkes processes are past-dependent point processes originally introduced to model excitation effects, later extended to a nonlinear framework to account for the opposite effect, known as inhibition. Motivated by applications…
The linear fractional stable motion (LFSM) extends the fractional Brownian motion (fBm) by considering $\alpha$-stable increments. We propose a method to forecast future increments of the LFSM from past discrete-time observations, using the…
In many economic applications, multiple source datasets are available, but their effective combination is challenging due to heterogeneity across datasets. To address this problem, we study a parameter-transfer framework that shares only…
Categorical Gini Correlation (CGC), introduced by Dang et al. (2020), is a novel dependence measure designed to quantify the association between a numerical variable and a categorical variable. It has appealing properties compared to…
Suppose that a data analyst wishes to report the results of a least squares linear regression only if the overall null hypothesis, $H_0^{1:p}: \beta_1= \beta_2 = \ldots = \beta_p=0$, is rejected. This practice, which we refer to as…
Unmeasured confounding, unethical exposure, and ill-defined interventions pose significant challenges to evaluating policy-relevant mediation estimands in medicine and public health. In observational studies involving harmful exposures, the…
Epidemic models are invaluable tools to understand and implement strategies to control the spread of infectious diseases, as well as to inform public health policies and resource allocation. However, current modeling approaches have…
Differences-in-differences (DiD) is a causal inference method for observational longitudinal data that assumes parallel expected potential outcome trajectories between treatment groups under the counterfactual scenario where all units…