Statistical Methodology
Latent space models are widely used in statistical network analysis and are often fit by Markov chain Monte Carlo. However, posterior summaries of latent coordinates are not canonical because the likelihood depends only on pairwise…
Clinical evidence synthesis requires identifying relevant trials from large registries and aggregating results that account for population differences. While recent LLM-based approaches have automated components of systematic review, they…
This paper provides a statistical analysis of three common methods of regression for Poisson data in the presence of Poisson background, namely the joint fit with two parametric models for the source and the background, the use of a…
To address the multidimensional nature of health-related questions, advances in health research often require integrating information from various data sources within statistical analyses. When complementary information pertaining to the…
Large-scale online platforms and marketplace systems often evaluate new policies through experiments that randomize treatment across operational units (e.g., geographies, regions, or clusters) over many time periods. In these settings,…
Discrete choice models are fundamental tools in management science, economics, and marketing for understanding and predicting decision-making. Logit-based models are dominant in applied work, largely due to their convenient closed-form…
Modern applications have made high-dimensional data ubiquitous, especially time-dependent data with increasingly complicated structures, and it has also become more common to encounter hierarchical relationships among…
Model selection in penalized regression critically depends on an accurate assessment of model complexity, commonly quantified through the effective degrees of freedom. While the Lasso admits a simple and unbiased characterization, given by…
Place-based epidemiology studies often rely on circular buffers to define "exposure" to spatially distributed risk factors, where the buffer radius represents a threshold beyond which exposure does not influence the outcome of interest.…
Dynamic multilayer networks are frequently used to describe the structure and temporal evolution of multiple relationships among common entities, with applications in fields such as sociology, economics, and neuroscience. However,…
Given two populations from which independent binary observations are taken with parameters $p_1$ and $p_2$ respectively, estimators are proposed for the relative risk $p_1/p_2$, the odds ratio $p_1(1-p_2)/(p_2(1-p_1))$ and their logarithms.…
Latent space models for network data characterize each node through a vector of latent features whose pairwise similarities define the edge probabilities among the pairs of nodes. Although this formulation has led to successful…
This paper presents a novel method for analytical derivations of marginal densities using the fractional derivatives of moment-generating functions. Although the method requires likelihood functions to take specific forms, its assumptions…
Two-sample hypothesis testing is a fundamental problem with many applications, and it faces new challenges in the high-dimensional setting. To mitigate the curse of dimensionality, high-dimensional data are typically assumed…
Matching is a widely used causal inference design that aims to approximate a randomized experiment using observational data by forming matched sets of treated and control units based on similarities in their covariates. Ideally, treated…
We consider the Sparse Principal Component Analysis (SPCA) problem under the well-known spiked covariance model. Recent work has shown that the SPCA problem can be reformulated as a Mixed Integer Program (MIP) and can be solved to global…
Microbial interaction networks can rewire in response to host and environmental factors, yet most existing methods for network estimation treat the covariance structure as static across samples. We propose TRECOR, a Bayesian covariance…
We study how sampling geometry contributes to uncertainty in modeling spatial geophysical observations as sampled random fields characterized by stationary, isotropic, parametric covariance functions. We incorporate the signature of…
Variational inference (VI) has become a widely used approach for scalable Bayesian inference, but its performance strongly depends on the flexibility of the chosen variational family. In this work, we propose a novel variational family that…
Almost every numerical task can be cast as extrapolation with respect to the fidelity or tolerance parameters of a consistent numerical method. This perspective enables probabilistic uncertainty quantification and optimal experimental…