统计方法学
Recent evidence suggests that analyzing the presence/absence of taxonomic features can offer a compelling alternative to differential abundance analysis in microbiome studies. However, standard approaches to differential prevalence analysis…
Large-scale neuroimaging studies often collect data from multiple scanners across different sites, where variations in scanners, scanning procedures, and other conditions across sites can introduce artificial site effects. These effects may…
Interval-censoring frequently occurs in studies of chronic diseases where disease status is inferred from intermittently collected biomarkers. Although many methods have been developed to analyze such data, they typically assume perfect…
Validation gating is a fundamental component of classical Kalman-based tracking systems. Only measurements whose normalized innovation squared (NIS) falls below a prescribed threshold are considered for state update. While this procedure is…
Understanding how habitats shape species distributions and abundances across river networks remains a longstanding and fundamental challenge in ecology, with direct implications for effective biodiversity management and conservation. We…
Conventional joint modeling approaches generally characterize the relationship between longitudinal biomarkers and discrete event occurrences within terminal, recurring or competing risk settings, thereby offering a limited representation…
The factor modeling for high-dimensional time series is powerful in discovering latent common components for dimension reduction and information extraction. Most available estimation methods can be divided into two categories: the…
In many image analysis problems, the contours of objects carry important statistical information about shape. Such contours are typically affected by deformation variables including scaling, translation, rotation, and reparametrization.…
Traditional geostatistical methods assume independence between observation locations and the spatial process of interest. Violations of this independence assumption are referred to as preferential sampling (PS). Standard methods to address…
Time series in natural sciences, such as hydrology and climatology, and other environmental applications, often consist of continuous observations constrained to the unit interval (0,1). Traditional Gaussian-based models fail to capture…
In clinical studies, the risk of the primary (terminal) event may be modified by intermediate events, resulting in semicompeting risks. To study the treatment effect on the terminal event mediated by the intermediate event, researchers wish…
We introduce a novel class of Bayesian mixtures for normal linear regression models which incorporates a further Gaussian random component for the distribution of the predictor variables. The proposed cluster-weighted model aims to…
Adaptive experiments use preliminary analyses of the data to inform further course of action and are commonly used in many disciplines including medical and social sciences. Because the null hypothesis and experimental design are…
In randomized controlled trials (RCTs) that focus on time-to-event outcomes, intercurrent events can arise in two ways: as semi-competing events, which modify the hazard of the primary outcome events, or as competing events, which make the…
The reuse of medico-administrative and synthetic spatial data may overcome some limitations of population-based registries, provided rigorous validation is performed. However, no tool exists to spatially validate a candidate-for-reuse…
We propose a new statistical estimation framework for a large family of global sensitivity analysis indices. Our approach is based on rank statistics and uses an empirical correlation coefficient recently introduced by Chatterjee [9]. We…
Randomized clinical trials typically aim to estimate a marginal treatment effect. While covariate adjustment can improve precision, it may change the estimand in nonlinear models due to noncollapsibility, leading to conditional rather than…
External validation of clinical prediction models is crucial for assessing whether they are fit for use. The $C$-statistic is a widely used measure of discriminative performance of such models predicting a binary outcome. A method for…
Quantitative practice across statistics, engineering, and machine learning has been transformed by the automation of inference. Predictions are produced, validated, and deployed at scale and speed that human-mediated reasoning could not…
Rank regression offers robustness to outliers and heavy-tailed response distributions, invariance to monotonic transformations, and improved efficiency under non-Gaussian errors, making it a versatile tool for analyzing complex data. This…