Statistical Methodology
As standards of care advance, patients are living longer and once-fatal diseases are becoming manageable. Clinical trials increasingly focus on reducing disease burden, which can be quantified by the timing and occurrence of multiple…
The conditional average treatment effect (CATE) is a commonly targeted statistical parameter for measuring the effect of a treatment conditional on covariates. However, the CATE will fail to capture effects of treatments beyond differences…
Bridging the gap between internal and external validity is crucial for heterogeneous treatment effect estimation. Randomised controlled trials (RCTs), favoured for their internal validity due to randomisation, often encounter challenges in…
With reference to a binary outcome and a binary mediator, we derive identification bounds for natural effects under a reduced set of assumptions. Specifically, no assumptions about confounding are made that involve the outcome; we only…
Species distribution models (SDMs) are key tools in ecology, conservation and management of natural resources. They are commonly trained by scientific survey data but, since surveys are expensive, there is a need for complementary sources…
Estimation of parameters that obey specific constraints is crucial in statistics and machine learning; for example, when parameters are required to satisfy boundedness, monotonicity, or linear inequalities. Traditional approaches impose…
The Gaussian Process (GP) assumption is often used in functional data analysis. We propose a method to assess departures from the GP assumption, both in terms of the shape of the distribution and its potential dependence on covariates,…
Functional data consist of trajectories observed over a continuous domain, such as time, space, or wavelength. Here we consider curves observed on different groups of subjects and propose a Bayesian multi-group functional factor analysis…
The aim of survey statistics is to produce estimates with minimal bias and a correspondingly acceptable variance given a specific budget, preferably with a low response burden for the participants. In recent years, considerable efforts…
Canonical correlation analysis is a classic, well-known multivariate statistical method focusing on the relationships between two sets of variables. The visualisation of those relationships can be achieved by means of a biplot of the…
Stochastic natural gradient variational inference (NGVI) is a popular and efficient algorithm for Bayesian inference. Despite empirical success, the convergence of this method is still not fully understood. In this work, we define and study…
In the realm of high-dimensional data analysis, the estimation of covariance matrices is a fundamental task, and this holds true for interval-valued data as well. However, there is no unified definition for the covariance matrix of…
Modern medical research demands specialized causal inference methods for evaluating complex continuous-time dynamic treatment regimens using observational data. For instance, obtaining the causal effects of intravenous administration, a…
Statistical inference on large-dimensional tensor data has been extensively studied in the literature and widely used in economics, biology, machine learning, and other fields, but how to generate a structured tensor with a target…
Linear stochastic transitivity is a central assumption in paired comparison models that is rarely verified in practice. Empirical violations, however, are common and can substantially affect inference and ranking. We develop a class of…
One-shot federated learning enables multi-site inference with minimal communication. However, sharing summary statistics can still leak sensitive individual-level information when sites have only a small number of patients. In particular,…
Evaluating the causal effect of an intervention on multivariate outcomes is challenging when the outcomes are interdependent and derived rather than directly observed. Effective connectivity, which summarizes the directional neural…
Understanding vaccine effects on post-infection outcomes is critical for evaluating the full value proposition of a vaccine. However, defining appropriate causal effects on such outcomes is challenging because infection is affected by…
Delayed primary outcomes and administratively censored follow-up create a general semiparametric estimation problem: the target causal functional depends on an endpoint observed only for a shrinking subset of units at analysis time, while…
Learning the dependence structure among variables in complex systems is a central problem across medical, natural, and social sciences. These structures can be naturally represented by graphs, and the task of inferring such graphs from data…