Statistics Methodology
Debiased machine learning estimators for smooth functionals in nonparametric models can exhibit substantial variability and instability, often leading practitioners to instead rely on parametric or semiparametric working models. Such…
Distance covariance is a widely used statistical methodology for testing dependence between two groups of variables. Despite the appealing properties of consistency and superior testing power, the testing results of distance covariance…
In a multiple testing task, finding an appropriate estimator of the proportion $\pi_0$ of non-signal in the data to boost the power of false discovery rate (FDR) controlling procedures is a long-standing research theme, sometimes referred to as…
We consider a variant of sequential testing by betting where, at each time step, the statistician is presented with multiple data sources (arms) and obtains data by choosing one of the arms. We consider the composite global null hypothesis…
Detection of minimal residual disease (MRD) in cancer patients after surgery can provide an early marker for disease recurrence and guide subsequent treatment decisions. Accurate and sensitive estimation of tumour burden after cancer…
Statistical offices face a familiar and intensifying dilemma: rising demand for detailed regional and domain-level estimates under budgets that are fixed or shrinking. National statistical offices (NSOs) either ignore the problem of optimal…
Conditions ensuring optimal parameter estimation in the presence of missing data are well established in inference, typically relying on the Missing-at-Random (MAR) assumption. In prediction, similar principles are often assumed to apply.…
This article summarises the methods used by the team "Ca' Foscari" for the EVA 2025 Data Challenge. The questions of the challenge concern the estimation of exceedance probabilities across several locations. Rather than modelling the…
Hidden Markov models (HMMs) are powerful tools for analysing time series data that depend on discrete underlying but unobserved states. As such, they have gained prominence across numerous empirical disciplines, in particular ecology,…
In this paper we discuss a well-known computing problem: inference for models with intractable normalizing functions. Models with intractable normalizing functions arise in a wide variety of areas, for instance network models, models for…
The Sen index and the Sen-Shorrocks-Thon (SST) index are widely used poverty measures. Developing reliable inference for these measures enables effective comparison across different populations of interest. It…
As a general and robust alternative to traditional mean regression models, quantile regression avoids the assumption of normally distributed errors, making it a versatile choice when modeling outcomes such as cognitive scores that typically…
Gaussian process (GP) regression is widely used for uncertainty quantification, yet the standard formulation assumes noise-free covariates. When inputs are measured with error, this errors-in-variables (EIV) setting can lead to…
Recently, a New Transmuted Logistic-Exponential (NTLE) distribution was introduced and studied as an extension of the Logistic-Exponential Distribution (LED) with wider applicability in lifetime modelling. However, the maximum likelihood…
We consider estimation of high-dimensional long-run covariance matrices for time series with nonconstant means, a setting in which conventional estimators can be severely biased. To address this difficulty, we propose a difference-based…
Persistent homology (PH) characterizes the shape of brain networks through persistence features. Group comparison of persistence features from brain networks can be challenging as they are inherently heterogeneous. A recent scale-space…
We develop estimators that improve the precision of heterogeneous treatment effect estimates by borrowing information from observational studies when the available covariates in each data source do not perfectly match. Standard…
Randomized experiments (often known as "A/B tests") are widely used to evaluate product and service innovations. We study how to allocate limited experimentation resources across $M$ concurrent experiments in an experiment-rich regime.…
Structural assumptions are central to the causal inference literature. In practice, it is often crucial to assess their validity or to test implications that follow from them. In many settings, such tests can be framed as evaluating whether…
Autoregressive Conditional Heteroscedasticity (ARCH) models are standard for modeling time series exhibiting volatility, with a rich literature in univariate and multivariate settings. In recent years, these models have been extended to…