Statistical Methodology - Scifaro

On Heterogeneity in Wasserstein Space

Data represented by probability measures arise as empirical distributions, posterior distributions, and feature-based representations of complex objects. We study heterogeneity in a population of probability measures through the expected…

Statistical Methodology · Statistics 2026-03-17 Kisung You

Prior- and likelihood-free probabilistic inference with finite-sample calibration guarantees

Motivated by parametric models for which the likelihood is analytically unavailable, numerically unstable, or prohibitively expensive to compute or optimize, we develop a prior- and likelihood-free framework for fully probabilistic…

Statistical Methodology · Statistics 2026-03-17 Leonardo Cella, Emily C. Hector

Scalable Text-Embedding-informed Cognitive Diagnosis of Large Language Models

Large language models (LLMs) have achieved remarkable performance on diverse benchmarks, yet existing evaluation practices largely rely on coarse summary metrics that obscure underlying reasoning abilities. In this work, we propose novel…

Statistical Methodology · Statistics 2026-03-17 Jia Liu, Zhiyu Xu, Yuqi Gu

Gradient Boosting for Spatial Panel Models with Random and Fixed Effects

Due to the increase in data availability in urban and regional studies, various spatial panel models have emerged to model spatial panel data, which exhibit spatial patterns and spatial dependencies between observations across time.…

Statistical Methodology · Statistics 2026-03-17 Michael Balzer, Adhen Benlahlou

Label Noise Cleaning for Supervised Classification via Bernoulli Random Sampling

Label noise - incorrect labels assigned to observations - can substantially degrade the performance of supervised classifiers. This paper proposes a label noise cleaning method based on Bernoulli random sampling. We show that the mean label…

Statistical Methodology · Statistics 2026-03-17 Yuxin Liu, Xiong Jin, Yang Han

A Bayesian Critique of Rank-Based Methods for Surrogate Marker Evaluation

Surrogate markers are often employed in clinical trials to replace primary outcomes that may be difficult, expensive, or time-consuming to measure directly. These markers can accelerate the evaluation of new treatments, provided they…

Statistical Methodology · Statistics 2026-03-17 Pietro Carlotti, Layla Parast

Conformalized Robust Principal Component Analysis

Robust principal component analysis (RPCA) is a widely used technique for recovering low-rank structure from matrices with missing entries and sparse, possibly large-magnitude corruptions. Although numerous algorithms achieve accurate point…

Statistical Methodology · Statistics 2026-03-17 Liangliang Yuan, Lei Wang, Quan Kong, Liuhua Peng

Rank-based Maxsum test for high dimensional regression coefficient

We study global inference for regression coefficients in high-dimensional linear models under potentially heavy-tailed errors. While sum-type tests are powerful for dense alternatives and max-type tests excel for sparse alternatives,…

Statistical Methodology · Statistics 2026-03-17 Ping Zhao, Liangliang Yuan

Beyond Means: Topological Causal Effects under Persistent-Homology Ignorability

Average treatment effects (ATE) and conditional average treatment effects (CATE) are foundational causal estimands, but they target changes in expected outcomes and can miss treatment-induced changes in the shape of outcome distributions. A…

Statistical Methodology · Statistics 2026-03-17 Amir Saki, Usef Faghihi

Semiparametric copula-based quantile regression for semicontinuous outcomes with application to healthcare data

A semiparametric copula-based two-part quantile regression framework is developed for the analysis of semicontinuous outcomes characterized by a point mass at zero and a continuous positive component. The proposed approach models the…

Statistical Methodology · Statistics 2026-03-17 Guanjie Lyu, Mohamed Belalia, Abdulkadir Hussein

Structured Credal Learning

Real-world learning tasks often encounter uncertainty due to covariate shift and noisy or inconsistent labels. However, existing robust learning methods merge these effects into a single distributional uncertainty set. In this work, we…

Statistical Methodology · Statistics 2026-03-17 Varun Venkatesh, Eyke Hüllermeier, Bernd Bischl, Mina Rezaei

Spatially Varying Coefficient Mallows Model Averaging

Model averaging, as an appealing ensemble technique, strategically integrates all valuable information from candidate models to construct fast and accurate predictions. Despite having been widely practiced in many fields such as…

Statistical Methodology · Statistics 2026-03-17 Zhuang Yong, Lv Jing, Tingting Li

Learning the Optimal Composite Mediator: Closed-Form Solution and Inference

Understanding how an exposure transmits its effect through high-dimensional intermediaries is a central problem in observational research. We study the problem of finding a composite mediator that maximises the indirect effect of an…

Statistical Methodology · Statistics 2026-03-17 Zihuai He

A Kernel-Based Nonparametric Test for Conditional Independence of Functional Data

Conditional independence is a fundamental concept in many areas of statistical research, including, for example, sufficient dimension reduction, causal inference, and statistical graphical models. In many modern applications, data arise in…

Statistical Methodology · Statistics 2026-03-17 Yin Tang, Bing Li

Fast Uncertainty Quantification for Kernel-Based Estimators in Large-Scale Causal Inference

Kernel methods are widely used in causal inference for tasks such as treatment effect estimation, policy evaluation, and policy learning. The bootstrap is a standard tool for uncertainty quantification because of its broad applicability. As…

Statistical Methodology · Statistics 2026-03-17 Matthew Kosko, Falco J. Bargagli-Stoffi, Lin Wang, Michele Santacatterina

Surrogate-Based Bayesian Inference: Uncertainty Quantification and Active Learning

Surrogate models - also called emulators - are widely used to facilitate Bayesian inference in settings where computational costs preclude the use of standard posterior inference algorithms. Their deployment is now standard practice across…

Statistical Methodology · Statistics 2026-03-17 Andrew Gerard Roberts, Michael C. Dietze, Jonathan H. Huggins

Measuring Extreme Tail Association

Simultaneous occurrences of extreme events need not imply symmetric or reciprocal tail dependence. However, most existing measures of extremal dependence are inherently symmetric and hence often fail to capture directional influence in tail…

Statistical Methodology · Statistics 2026-03-17 Bikramjit Das, Xiangyu Liu

Confidence intervals for two-stage adaptive designs with subpopulation selection

We consider clinical trials in which an experimental treatment is compared with a control in pre-specified patient subpopulations. In such settings, adaptive enrichment designs allow the enrolled population to be modified at an interim…

Statistical Methodology · Statistics 2026-03-17 Enyu Li, Nigel Stallard, Ekkehard Glimm, Dominic Magirr, Peter K. Kimani

Addressing both variable selection and misclassified responses with parametric and semiparametric methods

While variable selection has received extensive attention in the literature, its exploration in the presence of response measurement error remains underexplored. In this paper, we investigate this important problem within the context of…

Statistical Methodology · Statistics 2026-03-17 Hui Guo, Grace Y. Yi, Boyu Wang

Robust Inferential Methodology for Multidimensional Diffusion Processes

We investigate robust parameter estimation and testing procedures for multivariate diffusion processes observed at high frequency via the minimum density power divergence estimator (MDPDE). Within a general diffusion framework and under…

Statistical Methodology · Statistics 2026-03-17 Sourojyoti Barick