Statistical Methodology - Scifaro

Robust discriminant analysis

Discriminant analysis (DA) is one of the most popular methods for classification due to its conceptual simplicity, low computational cost, and often solid performance. In its standard form, DA uses the arithmetic mean and sample covariance…

Statistical Methodology · Statistics 2026-05-12 Mia Hubert, Jakob Raymaekers, Peter J. Rousseeuw

Nonparametric Motion Control in Functional Connectivity Studies in Children with Autism Spectrum Disorder

Autism Spectrum Disorder (ASD) is a neurodevelopmental condition associated with difficulties with social interactions, communication, and restricted or repetitive behaviors. To characterize ASD, investigators often use functional…

Statistical Methodology · Statistics 2026-05-12 Jialu Ran, Sarah Shultz, Benjamin B. Risk, David Benkeser

Semiparametric fiducial inference for Cox models

R. A. Fisher introduced the fiducial distribution as a potential replacement for the Bayesian posterior distribution in the 1930s. During the past century, fiducial approaches have been explored in various parametric and nonparametric…

Statistical Methodology · Statistics 2026-05-12 Yifan Cui, Jan Hannig, Paul Edlefsen

Extracting Mechanisms from Heterogeneous Effects: An Identification Strategy for Mediation Analysis

Understanding causal mechanisms is crucial for explaining and generalizing empirical phenomena. Causal mediation analysis offers statistical techniques to quantify the mediation effects. Although numerous methods have been developed for…

Statistical Methodology · Statistics 2026-05-12 Jiawei Fu

Nonconvex High-Dimensional Time-Varying Coefficient Estimation for Noisy High-Frequency Observations with a Factor Structure

In this paper, we propose a novel high-dimensional time-varying coefficient estimator for noisy high-frequency observations with a factor structure. In high-frequency finance, we often observe that noises dominate the signal of underlying…

Statistical Methodology · Statistics 2026-05-12 Minseok Shin, Donggyu Kim

A Generative Approach to Joint Modeling of Quantitative and Qualitative Responses

In many scientific areas, data with quantitative and qualitative (QQ) responses are commonly encountered with a large number of predictors. By exploring the association between QQ responses, existing approaches often consider a joint model…

Statistical Methodology · Statistics 2026-05-12 Xiaoning Kang, Lulu Kang, Wei Chen, Xinwei Deng

Mixture of Finite Mixtures Model for Basket Trial

With the recent paradigm shift from cytotoxic drugs to a new generation of targeted therapy and immuno-oncology therapy in oncology drug development, patients with various cancer (sub)types may be eligible to participate in a basket trial…

Statistical Methodology · Statistics 2026-05-12 Junxian Geng, Tianjian Zhou, Ruitao Lin, Guanyu Hu

Bayesian Auxiliary Variable Model for Birth Records Data with Qualitative and Quantitative Responses

Many applications involve data with qualitative and quantitative responses. When there is an association between the two responses, a joint model will provide better results than modeling them separately. In this paper, we propose a…

Statistical Methodology · Statistics 2026-05-12 Xiaoning Kang, Shyam Ranganathan, Lulu Kang, Julia Gohlke, Xinwei Deng

Locally Optimal Design for A/B Testing in the Presence of Covariates and Network Connection

A/B test, a simple type of controlled experiment, refers to the statistical procedure of experimenting to compare two treatments applied to test subjects. For example, many IT companies frequently conduct A/B tests on their users who are…

Statistical Methodology · Statistics 2026-05-12 Qiong Zhang, Lulu Kang

A Maximin $\Phi_{p}$-Efficient Design for Multivariate GLM

Experimental designs for a generalized linear model (GLM) often depend on the specification of the model, including the link function, the predictors, and unknown parameters, such as the regression coefficients. To deal with uncertainties…

Statistical Methodology · Statistics 2026-05-12 Yiou Li, Lulu Kang, Xinwei Deng

Covariate Balancing Based on Kernel Density Estimates for Controlled Experiments

Controlled experiments are widely used in many applications to investigate the causal relationship between input factors and experimental outcomes. A completely randomized design is usually used to randomly assign treatment levels to…

Statistical Methodology · Statistics 2026-05-12 Yiou Li, Lulu Kang, Xiao Huang

Gaussian Process Assisted Active Learning of Physical Laws

In many areas of science and engineering, discovering the governing differential equations from the noisy experimental data is an essential challenge. It is also a critical step in understanding the physical phenomena and prediction of the…

Statistical Methodology · Statistics 2026-05-12 Jiuhai Chen, Lulu Kang, Guang Lin

High-dimensional semi-supervised learning: in search for optimal inference of the mean

A fundamental challenge in semi-supervised learning lies in the observed data's disproportional size when compared with the size of the data collected with missing outcomes. An implicit understanding is that the dataset with missing…

Statistical Methodology · Statistics 2026-05-12 Yuqian Zhang, Jelena Bradic

D-optimal Design for Network A/B Testing

A/B testing refers to the statistical procedure of conducting an experiment to compare two treatments, A and B, applied to different testing subjects. It is widely used by technology companies such as Facebook, LinkedIn, and Netflix, to…

Statistical Methodology · Statistics 2026-05-12 Victoria Pokhiko, Qiong Zhang, Lulu Kang, D'arcy P. Mays

Estimating the distribution of marks of a homogeneous marked Poisson process

In this paper we propose an estimator of the distribution of events of different kinds in a homogeneous Poisson process. We give an explicit solution for the maximum likelihood estimator of the distribution and derive its strong consistency…

Statistical Methodology · Statistics 2026-05-12 Dragi Anevski, Vladimir Pastukhov

Empirical Bayes Rebiasing

We study methods for simultaneous analysis of many noisy and biased estimates, each paired with an even noisier estimate of its own bias. The analyst's goal is to construct short calibrated intervals for each parameter. The standard…

Statistical Methodology · Statistics 2026-05-11 Wanyi Ling, Sida Li, Junming Guan, Nikolaos Ignatiadis

Semi-supervised Method for Risk Prediction with Doubly Censored EHR Data

The rapid expansion of large-scale electronic health record (EHR) data offers unique opportunities to improve the accuracy and efficiency of clinical risk estimation. Yet, because clinical events may occur outside the recording health…

Statistical Methodology · Statistics 2026-05-11 Jie Zhou, Enhao Wang, Xuan Wang

Randomization Tests for Distributions of Individual Treatment Effects via Combined Rank Statistics

What proportion of treated units actually benefited from an experimental intervention? What is the median or the largest individual treatment effect? This paper develops methods for answering such questions about the distribution of…

Statistical Methodology · Statistics 2026-05-11 David Kim, Yongchang Su, Jake Bowers, Xinran Li

BAMIFun: Bayesian Multiple Imputation for Functional Data

Missing data are pervasive in modern functional datasets, where trajectories are often sparsely or irregularly observed. Although Functional Principal Component Analysis (FPCA) is widely used to reconstruct incomplete curves, existing…

Statistical Methodology · Statistics 2026-05-11 Ziren Jiang, Lei Xuan, Eric F. Lock, Erjia Cui

Cellwise and Casewise Robust Multivariate Regression with Inference

Multivariate linear regression is a fundamental statistical task, but classical estimators such as ordinary least squares are highly sensitive to outliers. These may occur as casewise outliers that affect entire observations, or as outlying…

Statistical Methodology · Statistics 2026-05-11 Fabio Centofanti, Mia Hubert, Peter J. Rousseeuw