Statistical Methodology - Scifaro

Tweedie-based nonparametric estimation for semicontinuous mixed densities

Semicontinuous outcomes occur frequently in health services, insurance, and cost studies. Standard nonparametric density estimators are not well suited to such data because they do not naturally accommodate the mixed structure, the…

Statistical Methodology · Statistics 2026-05-06 Guanjie Lyu, Frédéric Ouimet, Cindy Feng

Adaptive Targeted Maximum Likelihood Estimation of the Mean Potential Outcome under a Treatment Rule

Estimating the mean counterfactual outcome under a treatment rule is a central problem in causal inference and policy evaluation. Standard estimators, including inverse probability weighting (IPW), augmented IPW (AIPW), and targeted maximum…

Statistical Methodology · Statistics 2026-05-06 Yichen Xu, Mark J. van der Laan

Evaluating the performance of GCM trajectories using Weather Type frequencies for persistence and transitions: the Iberian Peninsula and Lamb classification

This study evaluates the performance of 36 historical CMIP6 GCM trajectories (1979-2005) in reproducing atmospheric circulation over the Iberian Peninsula in the summer months (June-September) using the Lamb Weather Type (WT) classification…

Statistical Methodology · Statistics 2026-05-06 Elsa Barrio-Torres, Swen Brands, Jesús Asín, Jesús Abaurrea, Zeus Gracia-Tabuenca, Jorge Castillo-Mateo

TWICEBEE: A Two-stage Intra-patient Curve-free Bayesian Decision-Theoretic Dose Escalation Design

We propose a novel Phase I intra-patient dose-escalation design tailored for multi-cycle immunotherapy settings, in which toxicity at a fixed dose level is clinically expected to decrease over successive treatment cycles. This design was…

Statistical Methodology · Statistics 2026-05-06 Dehua Bi, Zina Good, Katherine Ryan, Sabine Heitzeneder, John S. Tamaresis, Robert Lowskey, Michelle Monje, Crystal Mackall, Ying Lu

Non-ignorable fuzziness in granular counts: the case of RNA-seq data

RNA-seq count data are often affected by read-to-gene alignment ambiguity, especially in high-dimensional transcriptomics. This type of ambiguity can be conveniently expressed through granular counts, namely fuzzy-valued observations of…

Statistical Methodology · Statistics 2026-05-06 Antonio Calcagnì, Arianna Consiglio, Przemyslaw Grzegorzewski, Corrado Mencar

Novel g-computation algorithms for time-varying actions with recurrent and semi-competing events

Background: A core aspect of epidemiology is determining the impacts of potential public health interventions over time. With long follow-up periods, epidemiologists may need to consider semi-competing events, in which a terminal event,…

Statistical Methodology · Statistics 2026-05-06 Alena Sorensen D'Alessio, Lucas M. Neuroth, Jessie K Edwards, Chantel L. Martin, Paul N Zivich

A Model-Robust G-Computation Method for Analyzing Hybrid Control Studies Without Assuming Exchangeability

There is growing interest in a hybrid control design for treatment evaluation, where a randomized controlled trial is augmented with external control data from a previous trial or a real world data source. The hybrid control design has the…

Statistical Methodology · Statistics 2026-05-06 Zhiwei Zhang, Peisong Han, Wei Zhang

INLA-RF: A Hybrid Modeling Strategy for Spatio-Temporal Environmental Data

Environmental processes often exhibit complex, non-linear patterns and discontinuities across space and time, posing significant challenges for traditional geostatistical modeling approaches. In this paper, we propose a hybrid…

Statistical Methodology · Statistics 2026-05-06 Mario Figueira, Michela Cameletti, Luca Patelli

A note on the unique properties of the Kullback–Leibler divergence for sampling via gradient flows

We consider the problem of sampling from a probability distribution $\pi$ which admits a density w.r.t. a dominating measure. It is well known that this can be written as an optimisation problem over the space of probability distributions…

Statistical Methodology · Statistics 2026-05-06 Francesca Romana Crucinio

Conditional independence testing with a single realization of a multivariate nonstationary nonlinear time series

Identifying relationships among stochastic processes is a core objective in many fields, such as economics. While the standard toolkit for multivariate time series analysis has many advantages, it can be difficult to capture nonlinear…

Statistical Methodology · Statistics 2026-05-06 Michael Wieck-Sosa, Michel F. C. Haddad, Aaditya Ramdas

SID: A Novel Class of Nonparametric Tests of Independence for Censored Outcomes

We propose a new class of metrics, called the survival independence divergence (SID), to test dependence between a right-censored outcome and covariates. A key technique for deriving the SIDs is to use a counting process strategy, which…

Statistical Methodology · Statistics 2026-05-06 Jinhong Li, Jicai Liu, Jinhong You, Riquan Zhang

Examining the robustness of a model selection procedure in the binary latent block model through a language placement test data set

When entering French university, the students' foreign language level is assessed through a placement test. In this work, we model the placement test results using binary latent block models, which make it possible to simultaneously form homogeneous…

Statistical Methodology · Statistics 2026-05-06 Vincent Brault, Frédérique Letué, Marie-José Martinez

Inference with non-differentiable surrogate loss in a general high-dimensional classification framework

Penalized empirical risk minimization with a surrogate loss function is often used to learn a high-dimensional linear decision rule in classification problems. Although much of the literature focuses on the generalization error, there is a…

Statistical Methodology · Statistics 2026-05-06 Muxuan Liang, Yang Ning, Maureen A Smith, Ying-Qi Zhao

Assessing the Impact of Block Size on Block Likelihood Estimation: A Comparative Study

This paper focuses on block likelihood estimation for geostatistical data, a method that balances statistical accuracy and computational efficiency. Central to this approach is the choice of block size, which can significantly impact…

Statistical Methodology · Statistics 2026-05-06 Alfredo Alegría

A New Way to Look at Regional Survey Data: Differences in Vacancy Rates and Persons per Household by County, 2000-2005

Regional survey estimates and their significance levels are simultaneously displayed in maps that show all 3,141 U.S. counties and equivalents. An analyst can focus his attention on significant differences (or those with a different,…

Statistical Methodology · Statistics 2026-05-06 Charles D. Coleman, Jonathan F. Takeuchi

Post-selection Inference in Multiverse Analysis (PIMA): an inferential framework based on the sign flipping score test

When analyzing data, researchers make some decisions that are either arbitrary, based on subjective beliefs about the data-generating process, or for which equally justifiable alternative choices could have been made. This wide range of…

Statistical Methodology · Statistics 2026-05-06 Paolo Girardi, Anna Vesely, Daniël Lakens, Gianmarco Altoè, Massimiliano Pastore, Antonio Calcagnì, Livio Finos

A CV-TMLE global test approach to improve power in rare disease clinical studies with multiple-component endpoints

Rare disease trials face unique statistical challenges due to limited patient populations and heterogeneous clinical manifestations among patients. Multiple endpoints are often necessary to comprehensively capture treatment benefits. A…

Statistical Methodology · Statistics 2026-05-05 Tianyue Zhou, Susan Gruber, Hana Lee, Wonyul Lee, Lei Nie, Mark van der Laan

The Bayesian Reflex: Online Learning as the Autonomic Nervous System of Modern and Future AI

This chapter introduces the Bayesian reflex (an analogy with the autonomic nervous system) as a unifying framework for online learning in AI. Bayesian online algorithms automatically maintain equilibrium in dynamic environments via…

Statistical Methodology · Statistics 2026-05-05 Durba Bhattacharya, Sucharita Roy, Sourabh Bhattacharya

EstemPMM: Polynomial Maximization Method for Non-Gaussian Regression and Time Series in R

We describe the R package EstemPMM, which implements the Polynomial Maximization Method (PMM) for parameter estimation under non-Gaussian errors. PMM exploits higher-order cumulants of the error distribution, specifically the third…

Statistical Methodology · Statistics 2026-05-05 Serhii Zabolotnii

The Ancestor Hawkes Process with an Application to Group Chat Data

The Hawkes process is used to model point process data where events occur in clusters and bursts. In a standard multivariate Hawkes process, every event that occurs in a dimension has an equal impact on the process intensity. However, this…

Statistical Methodology · Statistics 2026-05-05 Gordon J Ross, Isabella Deutsch